metabeeai.process_pdfs
metabeeai.process_pdfs Package
Functions
Split PDFs in the specified directory into single-page or overlapping 2-page segments.
Process papers in the specified directory using Vision Agentic Document Analysis, starting from an optional folder.
Process all merged_v2.json files in the base directory.
Analyze the uniqueness of chunks in a paper.
Remove duplicate chunks based on text content while preserving chunk IDs.
Process a merged JSON file to deduplicate chunks and save the result.
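
The function names and signatures are not shown in this summary, so the following is only an illustrative sketch of two ideas described above: grouping pages into single-page or overlapping 2-page segments, and deduplicating chunks by text while keeping their IDs. The names `page_windows` and `dedupe_chunks`, the `text` and `chunk_id` keys, and the choice to collect duplicate IDs into a `chunk_ids` list are all assumptions made for illustration, not the package's actual API or JSON schema.

```python
from typing import Dict, List, Tuple


def page_windows(num_pages: int, pages_per_segment: int = 1) -> List[Tuple[int, ...]]:
    """Return 0-based page index groups (hypothetical helper).

    pages_per_segment=1 yields one group per page;
    pages_per_segment=2 yields overlapping pairs (0, 1), (1, 2), ...
    """
    if pages_per_segment not in (1, 2):
        raise ValueError("pages_per_segment must be 1 or 2")
    if num_pages <= 0:
        return []
    # A single-page document cannot form a pair, so fall back to one group.
    if pages_per_segment == 1 or num_pages == 1:
        return [(i,) for i in range(num_pages)]
    return [(i, i + 1) for i in range(num_pages - 1)]


def dedupe_chunks(chunks: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Keep the first chunk for each distinct text and record every chunk ID
    that carried that text (assumed schema: 'chunk_id' and 'text' keys)."""
    seen: Dict[str, Dict[str, object]] = {}
    unique: List[Dict[str, object]] = []
    for chunk in chunks:
        key = chunk["text"].strip()  # compare on whitespace-trimmed text
        if key in seen:
            seen[key]["chunk_ids"].append(chunk["chunk_id"])
        else:
            merged = {"text": chunk["text"], "chunk_ids": [chunk["chunk_id"]]}
            seen[key] = merged
            unique.append(merged)
    return unique
```

With overlapping 2-page segments, each interior page appears in two segments, which is one plausible reason identical chunks would show up more than once in a merged file and need deduplication afterwards.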