process_merged_json_file

metabeeai.process_pdfs.process_merged_json_file(json_file_path, output_path=None)

Process a merged JSON file to deduplicate chunks and save the result.

Parameters:
  • json_file_path (Path) – Path to the input merged JSON file.

  • output_path (Path, optional) – Path where the deduplicated output is saved. If None (the default), the input file is overwritten.

Returns:

Deduplication statistics and results.

Return type:

dict
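
The behavior described above (read a merged JSON file, drop duplicate chunks, write the result, return statistics) can be sketched as follows. This is an illustrative stand-in, not metabeeai's implementation: the function name `dedupe_merged_json_sketch`, the top-level `"chunks"` list, the use of each chunk's `"text"` field as the deduplication key, and the keys of the returned dict are all assumptions made for the example. Only the two parameters and the overwrite-by-default behavior come from the documentation.

```python
import json
from pathlib import Path


def dedupe_merged_json_sketch(json_file_path, output_path=None):
    """Hypothetical stand-in for process_merged_json_file (not the real code)."""
    json_file_path = Path(json_file_path)
    # Documented default: with no output path, the input file is overwritten.
    output_path = Path(output_path) if output_path is not None else json_file_path

    data = json.loads(json_file_path.read_text(encoding="utf-8"))
    chunks = data.get("chunks", [])  # ASSUMPTION: chunks live under "chunks"

    seen = set()
    unique = []
    for chunk in chunks:
        # ASSUMPTION: two chunks are duplicates when their stripped text matches.
        key = chunk.get("text", "").strip()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)  # keep the first occurrence, preserve order

    data["chunks"] = unique
    output_path.write_text(json.dumps(data, indent=2), encoding="utf-8")

    # ASSUMPTION: these statistic names are invented for illustration.
    return {
        "original_chunks": len(chunks),
        "unique_chunks": len(unique),
        "duplicates_removed": len(chunks) - len(unique),
    }
```

Because the default overwrites the input, passing an explicit `output_path` is the safer choice when the original merged file should be kept for comparison.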