OmniDocBench Benchmarks

OmniDocBench on HuggingFace

Create OmniDocBench evaluation datasets:

# Create the ground-truth dataset
docling-eval create-gt --benchmark OmniDocBench --output-dir ./benchmarks/OmniDocBench-gt/ 

# Make predictions for different modalities.
docling-eval create-eval \
  --benchmark OmniDocBench \
  --gt-dir ./benchmarks/OmniDocBench-gt/ \
  --output-dir ./benchmarks/OmniDocBench-e2e/ \
  --prediction-provider Docling # use full-document predictions from docling
  
docling-eval create-eval \
  --benchmark OmniDocBench \
  --gt-dir ./benchmarks/OmniDocBench-gt/ \
  --output-dir ./benchmarks/OmniDocBench-tables/ \
  --prediction-provider TableFormer # use tableformer predictions only
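
Before moving on to the evaluation steps, it can help to confirm that the three directories used above were actually populated. The following is a minimal sketch, not part of `docling-eval`: the helper name `dir_has_content` is hypothetical, and it only checks that each directory exists and is non-empty, since the exact files written by `create-gt` and `create-eval` are not described here.

```shell
#!/usr/bin/env bash
# Sketch: report whether each dataset directory exists and is non-empty.
# dir_has_content is an illustrative helper, not a docling-eval command.

# Return 0 if "$1" is a directory containing at least one entry.
dir_has_content() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

for dir in \
  ./benchmarks/OmniDocBench-gt/ \
  ./benchmarks/OmniDocBench-e2e/ \
  ./benchmarks/OmniDocBench-tables/; do
  if dir_has_content "$dir"; then
    echo "ok:      $dir"
  else
    echo "missing: $dir"
  fi
done
```

A `missing` line suggests that the corresponding `create-gt` or `create-eval` command has not been run yet or did not finish.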

Layout Evaluation

Create the evaluation report:

docling-eval evaluate \
  --modality layout \
  --benchmark OmniDocBench \
  --output-dir ./benchmarks/OmniDocBench-e2e/ 

Layout evaluation JSON

Visualize the report:

docling-eval visualize \
  --modality layout \
  --benchmark OmniDocBench \
  --output-dir ./benchmarks/OmniDocBench-e2e/ 

mAP[0.5:0.95] report

mAP[0.5:0.95] plot

TableFormer Evaluation

Create the evaluation report:

docling-eval evaluate \
  --modality table_structure \
  --benchmark OmniDocBench \
  --output-dir ./benchmarks/OmniDocBench-tables/ 

TableFormer evaluation JSON

Visualize the report:

docling-eval visualize \
  --modality table_structure \
  --benchmark OmniDocBench \
  --output-dir ./benchmarks/OmniDocBench-tables/ 

TEDS plot

TEDS struct only plot

TEDS struct only report

TEDS struct with text plot

TEDS struct with text report

Reading Order Evaluation

Create the evaluation report:

docling-eval evaluate \
  --modality reading_order \
  --benchmark OmniDocBench \
  --output-dir ./benchmarks/OmniDocBench-e2e/ 

Reading order JSON

Visualize the report:

docling-eval visualize \
  --modality reading_order \
  --benchmark OmniDocBench \
  --output-dir ./benchmarks/OmniDocBench-e2e/ 

ARD report

Weighted ARD report

ARD plot

Weighted ARD plot

Markdown Text Evaluation

Create the evaluation report:

docling-eval evaluate \
  --modality markdown_text \
  --benchmark OmniDocBench \
  --output-dir ./benchmarks/OmniDocBench-e2e/ 

Markdown text JSON

Visualize the report:

docling-eval visualize \
  --modality markdown_text \
  --benchmark OmniDocBench \
  --output-dir ./benchmarks/OmniDocBench-e2e/ 

Markdown text report

BLEU plot

Edit distance plot

F1 plot

METEOR plot

Precision plot

Recall plot
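
The `evaluate` and `visualize` commands above follow the same pattern for every modality, so they can be scripted in one loop. The sketch below uses only the flags and directories shown in this document. The `DRY_RUN` variable and the helper functions `output_dir_for` and `run_step` are assumptions made for illustration, not part of `docling-eval`. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: run evaluate + visualize for every modality covered in this document.
# DRY_RUN, output_dir_for and run_step are illustrative, not docling-eval features.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Map a modality to the prediction directory used in this document.
output_dir_for() {
  case "$1" in
    table_structure) echo "./benchmarks/OmniDocBench-tables/" ;;
    layout|reading_order|markdown_text) echo "./benchmarks/OmniDocBench-e2e/" ;;
    *) echo "unknown modality: $1" >&2; return 1 ;;
  esac
}

# Print (dry run) or execute one docling-eval step for one modality.
run_step() {
  local step="$1" modality="$2" dir
  dir="$(output_dir_for "$modality")"
  set -- docling-eval "$step" --modality "$modality" \
    --benchmark OmniDocBench --output-dir "$dir"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for modality in layout table_structure reading_order markdown_text; do
  run_step evaluate "$modality"
  run_step visualize "$modality"
done
```

Running the script with `DRY_RUN=0` executes the commands instead of printing them. The table structure modality reads from the TableFormer predictions, while the other three read from the end-to-end Docling predictions, matching the commands listed above.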