OmniDocBench v1.5
Built by OpenDataLab. The benchmark covers 1,355 pages from papers, books, slides, exams, newspapers, and magazines. It scores text extraction via edit distance, formula recognition via CDM, table structure via TEDS, and reading order via edit distance. Only the first three enter the Overall score; TEDS-S and Read Order are reported separately.
Overall Score = ((1 − Text Edit) × 100 + Table TEDS + Formula CDM) / 3
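The Overall formula can be checked against any row of the rankings. The sketch below is a minimal illustration, not the benchmark's own code: the function name `overall_score` and its parameter names are assumptions, and the inputs are the published (rounded) values from the Gemini-3-Flash row.

```python
def overall_score(text_edit: float, table_teds: float, formula_cdm: float) -> float:
    """Combine the three component metrics into the Overall score.

    text_edit is an edit distance in [0, 1] (lower is better);
    table_teds and formula_cdm are already on a 0-100 scale.
    """
    return ((1.0 - text_edit) * 100.0 + table_teds + formula_cdm) / 3.0


# Values taken from the Gemini-3-Flash row of the rankings table.
score = overall_score(text_edit=0.077, table_teds=87.7, formula_cdm=90.2)
print(round(score, 1))  # 90.1
```

Because the published columns are themselves rounded, recomputing Overall this way may differ from the listed value in the last digit for some rows.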
Rankings
| # | Model | Overall | Text Edit↓ | CDM↑ | TEDS↑ | TEDS-S↑ | Read Order↓ |
|---|---|---|---|---|---|---|---|
| 1 | Gemini-3-Flash (Google) | 90.1 | 0.077 | 90.2 | 87.7 | 92.6 | 0.081 |
| 2 | Nanonets OCR2+ (Nanonets) | 89.5 | 0.056 | 90.3 | 79.1 | 83.6 | 0.090 |
| 3 | Gemini-3-Pro (Google) | 88.8 | 0.078 | 87.3 | 87.0 | 91.7 | 0.084 |
| 4 | GPT-5.2 (OpenAI) | 88.0 | 0.111 | 90.1 | 84.9 | 89.5 | 0.098 |
| 5 | Claude Sonnet 4.6 (Anthropic) | 86.9 | 0.165 | 90.2 | 87.1 | 91.2 | 0.149 |
| 6 | Claude Opus 4.6 (Anthropic) | 85.9 | 0.151 | 88.5 | 84.4 | 89.1 | 0.136 |
| 7 | Datalab Marker (Datalab) | 85.5 | 0.109 | 88.3 | 79.1 | 83.7 | 0.106 |
| 8 | Gemini 3.1 Pro (Google) | 85.3 | 0.082 | 83.3 | 80.8 | 85.4 | 0.073 |
| 9 | GPT-5.4 (OpenAI) | 85.3 | 0.089 | 83.4 | 81.3 | 86.7 | 0.077 |
| 10 | GPT-5-Mini (OpenAI) | 82.5 | 0.138 | 86.7 | 74.6 | 80.1 | 0.121 |
| 11 | GPT-4.1 (OpenAI) | 79.9 | 0.167 | 82.2 | 74.0 | 83.8 | 0.115 |
| 12 | Claude Haiku 4.5 (Anthropic) | 79.6 | 0.224 | 84.2 | 77.1 | 83.8 | 0.178 |
| 13 | Ministral-8B (Mistral AI) | 78.3 | 0.157 | 83.3 | 67.1 | 73.8 | 0.125 |
| 14 | GLM-OCR (Zhipu AI) | 69.2 | 0.144 | 84.7 | 37.4 | 39.3 | 0.141 |
| 15 | GPT-5-Nano (OpenAI) | 63.4 | 0.319 | 61.0 | 61.2 | 69.5 | 0.243 |
| 16 | Gemma-3-12B-IT (Google) | 44.6 | 0.476 | 50.0 | 31.6 | 46.9 | 0.364 |
| 17 | Llama-3.2-Vision-11B (Meta) | 44.6 | 0.541 | 55.4 | 32.6 | 42.9 | 0.340 |
| 18 | Pixtral-12B (Mistral AI) | 42.3 | 0.641 | 58.8 | 32.1 | 50.8 | 0.422 |
Metrics
Text Edit: Character-level edit distance between predicted and ground-truth text blocks. Lower values indicate more accurate text extraction.
CDM: Character Detection Matching score for display formulas. Measures the structural and symbolic accuracy of recognized mathematical expressions. Higher is better.
TEDS: Tree Edit Distance-based Similarity for tables. Evaluates both the content and the structure of extracted tables. Higher is better.
TEDS-S: Structure-only TEDS that ignores cell content. Focuses purely on table layout and cell spanning. Higher is better.
Read Order: Edit distance measuring how well the model preserves the correct reading order across multi-column and complex layouts. Lower is better.
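Two of the metrics above are edit distances: one over characters (text extraction) and one over the sequence of page blocks (reading order). The sketch below shows the general idea with a plain Levenshtein distance. It is an illustration under stated assumptions, not the benchmark's implementation: the function names are hypothetical, dividing by the longer sequence length is an assumed normalization, and the block IDs are made up.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: Sequence, truth: Sequence) -> float:
    """Edit distance scaled to [0, 1]; dividing by the longer length is an assumption."""
    longest = max(len(pred), len(truth))
    return levenshtein(pred, truth) / longest if longest else 0.0


# Text extraction: compare characters (3 edits over 7 characters).
print(normalized_edit_distance("kitten", "sitting"))

# Reading order: compare sequences of hypothetical block IDs.
print(normalized_edit_distance([0, 2, 1, 3], [0, 1, 2, 3]))  # 0.5
```

Under this normalization, 0 means the prediction matches the ground truth exactly and 1 means nothing lines up, which agrees with the "lower is better" direction of both columns.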