Intelligent Document Processing Leaderboard
A document AI leaderboard comparing leading models across OCR, table extraction, key information extraction, and visual QA. Compare performance, accuracy, and cost.
This work is sponsored by Nanonets.
| # | Model | Vendor | Overall | OlmOCR | OmniDoc | IDP | Size |
|---|---|---|---|---|---|---|---|
| 1 | Nanonets OCR2+ | Nanonets | 81.8 | 82.2 | 89.5 | 73.8 | — |
| 2 | Gemini-3-Pro | Google | 81.4 | 73.5 | 88.8 | 81.8 | — |
| 3 | Claude Sonnet 4.6 | Anthropic | 80.8 | 74.4 | 86.9 | 81.2 | — |
| 4 | Claude Opus 4.6 | Anthropic | 80.3 | 73.9 | 85.9 | 81.1 | — |
| 5 | Gemini-3-Flash | Google | 79.9 | 69.2 | 90.1 | 80.5 | — |
| 6 | GPT-5.2 | OpenAI | 79.2 | 72.2 | 88.0 | 77.4 | — |
| 7 | GPT-5-Mini | OpenAI | 70.8 | 56.7 | 82.5 | 73.3 | — |
| 8 | GPT-4.1 | OpenAI | 70.0 | 55.5 | 79.9 | 74.7 | — |
| 9 | Claude Haiku 4.5 | Anthropic | 69.6 | 56.2 | 79.6 | 72.9 | — |
| 10 | Ministral-8B | Mistral AI | 68.0 | 57.8 | 78.3 | 67.9 | 8B |
| 11 | GLM-OCR | Zhipu AI | 63.6 | 66.7 | 69.2 | 54.9 | — |
| 12 | GPT-5-Nano | OpenAI | 50.7 | 22.8 | 63.4 | 65.8 | — |
| 13 | Llama-3.2-Vision-11B | Meta | 50.1 | 47.2 | 44.6 | 58.6 | 11B |
| 14 | Pixtral-12B | Mistral AI | 46.0 | 36.8 | 42.3 | 59.0 | 12B |
About the Leaderboard
The Intelligent Document Processing (IDP) Leaderboard evaluates AI models on document understanding and processing tasks. All models are evaluated using identical prompts, images, and scoring pipelines. The overall score is the mean of the benchmark scores shown in the table (OlmOCR, OmniDoc, and IDP).
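As a quick check of how the overall score relates to the per-benchmark columns, the following minimal sketch recomputes it for three rows of the table. The function name `overall_score` is hypothetical, and rounding to one decimal place is an assumption inferred from how the table displays its values; the leaderboard's actual scoring pipeline is not shown here.

```python
# Benchmark scores copied from three rows of the leaderboard table.
scores = {
    "Nanonets OCR2+": {"OlmOCR": 82.2, "OmniDoc": 89.5, "IDP": 73.8},
    "Gemini-3-Pro": {"OlmOCR": 73.5, "OmniDoc": 88.8, "IDP": 81.8},
    "Pixtral-12B": {"OlmOCR": 36.8, "OmniDoc": 42.3, "IDP": 59.0},
}


def overall_score(benchmarks):
    """Unweighted mean of the benchmark scores.

    Rounding to one decimal is an assumption based on the table's display.
    """
    return round(sum(benchmarks.values()) / len(benchmarks), 1)


for model, benchmarks in scores.items():
    print(f"{model}: {overall_score(benchmarks)}")
# prints:
# Nanonets OCR2+: 81.8
# Gemini-3-Pro: 81.4
# Pixtral-12B: 46.0
```

Because the mean is unweighted, a model that leads on one benchmark can still rank first overall while trailing on another, as the top row does on the IDP column.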