A unified leaderboard for OCR, KIE, classification, QA, table extraction, and confidence score evaluation
The Intelligent Document Processing (IDP) Leaderboard provides a comprehensive evaluation framework for assessing the capabilities of various AI models in document understanding and processing tasks, covering critical aspects of document intelligence including OCR, KIE, classification, QA, table extraction, and confidence score evaluation.
This benchmark is included in the Intelligent Document Processing (IDP) Leaderboard, which assesses the performance of different models on the table extraction task. For a comprehensive evaluation of document understanding tasks, please visit the full leaderboard.
Table Extraction evaluates how well models can identify, understand, and extract tabular data from documents. This includes preserving table structure and the relationships between cells, and accurately extracting both numerical and textual content.
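The exact scoring metric behind the leaderboard is not specified in this document. As an illustration of what "preserving structure as well as content" means in practice, the sketch below scores a predicted table against a ground-truth table with a simple cell-level F1, where a cell counts as correct only if both its (row, column) position and its text match. This is a minimal stand-in, not the leaderboard's actual metric.

```python
# Illustrative metric only -- NOT the leaderboard's official scoring.
# A cell is matched only when its (row, column) position and its
# stripped text both agree, so structural errors are penalized too.

def cell_f1(pred, truth):
    """pred/truth: list of rows, each row a list of cell strings."""
    pred_cells = {(r, c, cell.strip())
                  for r, row in enumerate(pred)
                  for c, cell in enumerate(row)}
    truth_cells = {(r, c, cell.strip())
                   for r, row in enumerate(truth)
                   for c, cell in enumerate(row)}
    if not pred_cells or not truth_cells:
        return 0.0
    matched = len(pred_cells & truth_cells)
    if matched == 0:
        return 0.0
    precision = matched / len(pred_cells)
    recall = matched / len(truth_cells)
    return 2 * precision * recall / (precision + recall)

truth = [["Item", "Qty"], ["Widget", "3"], ["Gadget", "5"]]
pred = [["Item", "Qty"], ["Widget", "3"], ["Gadget", "4"]]  # one wrong cell
print(round(cell_f1(pred, truth), 3))  # 0.833: 5 of 6 cells match
```

Position-sensitive matching like this is deliberately strict; real table-extraction metrics typically add tolerance for merged cells and spanning structure.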
Rank | Model | Avg | nanonets_small_sparse_structured | nanonets_small_dense_structured | nanonets_small_sparse_unstructured | nanonets_long_dense_structured | nanonets_long_sparse_structured | nanonets_long_sparse_unstructured |
---|---|---|---|---|---|---|---|---|
1 | claude-sonnet-4 | 93.44 | 98.14 | 98.98 | 83.95 | 92.75 | 94.87 | 91.96 |
2 | claude-3.7-sonnet (reasoning:low) | 91.23 | 98.29 | 99.06 | 84.89 | 92.82 | 92.92 | 79.38 |
3 | gemini-2.5-pro-preview-03-25 (reasoning: low) | 79.51 | 81.80 | 86.58 | 72.58 | 91.72 | 88.36 | 55.99 |
4 | qwen2.5-vl-32b-instruct | 77.46 | 99.07 | 98.89 | 34.55 | 89.22 | 86.29 | 56.74 |
5 | gemini-2.5-flash-preview-04-17 | 75.82 | 89.00 | 94.30 | 60.36 | 89.99 | 84.33 | 36.92 |
6 | gpt-4.1-2025-04-14 | 74.34 | 90.23 | 97.02 | 66.22 | 75.79 | 69.99 | 46.76 |
7 | llama-4-maverick | 74.15 | 89.94 | 97.90 | 52.57 | 92.50 | 86.24 | 25.74 |
8 | gemini-2.0-flash | 71.32 | 86.35 | 93.09 | 52.12 | 85.07 | 72.70 | 38.62 |
9 | o4-mini-2025-04-16 | 70.76 | 95.48 | 98.70 | 66.64 | 66.56 | 68.51 | 28.65 |
10 | mistral-medium-3 | 70.21 | 82.58 | 97.30 | 65.61 | 75.60 | 64.33 | 35.86 |
11 | gpt-4o-2024-08-06 | 64.30 | 76.14 | 94.11 | 61.00 | 65.04 | 54.11 | 35.38 |
12 | mistral-small-3.1-24b-instruct | 61.64 | 72.19 | 89.96 | 58.10 | 64.95 | 57.52 | 27.13 |
13 | gpt-4o-2024-11-20 | 60.74 | 76.46 | 92.10 | 61.62 | 53.02 | 49.83 | 31.39 |
14 | InternVL3-38B-Instruct | 58.03 | 84.37 | 92.19 | 74.46 | 32.00 | 33.31 | 31.85 |
15 | gemma-3-27b-it | 52.38 | 77.11 | 88.60 | 56.01 | 39.42 | 28.55 | 24.57 |
16 | gpt-4.1-nano-2025-04-14 | 50.83 | 68.48 | 89.32 | 47.21 | 50.82 | 33.23 | 15.89 |
17 | gpt-4o-mini-2024-07-18 | 50.47 | 57.31 | 82.50 | 51.34 | 53.37 | 32.60 | 25.67 |
18 | qwen2.5-vl-72b-instruct | 48.58 | 63.03 | 80.77 | 59.60 | 28.45 | 33.41 | 26.21 |
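The Avg column is consistent with an unweighted mean of the six dataset scores. A quick sanity check against the claude-sonnet-4 row:

```python
# Per-dataset scores for claude-sonnet-4, copied from the table above.
scores = [98.14, 98.98, 83.95, 92.75, 94.87, 91.96]

# Unweighted mean, rounded to two decimals as in the Avg column.
avg = round(sum(scores) / len(scores), 2)
print(avg)  # 93.44, matching the reported Avg
```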