Models
29 models evaluated across 3 document AI benchmarks.

Rank  Model                 Organization  Overall
1     Nanonets OCR-3        Nanonets         85.9
2     GPT-5.4               OpenAI           83.5
3     Gemini-3-Pro          Google           82.8
4     Gemini-3-Flash        Google           82.0
5     Nanonets OCR2+        Nanonets         81.8
6     Gemini 3.1 Pro        Google           81.6
7     GPT-5.2               OpenAI           81.5
8     Claude Sonnet 4.6     Anthropic        80.7
9     Claude Opus 4.6       Anthropic        80.4
10    Qwen3-VL-Plus         Alibaba          80.1
11    Qwen3-VL-235B         Alibaba          79.6
12    Qwen3.5-9B            Alibaba          76.7
13    GPT-5-Mini            OpenAI           75.2
14    Qwen3.5-4B            Alibaba          72.5
15    Mistral Small 4       Mistral AI       71.5
16    Claude Haiku 4.5      Anthropic        71.2
17    Ministral-8B          Mistral AI       69.5
18    GPT-4.1               OpenAI           68.0
19    GLM-OCR               Zhipu AI         64.2
20    Qwen3.5-2B            Alibaba          62.6
21    Qwen3.5-0.8B          Alibaba          57.8
22    GPT-5-Nano            OpenAI           54.8
23    Gemma-4-E4B-it        Google           53.9
24    Llama-3.2-Vision-11B  Meta             50.8
25    Pixtral-12B           Mistral AI       46.5
26    Gemma-4-E2B-it        Google           41.9
0     Gemma-3-12B-IT        Google            0.0
0     Datalab Marker        Datalab           0.0
0     Qwen-VL-OCR           Alibaba           0.0