
Models

29 models evaluated across 3 document AI benchmarks.

| Rank | Model | Provider | Overall |
|---|---|---|---|
| 1 | Nanonets OCR-3 | Nanonets | 85.9 |
| 2 | GPT-5.4 | OpenAI | 83.5 |
| 3 | Gemini-3-Pro | Google | 82.8 |
| 4 | Gemini-3-Flash | Google | 82.0 |
| 5 | Nanonets OCR2+ | Nanonets | 81.8 |
| 6 | Gemini 3.1 Pro | Google | 81.6 |
| 7 | GPT-5.2 | OpenAI | 81.5 |
| 8 | Claude Sonnet 4.6 | Anthropic | 80.7 |
| 9 | Claude Opus 4.6 | Anthropic | 80.4 |
| 10 | Qwen3-VL-Plus | Alibaba | 80.1 |
| 11 | Qwen3-VL-235B | Alibaba | 79.6 |
| 12 | Qwen3.5-9B | Alibaba | 76.7 |
| 13 | GPT-5-Mini | OpenAI | 75.2 |
| 14 | Qwen3.5-4B | Alibaba | 72.5 |
| 15 | Mistral Small 4 | Mistral AI | 71.5 |
| 16 | Claude Haiku 4.5 | Anthropic | 71.2 |
| 17 | Ministral-8B | Mistral AI | 69.5 |
| 18 | GPT-4.1 | OpenAI | 68.0 |
| 19 | GLM-OCR | Zhipu AI | 64.2 |
| 20 | Qwen3.5-2B | Alibaba | 62.6 |
| 21 | Qwen3.5-0.8B | Alibaba | 57.8 |
| 22 | GPT-5-Nano | OpenAI | 54.8 |
| 23 | Gemma-4-E4B-it | Google | 53.9 |
| 24 | Llama-3.2-Vision-11B | Meta | 50.8 |
| 25 | Pixtral-12B | Mistral AI | 46.5 |
| 26 | Gemma-4-E2B-it | Google | 41.9 |
| 0 | Gemma-3-12B-IT | Google | 0.0 |
| 0 | Datalab Marker | Datalab | 0.0 |
| 0 | Qwen-VL-OCR | Alibaba | 0.0 |

Open benchmark for document AI models.

v1.5