Aider Polyglot Leaderboard

Coding · 40 models · Updated June 2026

On the Aider Polyglot benchmark, GPT-5 ranks #1 with a score of 88.0. Among models with a non-zero listed price, DeepSeek: DeepSeek V3.2 offers the best score-per-dollar, scoring 74.2 at $0.34/1M output tokens. The full ranking is below, with output cost per million tokens and context window where listed.

💰 Best value
DeepSeek: DeepSeek V3.2 (score 74.2, $0.34/1M output tokens)
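
| # | Model | Provider | Score | Output price (USD / 1M tokens) | Context window |
|---|-------|----------|-------|--------------------------------|----------------|
| 1 | GPT-5 | OpenAI | 88.0 | | |
| 2 | OpenAI o3-pro (2025-06-10) | NanoGPT | 84.9 | $19.99 | 200K |
| 3 | Google: Gemini 2.5 Pro Preview 06-05 | Google | 83.1 | $10.00 | 1M |
| 4 | o3 | OpenAI | 81.3 | | |
| 5 | Grok 4 | xAI | 79.6 | | |
| 6 | Google: Gemini 2.5 Pro Preview 05-06 | Google | 76.9 | $10.00 | 1M |
| 7 | DeepSeek: DeepSeek V3.2 | DeepSeek | 74.2 | $0.34 | 131K |
| 8 | OpenAI o4-mini high | NanoGPT | 72.0 | $4.40 | 200K |
| 9 | Claude Opus 4 | Anthropic | 72.0 | | |
| 10 | DeepSeek-R1 (May 2025) | DeepSeek | 71.4 | | |
| 11 | Claude 3.7 Sonnet | Anthropic | 64.9 | | |
| 12 | o1 | OpenAI | 61.7 | | |
| 13 | Claude Sonnet 4 | Anthropic | 61.3 | | |
| 14 | o3-mini | OpenAI | 60.4 | | |
| 15 | Kimi K2 Thinking | Moonshot | 59.1 | | |
| 16 | DeepSeek-V3 (Mar 2025) | DeepSeek | 55.1 | | |
| 17 | gemini-2.5-flash-preview-05-20 | Jiekou.AI | 55.1 | $3.15 | 1M |
| 18 | Grok 3 | xAI | 53.3 | | |
| 19 | GPT 4.1 | NanoGPT | 52.4 | $8.00 | 1M |
| 20 | Claude 3.5 Sonnet | Anthropic | 51.6 | | |
| 21 | Grok 3 Mini | GitHub Models | 49.3 | $0.00 | 128K |
| 22 | chatgpt-4o-latest | 302.AI | 45.3 | $15.00 | 128K |
| 23 | GPT-4.5 | OpenAI | 44.9 | | |
| 24 | gpt-oss-120b | OpenAI | 41.8 | | |
| 25 | Qwen: Qwen3 32B | Qwen | 40.0 | $0.28 | 131K |
| 26 | o1-mini | OpenAI | 32.9 | | |
| 27 | GPT 4.1 Mini | NanoGPT | 32.4 | $1.60 | 1M |
| 28 | Claude 3.5 Haiku | Qiniu | 28.0 | | 200K |
| 29 | GPT-4o (Mar 2025) | OpenAI | 23.1 | | |
| 30 | Gemini 2.0 Flash | Qiniu | 22.2 | | 1M |
| 31 | Qwen Max | Alibaba (China) | 21.8 | $1.38 | 131K |
| 32 | QwQ-32B | Alibaba | 20.9 | | |
| 33 | DeepSeek-V2.5 | DeepSeek | 17.8 | | |
| 34 | Qwen2.5 Coder 32B Instruct | | 16.4 | $1.00 | 128K |
| 35 | Llama 4 Maverick | Meta AI | 15.6 | | |
| 36 | Yi-Lightning | 01.AI | 12.9 | | |
| 37 | Codestral 25.01 | GitHub Models | 11.1 | $0.00 | 32K |
| 38 | GPT 4.1 Nano | NanoGPT | 8.9 | $0.40 | 1M |
| 39 | Gemma 3 27B IT | NanoGPT | 4.9 | $0.30 | 128K |
| 40 | GPT-4o mini | OpenAI | 3.6 | | |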
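
The best-value pick can be sanity-checked from the ranking above. The page does not define "score-per-dollar", so the sketch below assumes it means benchmark score divided by the listed output price per 1M tokens, and it assumes rows priced at $0.00 are left out because the ratio is undefined for them. The names `priced_rows`, `score_per_dollar`, and `best_value` are hypothetical, and only a sample of the priced rows is included.

```python
# Sample of (model, score, output price in USD per 1M tokens) copied from the ranking.
priced_rows = [
    ("OpenAI o3-pro (2025-06-10)", 84.9, 19.99),
    ("Google: Gemini 2.5 Pro Preview 06-05", 83.1, 10.00),
    ("DeepSeek: DeepSeek V3.2", 74.2, 0.34),
    ("OpenAI o4-mini high", 72.0, 4.40),
    ("Grok 3 Mini", 49.3, 0.00),
    ("Qwen: Qwen3 32B", 40.0, 0.28),
    ("GPT 4.1 Nano", 8.9, 0.40),
]


def score_per_dollar(score, price):
    """Assumed metric: benchmark points per USD of 1M output tokens.

    Returns None for a zero (or negative) price, where the ratio is undefined.
    """
    if price <= 0:
        return None
    return score / price


def best_value(rows):
    """Return (model, ratio) for the highest ratio among rows with a usable price."""
    ranked = [(name, score_per_dollar(score, price)) for name, score, price in rows]
    ranked = [(name, ratio) for name, ratio in ranked if ratio is not None]
    return max(ranked, key=lambda item: item[1])


name, value = best_value(priced_rows)
print(f"{name}: {value:.1f} points per dollar")
# prints "DeepSeek: DeepSeek V3.2: 218.2 points per dollar"
```

Under this assumed definition, Qwen: Qwen3 32B comes second at roughly 142.9 points per dollar. A different definition, for example one that also counts input-token pricing, could change the ordering.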

Pricing is indicative; confirm with the provider before production use.