Aider Polyglot Leaderboard
Coding · 40 models · Updated June 2026
On the Aider Polyglot benchmark, GPT-5 ranks #1 with a score of 88.0, while DeepSeek V3.2 (score 74.2) offers the best score per dollar among models with a listed nonzero price, at $0.34 per 1M output tokens. The full ranking, with output price per million tokens and context window where listed, is below.
💰 Best value
DeepSeek: DeepSeek V3.2, score 74.2 at $0.34 per 1M output tokens
| # | Model | Provider | Score | Output price / 1M tokens | Context window |
|---|---|---|---|---|---|
| 1 | GPT-5 | OpenAI | 88.0 | — | — |
| 2 | OpenAI o3-pro (2025-06-10) | NanoGPT | 84.9 | $19.99 | 200K |
| 3 | Google: Gemini 2.5 Pro Preview 06-05 | Google | 83.1 | $10.00 | 1M |
| 4 | o3 | OpenAI | 81.3 | — | — |
| 5 | Grok 4 | xAI | 79.6 | — | — |
| 6 | Google: Gemini 2.5 Pro Preview 05-06 | Google | 76.9 | $10.00 | 1M |
| 7 | DeepSeek: DeepSeek V3.2 | DeepSeek | 74.2 | $0.34 | 131K |
| 8 | OpenAI o4-mini high | NanoGPT | 72.0 | $4.40 | 200K |
| 9 | Claude Opus 4 | Anthropic | 72.0 | — | — |
| 10 | DeepSeek-R1 (May 2025) | DeepSeek | 71.4 | — | — |
| 11 | Claude 3.7 Sonnet | Anthropic | 64.9 | — | — |
| 12 | o1 | OpenAI | 61.7 | — | — |
| 13 | Claude Sonnet 4 | Anthropic | 61.3 | — | — |
| 14 | o3-mini | OpenAI | 60.4 | — | — |
| 15 | Kimi K2 Thinking | Moonshot | 59.1 | — | — |
| 16 | DeepSeek-V3 (Mar 2025) | DeepSeek | 55.1 | — | — |
| 17 | gemini-2.5-flash-preview-05-20 | Jiekou.AI | 55.1 | $3.15 | 1M |
| 18 | Grok 3 | xAI | 53.3 | — | — |
| 19 | GPT 4.1 | NanoGPT | 52.4 | $8.00 | 1M |
| 20 | Claude 3.5 Sonnet | Anthropic | 51.6 | — | — |
| 21 | Grok 3 Mini | GitHub Models | 49.3 | $0.00 | 128K |
| 22 | chatgpt-4o-latest | 302.AI | 45.3 | $15.00 | 128K |
| 23 | GPT-4.5 | OpenAI | 44.9 | — | — |
| 24 | gpt-oss-120b | OpenAI | 41.8 | — | — |
| 25 | Qwen: Qwen3 32B | Qwen | 40.0 | $0.28 | 131K |
| 26 | o1-mini | OpenAI | 32.9 | — | — |
| 27 | GPT 4.1 Mini | NanoGPT | 32.4 | $1.60 | 1M |
| 28 | Claude 3.5 Haiku | Qiniu | 28.0 | — | 200K |
| 29 | GPT-4o (Mar 2025) | OpenAI | 23.1 | — | — |
| 30 | Gemini 2.0 Flash | Qiniu | 22.2 | — | 1M |
| 31 | Qwen Max | Alibaba (China) | 21.8 | $1.38 | 131K |
| 32 | QwQ-32B | Alibaba | 20.9 | — | — |
| 33 | DeepSeek-V2.5 | DeepSeek | 17.8 | — | — |
| 34 | Qwen2.5 Coder 32B Instruct | — | 16.4 | $1.00 | 128K |
| 35 | Llama 4 Maverick | Meta AI | 15.6 | — | — |
| 36 | Yi-Lightning | 01.AI | 12.9 | — | — |
| 37 | Codestral 25.01 | GitHub Models | 11.1 | $0.00 | 32K |
| 38 | GPT 4.1 Nano | NanoGPT | 8.9 | $0.40 | 1M |
| 39 | Gemma 3 27B IT | NanoGPT | 4.9 | $0.30 | 128K |
| 40 | GPT-4o mini | OpenAI | 3.6 | — | — |
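The best-value pick can be sanity-checked from the priced rows of the table. The page does not say how "score per dollar" is computed, so the sketch below rests on an assumption: it divides the benchmark score by the output price per 1M tokens and skips rows listed at $0.00, where the ratio is undefined. The names `PRICED_ROWS` and `score_per_dollar` are illustrative, not part of any leaderboard tooling.

```python
# Rows copied from the table that list an output price:
# (model, score, USD per 1M output tokens). Rows without a price are omitted.
PRICED_ROWS = [
    ("OpenAI o3-pro (2025-06-10)", 84.9, 19.99),
    ("Google: Gemini 2.5 Pro Preview 06-05", 83.1, 10.00),
    ("Google: Gemini 2.5 Pro Preview 05-06", 76.9, 10.00),
    ("DeepSeek: DeepSeek V3.2", 74.2, 0.34),
    ("OpenAI o4-mini high", 72.0, 4.40),
    ("gemini-2.5-flash-preview-05-20", 55.1, 3.15),
    ("GPT 4.1", 52.4, 8.00),
    ("Grok 3 Mini", 49.3, 0.00),
    ("chatgpt-4o-latest", 45.3, 15.00),
    ("Qwen: Qwen3 32B", 40.0, 0.28),
    ("GPT 4.1 Mini", 32.4, 1.60),
    ("Qwen Max", 21.8, 1.38),
    ("Qwen2.5 Coder 32B Instruct", 16.4, 1.00),
    ("Codestral 25.01", 11.1, 0.00),
    ("GPT 4.1 Nano", 8.9, 0.40),
    ("Gemma 3 27B IT", 4.9, 0.30),
]


def score_per_dollar(rows):
    """Return (model, score / price) pairs sorted best first.

    Assumption: value = score divided by output price. Rows priced at
    $0.00 are skipped because the ratio is undefined for them.
    """
    ratios = [(model, score / price) for model, score, price in rows if price > 0]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


ranking = score_per_dollar(PRICED_ROWS)
best_model, best_ratio = ranking[0]
print(f"{best_model}: {best_ratio:.1f} score points per dollar")
```

Under this assumption, DeepSeek: DeepSeek V3.2 comes out on top at roughly 218 score points per dollar, followed by Qwen: Qwen3 32B at roughly 143. A different definition, for example one that weights input-token price, could reorder the list.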
Pricing is indicative; confirm with the provider before production use.