LiveBench Leaderboard
reasoning/coding · 71 models · Updated June 2026
On the LiveBench benchmark, GPT-5.5 ranks #1 with a score of 81.3. Among models with a listed, nonzero output price, Qwen: Qwen3 235B A22B Thinking 2507 offers the best score per dollar: 52.9 at $0.10 per 1M output tokens. The full ranking, with output cost per million tokens and context window, is below.
💰 Best value
Qwen: Qwen3 235B A22B Thinking 2507 — score 52.9 at $0.10/1M output tokens
| # | Model | Provider | Score | Output price (USD / 1M tokens) | Context |
|---|---|---|---|---|---|
| 1 | GPT-5.5 | OpenAI | 81.3 | — | — |
| 2 | GPT-5.4 | OpenAI | 80.9 | — | — |
| 3 | Gemini 3.1 Pro | Google DeepMind | 80.7 | — | — |
| 4 | Claude Opus 4.8 Thinking | NanoGPT | 79.5 | $25.01 | 1M |
| 5 | Claude Fable 5 | Anthropic | 78.6 | — | — |
| 6 | Claude Opus 4.7 | Anthropic | 77.1 | — | — |
| 7 | Anthropic: Claude Opus 4.6 (Fast) | Anthropic | 76.8 | $150.00 | 1M |
| 8 | Z.ai: GLM 5.2 | Z.ai | 76.2 | $3.00 | 1M |
| 9 | Claude Opus 4.5 | Anthropic | 76.0 | — | — |
| 10 | Gemini 3.5 Flash | NanoGPT | 75.8 | $9.00 | 1M |
| 11 | Claude Sonnet 4.6 Thinking | NanoGPT | 75.7 | $14.99 | 1M |
| 12 | GPT-5.2 | OpenAI | 75.4 | — | — |
| 13 | Qwen3.7 Max Thinking | NanoGPT | 75.2 | $7.50 | 1M |
| 14 | DeepSeek-V4-Pro | DeepSeek | 74.4 | — | — |
| 15 | GPT-5.1-Codex-Max | OpenAI | 74.4 | — | — |
| 16 | GPT-5.2 Codex | OpenAI | 74.3 | — | — |
| 17 | Gemini 3 Pro | Google DeepMind | 73.5 | — | — |
| 18 | GPT-5.3 Codex | OpenAI | 73.2 | — | — |
| 19 | Gemini 3 Flash Thinking | NanoGPT | 73.0 | $3.00 | 1M |
| 20 | GPT-5.1 | Poe | 72.6 | $9.00 | 400K |
| 21 | Kimi K2.6 | Moonshot | 72.4 | — | — |
| 22 | MoonshotAI: Kimi K2.7 Code | MoonshotAI | 71.9 | $4.00 | 262K |
| 23 | GPT 5.4 Nano | NanoGPT | 71.3 | $1.25 | 400K |
| 24 | GPT-5 Pro | OpenAI | 71.3 | — | — |
| 25 | Qwen3.6 Plus | Alibaba Token Plan | 70.8 | $0.00 | 1M |
| 26 | GLM-5.1 | Z.ai (Zhipu AI) | 70.6 | — | — |
| 27 | MiniMax M3 Thinking | NanoGPT | 70.0 | $1.20 | 512K |
| 28 | Grok Build 0.1 | NanoGPT | 69.6 | $2.00 | 256K |
| 29 | GPT-5.1-Codex | OpenAI | 69.3 | — | — |
| 30 | Kimi K2.5 | Moonshot | 69.2 | — | — |
| 31 | Grok 4.20 (Reasoning) | xAI | 69.0 | $2.50 | 1M |
| 32 | GLM-5 | Z.ai (Zhipu AI) | 68.7 | — | — |
| 33 | Claude Sonnet 4.5 | Anthropic | 67.9 | — | — |
| 34 | GPT 5.4 Mini | NanoGPT | 67.7 | $4.50 | 400K |
| 35 | DeepSeek-V4-Flash | DeepSeek | 67.7 | — | — |
| 36 | Grok 4.3 | NanoGPT | 67.4 | $2.50 | 1M |
| 37 | GPT-5 mini | OpenAI | 66.6 | — | — |
| 38 | Qwen: Qwen3.6 27B | Qwen | 65.6 | $2.39 | 262K |
| 39 | MiniMax M2.7 | NanoGPT | 65.0 | $1.20 | 205K |
| 40 | DeepSeek: DeepSeek V3.2 | DeepSeek | 63.1 | $0.34 | 131K |
| 41 | Gemma 4 31B IT | Lilac | 62.4 | $0.35 | 262K |
| 42 | Kimi K2 Thinking | Moonshot | 62.3 | — | — |
| 43 | Gemini 3.1 Flash Lite | NanoGPT | 62.1 | $1.50 | 1M |
| 44 | grok-4-0709 | Jiekou.AI | 61.8 | $13.50 | 256K |
| 45 | Claude 4.1 Opus Thinking (32K) | NanoGPT | 61.4 | $75.00 | 200K |
| 46 | Claude Haiku 4.5 | Anthropic | 61.0 | — | — |
| 47 | GPT 5.1 Codex Mini | NanoGPT | 60.8 | $2.00 | 400K |
| 48 | Claude 4 Sonnet | NanoGPT | 60.6 | $14.99 | 200K |
| 49 | Qwen: Qwen3.6 Flash | Qwen | 60.5 | $1.13 | 1M |
| 50 | MiniMax M2.5 | NanoGPT | 60.3 | $1.20 | 205K |
| 51 | Grok 4.1 Fast | xAI | 60.1 | — | — |
| 52 | GPT-5.3-Instant | Poe | 59.8 | $13.00 | 128K |
| 53 | MiMo-V2-Pro | Xiaomi Corp | 58.4 | — | — |
| 54 | Google: Gemini 2.5 Pro Preview 06-05 | Google | 57.5 | $10.00 | 1M |
| 55 | GLM-4.7 | Z.ai (Zhipu AI) | 57.3 | — | — |
| 56 | GLM-4.6 | Z.ai (Zhipu AI), Tsinghua University | 54.7 | — | — |
| 57 | Qwen: Qwen3 235B A22B Thinking 2507 | Qwen | 52.9 | $0.10 | 262K |
| 58 | Gemini 2.5 Flash Preview | NanoGPT | 52.3 | $0.60 | 1M |
| 59 | Qwen: Qwen3 Next 80B A3B Thinking | Qwen | 51.0 | $0.78 | 262K |
| 60 | NVIDIA: Nemotron 3 Ultra | NVIDIA | 50.7 | $2.20 | 1M |
| 61 | GPT-5 nano | OpenAI | 48.8 | — | — |
| 62 | GLM 5V Turbo Thinking | NanoGPT | 48.8 | $4.00 | 203K |
| 63 | gpt-oss-120b | OpenAI | 46.4 | — | — |
| 64 | Qwen: Qwen3 32B | Qwen | 42.7 | $0.28 | 131K |
| 65 | Gemini 2.5 Flash Lite | NanoGPT | 41.5 | $0.40 | 1M |
| 66 | Z.ai: GLM 4.6V | Z.ai | 38.9 | $0.90 | 131K |
| 67 | Qwen: Qwen3 30B A3B | Qwen | 38.8 | $0.50 | 131K |
| 68 | Mistral: Devstral 2 2512 | Mistral | 38.8 | $2.00 | 262K |
| 69 | Grok 4.20 (Non-Reasoning) | xAI | 37.9 | $2.50 | 1M |
| 70 | Nemotron 3 Super 120B A12B | Synthetic | 32.0 | $1.00 | 262K |
| 71 | X-Ai/Grok 4.1 Fast Non Reasoning | Qiniu | 31.6 | — | 2M |
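The page does not say how "best value" is computed. As a rough sketch, assuming it means LiveBench score divided by output price in USD per 1M tokens, the pick can be reproduced from a few rows of the table. The `ROWS` sample, the `score_per_dollar` helper, and the choice to skip unpriced and $0.00 rows are illustrative assumptions, not the site's actual method.

```python
# A handful of rows copied from the table above:
# (model, LiveBench score, output price in USD per 1M tokens).
# None stands for a price the table leaves blank.
ROWS = [
    ("GPT-5.5", 81.3, None),
    ("Z.ai: GLM 5.2", 76.2, 3.00),
    ("Qwen3.6 Plus", 70.8, 0.00),
    ("DeepSeek: DeepSeek V3.2", 63.1, 0.34),
    ("Gemma 4 31B IT", 62.4, 0.35),
    ("Qwen: Qwen3 235B A22B Thinking 2507", 52.9, 0.10),
]


def score_per_dollar(rows):
    """Return (model, score / price) pairs, best first.

    Assumption: rows with no price or a $0.00 price are skipped,
    because the ratio is undefined for them.
    """
    ranked = [(model, score / price) for model, score, price in rows if price]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


ranking = score_per_dollar(ROWS)
for model, value in ranking:
    print(f"{model}: {value:.1f} points per dollar")
```

Under this definition a cheap mid-table model can outrank the leaders on value, so the ratio is best read alongside the raw score rather than in place of it.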
Pricing is indicative; confirm with the provider before production use.