Is Qwen: Qwen3 Max Thinking or DeepSeek: R1 cheaper?

DeepSeek: R1 is cheaper on both input tokens ($0.70 vs $0.78 per 1M) and output tokens ($2.50 vs $3.90 per 1M).

Which has the larger context window, Qwen: Qwen3 Max Thinking or DeepSeek: R1?

Qwen: Qwen3 Max Thinking has the larger context window (262K vs 64K tokens).

Qwen: Qwen3 Max Thinking vs DeepSeek: R1

DeepSeek: R1 is cheaper on both input and output tokens, while Qwen: Qwen3 Max Thinking offers a context window roughly four times larger (262K vs 64K tokens). Choose between them based on the trade-off between cost, context, and the benchmarks that matter for your use case.

Spec	Qwen: Qwen3 Max Thinking	DeepSeek: R1
Provider	Kilo Gateway	Kilo Gateway
Input / 1M tokens	$0.78	$0.70
Output / 1M tokens	$3.90	$2.50
Context window	262K	64K
Parameters	—	671B
Open weights	No	Yes
Released	Jan 2026	Jan 2025
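
To make the cost and context trade-off concrete, the per-request arithmetic can be sketched as below. This is a minimal illustration, not a billing tool: the prices are copied from the table, while the function names, the reading of "K" as 1,000 tokens, and the assumption that input and output tokens both count toward the context window are illustrative choices that may not match the provider's actual accounting.

```python
# Prices in USD per 1M tokens and context sizes, copied from the table.
# Assumption: "262K" and "64K" are read as 262,000 and 64,000 tokens.
MODELS = {
    "Qwen: Qwen3 Max Thinking": {"input": 0.78, "output": 3.90, "context": 262_000},
    "DeepSeek: R1": {"input": 0.70, "output": 2.50, "context": 64_000},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the indicative cost in USD of one request (hypothetical helper)."""
    price = MODELS[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def fits_context(model: str, input_tokens: int, output_tokens: int) -> bool:
    """Check whether a request fits the context window.

    Assumption: input and output tokens share the same window.
    """
    return input_tokens + output_tokens <= MODELS[model]["context"]


if __name__ == "__main__":
    # Example workload: 10,000 input tokens and 2,000 output tokens.
    for name in MODELS:
        cost = request_cost(name, 10_000, 2_000)
        ok = fits_context(name, 10_000, 2_000)
        print(f"{name}: ${cost:.4f} per request, fits context: {ok}")
```

For a request with 10,000 input tokens and 2,000 output tokens, these prices work out to about $0.0156 on Qwen: Qwen3 Max Thinking and $0.0120 on DeepSeek: R1. A 100,000-token prompt would fit only the Qwen model under the assumptions above.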


FAQ

Pricing is indicative — confirm with the provider before production use. Highlighted values indicate the better figure for that row.