Is Llama 3.2 3B or GLM 5 cheaper?

Llama 3.2 3B is cheaper on both input tokens ($0.15 vs $1.00 per 1M) and output tokens ($0.60 vs $3.20 per 1M).

Which has the larger context window, Llama 3.2 3B or GLM 5?

GLM 5 has the larger context window (198K tokens vs 128K for Llama 3.2 3B).

Llama 3.2 3B vs GLM 5

Llama 3.2 3B is cheaper on both input and output tokens, while GLM 5 offers a larger context window (198K vs 128K tokens). Choose between them based on the trade-off between cost, context, and the benchmarks that matter for your use case.

Spec	Llama 3.2 3B	GLM 5
Provider	Venice AI	Venice AI
Input / 1M tokens	$0.15	$1.00
Output / 1M tokens	$0.60	$3.20
Context window	128K	198K
Parameters	3B	744B
Open weights	Yes	Yes
Released	Oct 2024	Feb 2026
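
To see how the per-token prices translate into a bill, the arithmetic can be sketched as below. This is a rough estimate only: the prices are copied from the table above, while the `estimate_cost` helper and the workload of 10M input and 2M output tokens are hypothetical and chosen purely for illustration.

```python
# Prices in USD per 1M tokens, copied from the spec table above.
PRICES = {
    "Llama 3.2 3B": {"input": 0.15, "output": 0.60},
    "GLM 5": {"input": 1.00, "output": 3.20},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload for the given model."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000  # prices are quoted per 1M tokens


if __name__ == "__main__":
    # Hypothetical workload: 10M input tokens and 2M output tokens.
    for model in PRICES:
        cost = estimate_cost(model, 10_000_000, 2_000_000)
        print(f"{model}: ${cost:.2f}")
```

For this assumed workload the estimate comes to about $2.70 for Llama 3.2 3B and $16.40 for GLM 5, so the gap grows with volume. Actual bills may differ if the provider applies rounding, minimums, or discounts.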



Pricing is indicative; confirm with the provider before production use.