Model cost calculator
Pricing verified 2026-06-04.
Cheapest for this workload: DeepSeek V4, $1.14 per month.
Ranked cheapest → priciest
How the math works
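For each model, the per-request input cost depends on how many of your input tokens are served from the prefix cache, since cached input is typically billed at a lower rate than uncached input. The effective input rate is therefore a weighted average of the two prices: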
    effective_input_$_per_Mtok =
        cache_hit_rate × cached_input_price
        + (1 − cache_hit_rate) × full_input_price

    monthly_cost =
        input_tokens × effective_input_$_per_Mtok / 1_000_000
        + output_tokens × output_price / 1_000_000

Here cache_hit_rate is a fraction between 0 and 1, all prices are in dollars per million tokens, and input_tokens and output_tokens are monthly totals.

If you're running RAG or any workflow with a stable system prompt, your real hit rate is often 60-90%, far above the 0% the vendor price table implicitly assumes. That gap is the whole reason a Sonnet workload that "should" cost $X often costs $X/3 in practice. The calculator above models that gap explicitly.
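The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names are made up for this example, and the prices and token volumes are placeholder values, not figures from any vendor's price list.

```python
def effective_input_price(cache_hit_rate, cached_input_price, full_input_price):
    """Weighted-average input price in $ per million tokens."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be a fraction between 0 and 1")
    return (cache_hit_rate * cached_input_price
            + (1.0 - cache_hit_rate) * full_input_price)


def monthly_cost(input_tokens, output_tokens, cache_hit_rate,
                 cached_input_price, full_input_price, output_price):
    """Monthly cost in dollars; token counts are monthly totals."""
    effective = effective_input_price(
        cache_hit_rate, cached_input_price, full_input_price)
    return (input_tokens * effective / 1_000_000
            + output_tokens * output_price / 1_000_000)


# Placeholder prices in $ per million tokens (assumed for illustration only).
FULL_INPUT, CACHED_INPUT, OUTPUT = 3.00, 0.30, 15.00

# Same hypothetical workload, priced with and without cache hits.
no_cache = monthly_cost(100_000_000, 5_000_000, 0.0,
                        CACHED_INPUT, FULL_INPUT, OUTPUT)
with_cache = monthly_cost(100_000_000, 5_000_000, 0.8,
                          CACHED_INPUT, FULL_INPUT, OUTPUT)

print(f"0% hit rate:  ${no_cache:.2f}")    # $375.00
print(f"80% hit rate: ${with_cache:.2f}")  # $159.00
```

With these placeholder numbers, an 80% hit rate cuts the input portion from $300 to $84 while the $75 output portion is unchanged, which is why input-heavy workloads see the largest savings.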
Caveats before you trust these numbers
- Prices verified against vendor pricing pages on 2026-06-04. Tap any model row in the table to open its current vendor page and sanity-check before committing budget.
- API list pricing only. Enterprise discounts, committed-spend contracts, and provisioned-throughput pricing all differ.
- Long-context surcharges (some vendors charge a higher rate above 200K or 500K tokens per request) aren't modeled here. If you're routinely sending huge prompts, treat the result as a floor.
- Tool calls, embeddings for retrieval, fine-tuning, and image/audio modalities are billed separately and aren't included.
- Cache hit rate is the % of input tokens served from a prefix cache. Output is never cached.
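To estimate your own cache hit rate under the definition in the last caveat, one option is to divide cached input tokens by total input tokens across your request logs. The sketch below assumes you can obtain both counts per request; the log structure and names are hypothetical, and the field names your vendor reports will differ.

```python
def cache_hit_rate(cached_input_tokens, total_input_tokens):
    """Fraction of input tokens served from the prefix cache."""
    if total_input_tokens <= 0:
        raise ValueError("total_input_tokens must be positive")
    if not 0 <= cached_input_tokens <= total_input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and the total")
    return cached_input_tokens / total_input_tokens


# Hypothetical per-request usage: (total input tokens, tokens served from cache).
requests = [
    (4_000, 3_200),  # stable system prompt reused, mostly cached
    (4_100, 3_200),
    (3_900, 0),      # cache miss, e.g. after the cached prefix expired
]

total = sum(t for t, _ in requests)
cached = sum(c for _, c in requests)
rate = cache_hit_rate(cached, total)

print(f"hit rate: {rate:.1%}")
```

The rate is weighted by tokens rather than averaged per request, which matches how the effective input price is defined: a cache miss on a long prompt costs more than a miss on a short one.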
Get pinged when these prices move
We rerun this calculator and email the diff whenever any vendor changes API pricing. No other email.
Free. Unsubscribe in one click.