Claude API Pricing 2026: Fable 5 vs Opus 5 vs Sonnet 5 vs Haiku 4.5

Claude API pricing is quoted per million tokens, which sounds cheap until you realize how fast tokens add up in real workflows. The headline rates are easy to find; what is hard is translating them into "what will this actually cost me per month?" This guide does that translation with real use-case math, then shows the two levers that cut the bill the most.

Current Claude API rates (2026)

Claude Fable 5: $10.00 input / $50.00 output per million tokens.
Claude Opus 5: $5.00 input / $25.00 output per million tokens.
Claude Sonnet 5: introductory $2.00 input / $10.00 output per million tokens through August 31, 2026. Standard $3.00 / $15.00 pricing begins September 1, 2026.
Claude Haiku 4.5 (fast/cheap): $1.00 input / $5.00 output per million tokens.
You pay separately for input tokens (your prompts plus context) and output tokens (Claude's responses).
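
As a minimal sketch of how these rates turn into a per-request price, the snippet below multiplies token counts by the listed rates. The model keys (`fable-5`, `sonnet-5`, and so on) are illustrative labels rather than official API model IDs, and the token counts in the example are hypothetical.

```python
from datetime import date

# USD per million tokens as (input, output), copied from the rate list above.
RATES = {
    "fable-5": (10.00, 50.00),
    "opus-5": (5.00, 25.00),
    "sonnet-5": (3.00, 15.00),  # standard rate from September 1, 2026
    "haiku-4.5": (1.00, 5.00),
}
SONNET_5_INTRO = (2.00, 10.00)            # introductory rate
SONNET_5_INTRO_ENDS = date(2026, 8, 31)   # last day of the introductory rate


def rates_for(model: str, on: date) -> tuple:
    """Return (input, output) USD per million tokens for a model on a date."""
    if model == "sonnet-5" and on <= SONNET_5_INTRO_ENDS:
        return SONNET_5_INTRO
    return RATES[model]


def request_cost(model: str, input_tokens: int, output_tokens: int, on: date) -> float:
    """Cost in USD of one request; input and output are billed separately."""
    in_rate, out_rate = rates_for(model, on)
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Hypothetical request: 8,000 input tokens and 1,000 output tokens.
for name in RATES:
    print(name, round(request_cost(name, 8_000, 1_000, date(2026, 7, 26)), 4))
```

Because output is priced at 5x input on every model listed, a long response can cost more than a much longer prompt.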

These are standard API prices, not a universal monthly bill. Prompt caching, batch processing, model routing, context size, and output length all change the effective cost. Sonnet 5 also uses a newer tokenizer, so the same text can produce more tokens than it did on Sonnet 4.6.

Real monthly cost by use case

Use these as workload patterns, not bill forecasts. Your actual numbers depend on prompt size, output length, how much context you re-send, caching, and the model mix you route to.

Solo developer using Claude for code review and debugging: start with Sonnet 5 and measure the prompt and output volume before choosing a subscription or API budget.
Power user running an agentic coding loop most days: use Sonnet 5 by default, then measure how often difficult tasks justify Opus 5 or Fable 5.
Small team running a customer-facing feature on the API: route routine requests to Haiku 4.5, then selectively escalate to Sonnet 5 or a premium model.
High-volume production app (Haiku-first with caching): cost per request matters far more than the headline rate, and this is where caching and batch processing pay off.
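
To turn one of these patterns into a rough monthly figure, measure your own request volume and average token counts, then multiply through. The sketch below assumes a hypothetical `Workload` description; every number in the example is a placeholder, not a measurement or a forecast.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; replace the values with measured ones."""
    requests_per_day: int
    input_tokens: int         # average per request, including re-sent context
    output_tokens: int        # average per request
    days_per_month: int = 30


def monthly_cost(w: Workload, in_rate: float, out_rate: float) -> float:
    """Monthly USD cost at the given per-million-token rates, before any discounts."""
    requests = w.requests_per_day * w.days_per_month
    input_cost = requests * w.input_tokens * in_rate / 1_000_000
    output_cost = requests * w.output_tokens * out_rate / 1_000_000
    return input_cost + output_cost


# Placeholder numbers for a customer-facing feature routed to Haiku 4.5
# ($1.00 input / $5.00 output per million tokens).
feature = Workload(requests_per_day=2_000, input_tokens=3_000, output_tokens=400)
print(monthly_cost(feature, in_rate=1.00, out_rate=5.00))
```

Running the same workload through several rate pairs shows how much of the bill comes from model choice rather than volume.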

The two levers that cut your bill most

1. Prompt caching (up to 90% off cached input)

If you re-send the same system prompt, instructions, or document context across many requests, prompt caching cuts the cost of that cached input by up to 90%. For RAG apps, coding agents, and anything with a large fixed prompt, this is the single biggest lever, often larger than switching models.
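
A rough sketch of the effect, under simplifying assumptions: the fixed part of the prompt is paid at the full input rate once, every later request reads it from cache at 10% of the input rate (the "up to 90% off" case), and cache-write surcharges and cache expiry are ignored. The function names and the example numbers are hypothetical; check the official pricing page for the exact cache rates.

```python
def input_cost_uncached(fixed_tokens: int, variable_tokens: int,
                        requests: int, in_rate: float) -> float:
    """Input cost in USD when the full prompt is re-sent at the normal rate."""
    return requests * (fixed_tokens + variable_tokens) * in_rate / 1_000_000


def input_cost_cached(fixed_tokens: int, variable_tokens: int, requests: int,
                      in_rate: float, cached_fraction: float = 0.10) -> float:
    """Input cost in USD when the fixed prefix is cached after the first request.

    cached_fraction is the share of the input rate charged for cached tokens
    (0.10 corresponds to a 90% discount; an assumption, not a quoted rate).
    """
    if requests <= 0:
        return 0.0
    first = fixed_tokens * in_rate                                # full price once
    rest = (requests - 1) * fixed_tokens * in_rate * cached_fraction
    variable = requests * variable_tokens * in_rate               # never cached
    return (first + rest + variable) / 1_000_000


# Hypothetical RAG app: 20,000-token fixed context, 500-token question,
# 1,000 requests at $3.00 per million input tokens.
before = input_cost_uncached(20_000, 500, 1_000, 3.00)
after = input_cost_cached(20_000, 500, 1_000, 3.00)
print(round(before, 3), round(after, 3))
```

The saving grows with the share of the prompt that is fixed, which is why large system prompts and document contexts benefit most.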

2. Batch processing (50% off)

For workloads that do not need a real-time response (overnight processing, bulk classification, data enrichment), the Batch API is 50% cheaper across all models. If latency does not matter, batching halves the cost with zero quality loss.
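
As a small illustration, the sketch below applies the 50% discount to only the share of spend that can tolerate delayed results. The function name, the share, and the dollar amount are hypothetical.

```python
BATCH_DISCOUNT = 0.50  # the Batch API discount stated above


def cost_with_batch(total_cost: float, batchable_share: float) -> float:
    """USD bill after moving batchable_share of spend (0.0 to 1.0) to the Batch API."""
    if not 0.0 <= batchable_share <= 1.0:
        raise ValueError("batchable_share must be between 0.0 and 1.0")
    realtime = total_cost * (1 - batchable_share)                  # unchanged
    batched = total_cost * batchable_share * (1 - BATCH_DISCOUNT)  # half price
    return realtime + batched


# Hypothetical: a $300 monthly bill where 40% of the work can run overnight.
print(cost_with_batch(300.0, 0.40))
```

Only the batchable share is discounted, so the overall saving is half of that share: moving 40% of spend to batch trims the total by 20%.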

Model choice is a cost decision

The most common waste is running Fable or Opus on tasks Haiku or Sonnet would handle fine. Per output token, Fable costs 10x Haiku and Opus costs 5x. Reserve premium models for genuinely hard reasoning or long-horizon agents, and route routine work down. A default-to-Sonnet rule with deliberate escalation usually cuts spend, because only the requests that need a premium model pay premium rates.
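
One way to see the effect of routing is to compute a blended output rate for a given traffic mix. In the sketch below the mix shares are hypothetical, the model keys are illustrative labels, and Sonnet 5 is priced at its standard $15.00 output rate.

```python
# USD per million output tokens, from the rate list in this guide.
OUTPUT_RATES = {
    "fable-5": 50.00,
    "opus-5": 25.00,
    "sonnet-5": 15.00,   # standard rate; the introductory rate is $10.00
    "haiku-4.5": 5.00,
}


def blended_rate(mix: dict, rates: dict) -> float:
    """Weighted average rate for a traffic mix whose shares sum to 1.0."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix shares must sum to 1.0")
    return sum(share * rates[model] for model, share in mix.items())


# Hypothetical routing: most traffic on Haiku, some on Sonnet, a little on Opus.
routed = {"haiku-4.5": 0.70, "sonnet-5": 0.25, "opus-5": 0.05}
print(blended_rate(routed, OUTPUT_RATES))          # blended USD per million output tokens
print(blended_rate({"opus-5": 1.0}, OUTPUT_RATES)) # everything on Opus, for comparison
```

With this hypothetical mix the blended output rate works out to about $8.50 per million tokens, against $25.00 when every request goes to Opus.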

FAQ

How much does the Claude API cost per million tokens?

Prices verified July 26, 2026: Fable 5 is $10 input / $50 output, Opus 5 is $5 / $25, Sonnet 5 is introductory $2 / $10 through August 31 before moving to $3 / $15, and Haiku 4.5 is $1 / $5 per million tokens. Input and output are billed separately.

Is the Claude API cheaper than a Claude subscription?

It depends on volume. Light, occasional use is often cheaper on the API because you only pay for what you use. Heavy daily use is usually cheaper on a flat-rate Pro or Max subscription. The break-even depends on your token volume, so tracking your real usage is the only way to know which side you are on.
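
A back-of-the-envelope break-even can be sketched as the monthly fee divided by your average API cost per request. The fee below is a placeholder, not a quoted Pro or Max price, and the per-request cost assumes a hypothetical 8,000-token prompt and 1,000-token response at the introductory Sonnet 5 rates.

```python
def break_even_requests(monthly_fee: float, cost_per_request: float) -> float:
    """Requests per month at which API spend equals a flat monthly fee."""
    if cost_per_request <= 0:
        raise ValueError("cost_per_request must be positive")
    return monthly_fee / cost_per_request


# Hypothetical request on Sonnet 5 at introductory rates ($2 input / $10 output):
# 8,000 input tokens + 1,000 output tokens.
cost_per_request = (8_000 * 2.00 + 1_000 * 10.00) / 1_000_000

# PLACEHOLDER_FEE is an illustrative number only; substitute the real plan price.
PLACEHOLDER_FEE = 100.00
print(round(break_even_requests(PLACEHOLDER_FEE, cost_per_request)))
```

Below the break-even count the API is the cheaper side; above it the flat fee wins, provided the subscription actually covers the way you use Claude.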

How do I reduce my Claude API bill?

In order of impact: enable prompt caching for repeated context, use the Batch API for non-urgent work, and route simple tasks to Haiku or Sonnet instead of Fable or Opus. Check the official pricing page for the current cache and batch rates before modelling spend.

Comparing API cost against a flat subscription?

Use the AI plan comparator to weigh Claude API usage against Pro and Max subscriptions side-by-side.
