Claude API Pricing Explained — Every Model, Every Tier (April 2026)
Complete Claude API pricing guide and full cost breakdown for April 2026. Covers Opus, Sonnet, and Haiku with cost estimates and savings strategies.
How Claude API Pricing Works
Anthropic charges for the Claude API based on token usage. Tokens are the fundamental unit of text that language models process — roughly 4 characters or 0.75 words per token in English.
Every API request has two token counts:
Input and output tokens are priced differently, with output tokens costing more because they require more computation to generate. The exact prices depend on which Claude model you use.
Official Anthropic Pricing (April 2026)
Anthropic currently offers three model families. Here is the complete pricing breakdown.
Claude Opus 4.6
Opus is Anthropic's most capable model. It delivers the deepest reasoning, handles the most complex tasks, and produces the highest quality output. It is also the most expensive.
| Metric | Price |
|--------|-------|
| Input tokens | $15.00 per 1M tokens |
| Output tokens | $75.00 per 1M tokens |
| Cache write | $18.75 per 1M tokens |
| Cache read | $1.50 per 1M tokens |
| Batch input | $7.50 per 1M tokens |
| Batch output | $37.50 per 1M tokens |
| Context window | 200K tokens |
| Max output | 32K tokens |
Best for: Complex architecture decisions, deep code analysis, research synthesis, multi-step reasoning, tasks where quality matters more than cost.
Claude Sonnet 4.6
Sonnet is the workhorse model. It balances capability and cost better than any other option, making it the most popular choice for production applications.
| Metric | Price |
|--------|-------|
| Input tokens | $3.00 per 1M tokens |
| Output tokens | $15.00 per 1M tokens |
| Cache write | $3.75 per 1M tokens |
| Cache read | $0.30 per 1M tokens |
| Batch input | $1.50 per 1M tokens |
| Batch output | $7.50 per 1M tokens |
| Context window | 200K tokens |
| Max output | 64K tokens |
Best for: Code generation, debugging, writing, analysis, customer support, document processing, and most production workloads.
Claude Haiku 4.5
Haiku is the fastest and most affordable model. It is designed for high-volume, latency-sensitive applications where speed and cost matter more than maximum reasoning depth.
| Metric | Price |
|--------|-------|
| Input tokens | $1.00 per 1M tokens |
| Output tokens | $5.00 per 1M tokens |
| Cache write | $1.25 per 1M tokens |
| Cache read | $0.10 per 1M tokens |
| Batch input | $0.50 per 1M tokens |
| Batch output | $2.50 per 1M tokens |
| Context window | 200K tokens |
| Max output | 16K tokens |
Best for: Classification, routing, extraction, summarization, chatbots, simple Q&A, and any task where response time matters.
Side-by-Side Model Comparison
| Feature | Opus 4.6 | Sonnet 4.6 | Haiku 4.5 |
|---------|----------|-----------|----------|
| Input (per 1M) | $15.00 | $3.00 | $1.00 |
| Output (per 1M) | $75.00 | $15.00 | $5.00 |
| Context window | 200K | 200K | 200K |
| Max output | 32K | 64K | 16K |
| Speed | Slower | Moderate | Fastest |
| Reasoning depth | Deepest | Strong | Good |
| Cost ratio vs Haiku | 15x | 3x | 1x |
Sonnet is 5x cheaper than Opus per token. Haiku is 3x cheaper than Sonnet. Choosing the right model for each task is the single most impactful cost optimization.
Understanding Tokens
If you are new to API pricing, tokens can feel abstract. Here are some concrete examples to calibrate your intuition.
How Many Tokens Is That?
| Content | Approximate Tokens |
|---------|-------------------|
| A short sentence | 10-15 tokens |
| A paragraph | 50-100 tokens |
| A full page of text | 300-400 tokens |
| A 2,000-word blog post | 2,500-3,000 tokens |
| A typical code file (200 lines) | 1,500-2,500 tokens |
| A conversation with 10 exchanges | 3,000-8,000 tokens |
Token Cost in Plain English
Using Sonnet 4.6 at standard pricing:
| What You Do | Approximate Cost |
|-------------|------------------|
| Ask a simple question, get a paragraph back | $0.001 |
| Generate a 500-word article | $0.01 |
| Analyze a 10-page document | $0.03 |
| A 30-minute coding session (Claude Code) | $1-3 |
| Process 1,000 customer support tickets | $5-15 |
| Generate 100 blog posts | $3-10 |
Individual requests are cheap. Costs accumulate through volume and long sessions.
How to Estimate Your Monthly Costs
To estimate what you will spend, you need three numbers:
1. Number of requests per month — how many API calls you make.
2. Average input size — how many tokens you send per request (prompt + context).
3. Average output size — how many tokens Claude generates per response.
Formula
Monthly cost = (input_tokens * input_price / 1,000,000) + (output_tokens * output_price / 1,000,000)Example Calculations
Customer support chatbot (Sonnet, 10,000 conversations/month, 500 input + 300 output tokens avg):
Code generation tool (Sonnet, 5,000 requests/month, 2,000 input + 1,000 output tokens avg):
Document analysis pipeline (Haiku, 50,000 documents/month, 1,000 input + 200 output tokens avg):
Claude Code daily usage (Sonnet, 8 hours/day, 20 working days):
Cost Optimization Strategies
Here are six proven approaches to reduce your Claude API spending, ordered by ease of implementation.
1. Model Routing — Save 40-60%
Not every request needs your most expensive model. Implement a routing layer that picks the right model for each task.
| Task Type | Recommended Model | Cost vs Opus |
|-----------|------------------|-------------|
| Classification, tagging | Haiku 4.5 | 93% cheaper |
| Simple Q&A, extraction | Haiku 4.5 | 93% cheaper |
| Code generation, analysis | Sonnet 4.6 | 80% cheaper |
| Complex reasoning, research | Opus 4.6 | Baseline |
A simple approach: use Haiku as the default, escalate to Sonnet for coding and analysis tasks, and reserve Opus for explicit user requests or detected complexity.
2. Prompt Caching — Save Up to 90% on Repeated Context
If your requests share a common prefix — a system prompt, reference document, or set of examples — prompt caching lets you pay a one-time write cost and then read the cached tokens at a 90% discount.
| Model | Standard Input | Cache Write | Cache Read | Savings on Read |
|-------|---------------|-------------|------------|----------------|
| Opus 4.6 | $15.00/M | $18.75/M | $1.50/M | 90% |
| Sonnet 4.6 | $3.00/M | $3.75/M | $0.30/M | 90% |
| Haiku 4.5 | $1.00/M | $1.25/M | $0.10/M | 90% |
For a customer support bot with a 4,000-token system prompt handling 10,000 requests/month, caching saves roughly $12/month on Sonnet. For Opus, the savings are $60/month on the system prompt alone.
3. Batch API — Save 50% on Async Work
Anthropic's Message Batches API lets you submit up to 10,000 requests in a single batch. Each request is processed within 24 hours and billed at 50% of standard pricing.
| Model | Standard Output | Batch Output | Savings |
|-------|----------------|-------------|--------|
| Opus 4.6 | $75.00/M | $37.50/M | 50% |
| Sonnet 4.6 | $15.00/M | $7.50/M | 50% |
| Haiku 4.5 | $5.00/M | $2.50/M | 50% |
Batch is ideal for content generation, data processing, evaluation pipelines, and any workload that does not need real-time responses.
4. Token Optimization — Save 15-25%
Reduce token usage without changing models or infrastructure:
5. Response Caching — Save 100% on Repeated Queries
For applications where users ask similar or identical questions, implement your own response cache:
Even a 10% cache hit rate saves 10% on your total bill. For FAQ-style applications, hit rates of 30-50% are common.
6. API Proxy Discount — Save 50-70% on Everything
After implementing the optimizations above, a discounted proxy multiplies your savings further. claudeapi.cheap offers three tiers:
| Tier | Discount | Annual Fee | Best For |
|------|----------|-----------|----------|
| Basic | 50% off | Free | Individual developers |
| Pro | 60% off | $29/year | Small teams |
| Enterprise | 70% off | $49/year | Production workloads |
The discount applies to all tokens — input, output, cached, and non-cached. It stacks with every other optimization.
Combined Savings Example
Let's say you have a production application with the following monthly usage:
Without Any Optimization
With Prompt Caching + claudeapi.cheap Enterprise (70% Off)
Pricing for Special Features
Some Claude API features have their own pricing considerations.
Extended Thinking
Extended thinking tokens (the internal reasoning Claude performs before responding) are billed as output tokens. If Claude thinks for 5,000 tokens before writing a 1,000-token response, you pay for 6,000 output tokens.
This makes extended thinking expensive on Opus ($75/M for output) but more reasonable on Sonnet ($15/M). Use it when you need the reasoning quality, but be aware of the cost.
Vision (Image Input)
Images are converted to tokens based on their dimensions. A typical screenshot or photograph uses 1,000-2,000 tokens. These are billed as input tokens at the standard rate for the model.
Tool Use (Function Calling)
Tool definitions and tool results are included in the input token count. Complex tool schemas can add hundreds of tokens per request. Keep your tool definitions concise.
How claudeapi.cheap Pricing Compares
Here is the full comparison across providers for every model.
Input Tokens (per 1M)
| Provider | Opus 4.6 | Sonnet 4.6 | Haiku 4.5 |
|----------|----------|-----------|----------|
| Anthropic | $15.00 | $3.00 | $1.00 |
| OpenRouter (+5.5%) | $15.83 | $3.17 | $1.06 |
| Wisdom Gate (~20% off) | $12.00 | $2.40 | $0.80 |
| claudeapi.cheap Basic | $7.50 | $1.50 | $0.50 |
| claudeapi.cheap Pro | $6.00 | $1.20 | $0.40 |
| claudeapi.cheap Enterprise | $4.50 | $0.90 | $0.30 |
Output Tokens (per 1M)
| Provider | Opus 4.6 | Sonnet 4.6 | Haiku 4.5 |
|----------|----------|-----------|----------|
| Anthropic | $75.00 | $15.00 | $5.00 |
| OpenRouter (+5.5%) | $79.13 | $15.83 | $5.28 |
| Wisdom Gate (~20% off) | $60.00 | $12.00 | $4.00 |
| claudeapi.cheap Basic | $37.50 | $7.50 | $2.50 |
| claudeapi.cheap Pro | $30.00 | $6.00 | $2.00 |
| claudeapi.cheap Enterprise | $22.50 | $4.50 | $1.50 |
Text-Based Cost Calculator
Estimate your monthly cost with this quick reference.
Sonnet 4.6 — Monthly Cost by Token Volume
Assuming a 1:2 input-to-output ratio (1 part input, 2 parts output):
| Monthly Tokens | Anthropic | Basic (50% off) | Enterprise (70% off) |
|---------------|-----------|-----------------|---------------------|
| 1M | $11.00 | $5.50 | $3.30 + $4.08 |
| 5M | $55.00 | $27.50 | $16.50 + $4.08 |
| 10M | $110.00 | $55.00 | $33.00 + $4.08 |
| 25M | $275.00 | $137.50 | $82.50 + $4.08 |
| 50M | $550.00 | $275.00 | $165.00 + $4.08 |
| 100M | $1,100.00 | $550.00 | $330.00 + $4.08 |
Opus 4.6 — Monthly Cost by Token Volume
Same 1:2 ratio:
| Monthly Tokens | Anthropic | Basic (50% off) | Enterprise (70% off) |
|---------------|-----------|-----------------|---------------------|
| 1M | $55.00 | $27.50 | $16.50 + $4.08 |
| 5M | $275.00 | $137.50 | $82.50 + $4.08 |
| 10M | $550.00 | $275.00 | $165.00 + $4.08 |
| 25M | $1,375.00 | $687.50 | $412.50 + $4.08 |
Break-Even: When to Upgrade Tiers
The Pro tier ($29/year) saves more than Basic when your monthly API spend exceeds about $25/month. The Enterprise tier ($49/year) saves more than Pro when your monthly spend exceeds about $50/month. At any usage level above $50/month, Enterprise is the most cost-effective option.
Getting Started
Claude API pricing is straightforward once you understand the token model. The key takeaways:
1. Pick the right model. Haiku for speed, Sonnet for balance, Opus for depth.
2. Use caching and batching. Anthropic's built-in discounts are free to use.
3. Optimize your prompts. Shorter, clearer prompts cost less and work better.
4. Consider a proxy. A 50-70% discount on every token adds up fast.
For a complete pricing comparison and to start saving immediately, visit claudeapi.cheap. The Basic tier is free and gives you 50% off every API call from day one.
Get started at claudeapi.cheap | Read the setup guide | See the discount tiers