Skip to content
All posts
·13 min readpricingguideclaude api pricingclaude api costanthropic api pricing

Claude API Pricing Explained — Every Model, Every Tier (April 2026)

Complete Claude API pricing guide and full cost breakdown for April 2026. Covers Opus, Sonnet, and Haiku with cost estimates and savings strategies.

How Claude API Pricing Works

Anthropic charges for the Claude API based on token usage. Tokens are the fundamental unit of text that language models process — roughly 4 characters or 0.75 words per token in English.

Every API request has two token counts:

  • Input tokens — the text you send to Claude (your prompt, system message, conversation history, and any attached files or images).
  • Output tokens — the text Claude generates in response.
  • Input and output tokens are priced differently, with output tokens costing more because they require more computation to generate. The exact prices depend on which Claude model you use.

    Official Anthropic Pricing (April 2026)

    Anthropic currently offers three model families. Here is the complete pricing breakdown.

    Claude Opus 4.6

    Opus is Anthropic's most capable model. It delivers the deepest reasoning, handles the most complex tasks, and produces the highest quality output. It is also the most expensive.

    | Metric | Price |

    |--------|-------|

    | Input tokens | $15.00 per 1M tokens |

    | Output tokens | $75.00 per 1M tokens |

    | Cache write | $18.75 per 1M tokens |

    | Cache read | $1.50 per 1M tokens |

    | Batch input | $7.50 per 1M tokens |

    | Batch output | $37.50 per 1M tokens |

    | Context window | 200K tokens |

    | Max output | 32K tokens |

    Best for: Complex architecture decisions, deep code analysis, research synthesis, multi-step reasoning, tasks where quality matters more than cost.

    Claude Sonnet 4.6

    Sonnet is the workhorse model. It balances capability and cost better than any other option, making it the most popular choice for production applications.

    | Metric | Price |

    |--------|-------|

    | Input tokens | $3.00 per 1M tokens |

    | Output tokens | $15.00 per 1M tokens |

    | Cache write | $3.75 per 1M tokens |

    | Cache read | $0.30 per 1M tokens |

    | Batch input | $1.50 per 1M tokens |

    | Batch output | $7.50 per 1M tokens |

    | Context window | 200K tokens |

    | Max output | 64K tokens |

    Best for: Code generation, debugging, writing, analysis, customer support, document processing, and most production workloads.

    Claude Haiku 4.5

    Haiku is the fastest and most affordable model. It is designed for high-volume, latency-sensitive applications where speed and cost matter more than maximum reasoning depth.

    | Metric | Price |

    |--------|-------|

    | Input tokens | $1.00 per 1M tokens |

    | Output tokens | $5.00 per 1M tokens |

    | Cache write | $1.25 per 1M tokens |

    | Cache read | $0.10 per 1M tokens |

    | Batch input | $0.50 per 1M tokens |

    | Batch output | $2.50 per 1M tokens |

    | Context window | 200K tokens |

    | Max output | 16K tokens |

    Best for: Classification, routing, extraction, summarization, chatbots, simple Q&A, and any task where response time matters.

    Side-by-Side Model Comparison

    | Feature | Opus 4.6 | Sonnet 4.6 | Haiku 4.5 |

    |---------|----------|-----------|----------|

    | Input (per 1M) | $15.00 | $3.00 | $1.00 |

    | Output (per 1M) | $75.00 | $15.00 | $5.00 |

    | Context window | 200K | 200K | 200K |

    | Max output | 32K | 64K | 16K |

    | Speed | Slower | Moderate | Fastest |

    | Reasoning depth | Deepest | Strong | Good |

    | Cost ratio vs Haiku | 15x | 3x | 1x |

    Sonnet is 5x cheaper than Opus per token. Haiku is 3x cheaper than Sonnet. Choosing the right model for each task is the single most impactful cost optimization.

    Understanding Tokens

    If you are new to API pricing, tokens can feel abstract. Here are some concrete examples to calibrate your intuition.

    How Many Tokens Is That?

    | Content | Approximate Tokens |

    |---------|-------------------|

    | A short sentence | 10-15 tokens |

    | A paragraph | 50-100 tokens |

    | A full page of text | 300-400 tokens |

    | A 2,000-word blog post | 2,500-3,000 tokens |

    | A typical code file (200 lines) | 1,500-2,500 tokens |

    | A conversation with 10 exchanges | 3,000-8,000 tokens |

    Token Cost in Plain English

    Using Sonnet 4.6 at standard pricing:

    | What You Do | Approximate Cost |

    |-------------|------------------|

    | Ask a simple question, get a paragraph back | $0.001 |

    | Generate a 500-word article | $0.01 |

    | Analyze a 10-page document | $0.03 |

    | A 30-minute coding session (Claude Code) | $1-3 |

    | Process 1,000 customer support tickets | $5-15 |

    | Generate 100 blog posts | $3-10 |

    Individual requests are cheap. Costs accumulate through volume and long sessions.

    How to Estimate Your Monthly Costs

    To estimate what you will spend, you need three numbers:

    1. Number of requests per month — how many API calls you make.

    2. Average input size — how many tokens you send per request (prompt + context).

    3. Average output size — how many tokens Claude generates per response.

    Formula

    Monthly cost = (input_tokens * input_price / 1,000,000) + (output_tokens * output_price / 1,000,000)

    Example Calculations

    Customer support chatbot (Sonnet, 10,000 conversations/month, 500 input + 300 output tokens avg):

  • Input: 10,000 x 500 = 5M tokens x $3.00/M = $15.00
  • Output: 10,000 x 300 = 3M tokens x $15.00/M = $45.00
  • Total: $60.00/month
  • Code generation tool (Sonnet, 5,000 requests/month, 2,000 input + 1,000 output tokens avg):

  • Input: 5,000 x 2,000 = 10M tokens x $3.00/M = $30.00
  • Output: 5,000 x 1,000 = 5M tokens x $15.00/M = $75.00
  • Total: $105.00/month
  • Document analysis pipeline (Haiku, 50,000 documents/month, 1,000 input + 200 output tokens avg):

  • Input: 50,000 x 1,000 = 50M tokens x $1.00/M = $50.00
  • Output: 50,000 x 200 = 10M tokens x $5.00/M = $50.00
  • Total: $100.00/month
  • Claude Code daily usage (Sonnet, 8 hours/day, 20 working days):

  • Typical: 500K-1M tokens/hour for active coding
  • Monthly: ~12M tokens (4M input + 8M output)
  • Input: 4M x $3.00/M = $12.00
  • Output: 8M x $15.00/M = $120.00
  • Total: $132.00/month
  • Cost Optimization Strategies

    Here are six proven approaches to reduce your Claude API spending, ordered by ease of implementation.

    1. Model Routing — Save 40-60%

    Not every request needs your most expensive model. Implement a routing layer that picks the right model for each task.

    | Task Type | Recommended Model | Cost vs Opus |

    |-----------|------------------|-------------|

    | Classification, tagging | Haiku 4.5 | 93% cheaper |

    | Simple Q&A, extraction | Haiku 4.5 | 93% cheaper |

    | Code generation, analysis | Sonnet 4.6 | 80% cheaper |

    | Complex reasoning, research | Opus 4.6 | Baseline |

    A simple approach: use Haiku as the default, escalate to Sonnet for coding and analysis tasks, and reserve Opus for explicit user requests or detected complexity.

    2. Prompt Caching — Save Up to 90% on Repeated Context

    If your requests share a common prefix — a system prompt, reference document, or set of examples — prompt caching lets you pay a one-time write cost and then read the cached tokens at a 90% discount.

    | Model | Standard Input | Cache Write | Cache Read | Savings on Read |

    |-------|---------------|-------------|------------|----------------|

    | Opus 4.6 | $15.00/M | $18.75/M | $1.50/M | 90% |

    | Sonnet 4.6 | $3.00/M | $3.75/M | $0.30/M | 90% |

    | Haiku 4.5 | $1.00/M | $1.25/M | $0.10/M | 90% |

    For a customer support bot with a 4,000-token system prompt handling 10,000 requests/month, caching saves roughly $12/month on Sonnet. For Opus, the savings are $60/month on the system prompt alone.

    3. Batch API — Save 50% on Async Work

    Anthropic's Message Batches API lets you submit up to 10,000 requests in a single batch. Each request is processed within 24 hours and billed at 50% of standard pricing.

    | Model | Standard Output | Batch Output | Savings |

    |-------|----------------|-------------|--------|

    | Opus 4.6 | $75.00/M | $37.50/M | 50% |

    | Sonnet 4.6 | $15.00/M | $7.50/M | 50% |

    | Haiku 4.5 | $5.00/M | $2.50/M | 50% |

    Batch is ideal for content generation, data processing, evaluation pipelines, and any workload that does not need real-time responses.

    4. Token Optimization — Save 15-25%

    Reduce token usage without changing models or infrastructure:

  • Trim system prompts. Remove redundant instructions, examples, and formatting. Every token in your system prompt is sent with every request.
  • Use structured output. Request JSON responses instead of free-text. Structured outputs are typically 30-50% shorter.
  • Set appropriate max_tokens. If you expect a 200-token answer, do not set max_tokens to 4096.
  • Avoid asking for explanations when you only need the answer.
  • 5. Response Caching — Save 100% on Repeated Queries

    For applications where users ask similar or identical questions, implement your own response cache:

  • Hash the user's prompt.
  • Check your cache for an existing response.
  • If found, return the cached response without making an API call.
  • If not found, call the API and cache the response.
  • Even a 10% cache hit rate saves 10% on your total bill. For FAQ-style applications, hit rates of 30-50% are common.

    6. API Proxy Discount — Save 50-70% on Everything

    After implementing the optimizations above, a discounted proxy multiplies your savings further. claudeapi.cheap offers three tiers:

    | Tier | Discount | Annual Fee | Best For |

    |------|----------|-----------|----------|

    | Basic | 50% off | Free | Individual developers |

    | Pro | 60% off | $29/year | Small teams |

    | Enterprise | 70% off | $49/year | Production workloads |

    The discount applies to all tokens — input, output, cached, and non-cached. It stacks with every other optimization.

    Combined Savings Example

    Let's say you have a production application with the following monthly usage:

  • 50M input tokens on Sonnet (with 30M cached via prompt caching)
  • 20M output tokens on Sonnet
  • 10M input/output tokens on Haiku for classification
  • Without Any Optimization

  • Sonnet input: 50M x $3.00/M = $150.00
  • Sonnet output: 20M x $15.00/M = $300.00
  • Haiku input: 5M x $1.00/M = $5.00
  • Haiku output: 5M x $5.00/M = $25.00
  • Total: $480.00/month
  • With Prompt Caching + claudeapi.cheap Enterprise (70% Off)

  • Sonnet cached input: 30M x $0.30/M x 0.3 = $2.70
  • Sonnet uncached input: 20M x $3.00/M x 0.3 = $18.00
  • Sonnet output: 20M x $15.00/M x 0.3 = $90.00
  • Haiku input: 5M x $1.00/M x 0.3 = $1.50
  • Haiku output: 5M x $5.00/M x 0.3 = $7.50
  • Subscription: $4.08/month
  • Total: $123.78/month
  • Savings: $356.22/month (74% reduction)
  • Pricing for Special Features

    Some Claude API features have their own pricing considerations.

    Extended Thinking

    Extended thinking tokens (the internal reasoning Claude performs before responding) are billed as output tokens. If Claude thinks for 5,000 tokens before writing a 1,000-token response, you pay for 6,000 output tokens.

    This makes extended thinking expensive on Opus ($75/M for output) but more reasonable on Sonnet ($15/M). Use it when you need the reasoning quality, but be aware of the cost.

    Vision (Image Input)

    Images are converted to tokens based on their dimensions. A typical screenshot or photograph uses 1,000-2,000 tokens. These are billed as input tokens at the standard rate for the model.

    Tool Use (Function Calling)

    Tool definitions and tool results are included in the input token count. Complex tool schemas can add hundreds of tokens per request. Keep your tool definitions concise.

    How claudeapi.cheap Pricing Compares

    Here is the full comparison across providers for every model.

    Input Tokens (per 1M)

    | Provider | Opus 4.6 | Sonnet 4.6 | Haiku 4.5 |

    |----------|----------|-----------|----------|

    | Anthropic | $15.00 | $3.00 | $1.00 |

    | OpenRouter (+5.5%) | $15.83 | $3.17 | $1.06 |

    | Wisdom Gate (~20% off) | $12.00 | $2.40 | $0.80 |

    | claudeapi.cheap Basic | $7.50 | $1.50 | $0.50 |

    | claudeapi.cheap Pro | $6.00 | $1.20 | $0.40 |

    | claudeapi.cheap Enterprise | $4.50 | $0.90 | $0.30 |

    Output Tokens (per 1M)

    | Provider | Opus 4.6 | Sonnet 4.6 | Haiku 4.5 |

    |----------|----------|-----------|----------|

    | Anthropic | $75.00 | $15.00 | $5.00 |

    | OpenRouter (+5.5%) | $79.13 | $15.83 | $5.28 |

    | Wisdom Gate (~20% off) | $60.00 | $12.00 | $4.00 |

    | claudeapi.cheap Basic | $37.50 | $7.50 | $2.50 |

    | claudeapi.cheap Pro | $30.00 | $6.00 | $2.00 |

    | claudeapi.cheap Enterprise | $22.50 | $4.50 | $1.50 |

    Text-Based Cost Calculator

    Estimate your monthly cost with this quick reference.

    Sonnet 4.6 — Monthly Cost by Token Volume

    Assuming a 1:2 input-to-output ratio (1 part input, 2 parts output):

    | Monthly Tokens | Anthropic | Basic (50% off) | Enterprise (70% off) |

    |---------------|-----------|-----------------|---------------------|

    | 1M | $11.00 | $5.50 | $3.30 + $4.08 |

    | 5M | $55.00 | $27.50 | $16.50 + $4.08 |

    | 10M | $110.00 | $55.00 | $33.00 + $4.08 |

    | 25M | $275.00 | $137.50 | $82.50 + $4.08 |

    | 50M | $550.00 | $275.00 | $165.00 + $4.08 |

    | 100M | $1,100.00 | $550.00 | $330.00 + $4.08 |

    Opus 4.6 — Monthly Cost by Token Volume

    Same 1:2 ratio:

    | Monthly Tokens | Anthropic | Basic (50% off) | Enterprise (70% off) |

    |---------------|-----------|-----------------|---------------------|

    | 1M | $55.00 | $27.50 | $16.50 + $4.08 |

    | 5M | $275.00 | $137.50 | $82.50 + $4.08 |

    | 10M | $550.00 | $275.00 | $165.00 + $4.08 |

    | 25M | $1,375.00 | $687.50 | $412.50 + $4.08 |

    Break-Even: When to Upgrade Tiers

    The Pro tier ($29/year) saves more than Basic when your monthly API spend exceeds about $25/month. The Enterprise tier ($49/year) saves more than Pro when your monthly spend exceeds about $50/month. At any usage level above $50/month, Enterprise is the most cost-effective option.

    Getting Started

    Claude API pricing is straightforward once you understand the token model. The key takeaways:

    1. Pick the right model. Haiku for speed, Sonnet for balance, Opus for depth.

    2. Use caching and batching. Anthropic's built-in discounts are free to use.

    3. Optimize your prompts. Shorter, clearer prompts cost less and work better.

    4. Consider a proxy. A 50-70% discount on every token adds up fast.

    For a complete pricing comparison and to start saving immediately, visit claudeapi.cheap. The Basic tier is free and gives you 50% off every API call from day one.

    Get started at claudeapi.cheap | Read the setup guide | See the discount tiers