How to Use Claude Code for 70% Less — Developer Guide
Cut your Claude Code API costs by up to 70% with a single config change. Full guide with .bashrc and .zshrc setup, cost tables, and savings tips.
Claude Code Is Worth It — But the Bill Hurts
Claude Code is arguably the most capable AI coding assistant available today. It runs in your terminal, reads your entire codebase, writes and edits files, runs tests, manages git, and can operate semi-autonomously on complex tasks. For developers who use it daily, it has become indispensable.
The cost, however, is real. When Claude Code is authenticated with an API key, every request it makes is billed at Anthropic's standard token rates, and a single prompt can trigger several requests as the agent reads files and runs tools. A focused coding session burns through tokens fast, especially when using Opus for complex refactoring or architecture work.
Developers on Reddit and Hacker News regularly report monthly Claude Code bills in the hundreds of dollars. Some power users have reported spending over $600 in a single month. That is more than most developers spend on all their other tools combined.
The good news: you can cut that bill by 50% to 70% with a two-minute configuration change.
What Claude Code Actually Costs
Claude Code's cost depends on three factors: the model you use, how long your sessions are, and how much context gets sent with each request.
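Here are realistic monthly estimates based on usage patterns. The figures work out to a blended rate of about $11 per million tokens for Sonnet and $55 for Opus, which corresponds to roughly one input token for every two output tokens: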
Sonnet 4.6 (Most Common)
| Usage Level | Daily Coding Time | Tokens/Month | Official Cost | With 50% Off | With 70% Off |
|-------------|------------------|-------------|--------------|-------------|-------------|
| Light | 30 min | ~3M | $33 | $16.50 | $9.90 |
| Moderate | 1-2 hours | ~10M | $110 | $55.00 | $33.00 |
| Heavy | 3+ hours | ~25M | $275 | $137.50 | $82.50 |
| Power user | Full day | ~50M | $550 | $275.00 | $165.00 |
Opus 4.6 (For Complex Work)
| Usage Level | Daily Coding Time | Tokens/Month | Official Cost | With 50% Off | With 70% Off |
|-------------|------------------|-------------|--------------|-------------|-------------|
| Occasional | A few sessions/week | ~5M | $275 | $137.50 | $82.50 |
| Regular | Daily sessions | ~15M | $825 | $412.50 | $247.50 |
| Heavy | Extended sessions | ~30M | $1,650 | $825.00 | $495.00 |
Opus is roughly 5x more expensive than Sonnet per token. The discount matters even more when you are using the most powerful model.
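To adapt the tables to your own usage, the arithmetic can be sketched as a small shell function. This is a rough estimate under stated assumptions, not a billing tool: `estimate_cost` is a hypothetical helper name, the per-million rates ($3/$15 for Sonnet, $15/$75 for Opus) are the ones implied by the figures in this guide, and the default one-third input share is an assumption that happens to reproduce the table rows.

```shell
#!/usr/bin/env bash
# Estimate a monthly bill from token volume, per-million rates, and a discount.
# Usage: estimate_cost TOKENS_M IN_RATE OUT_RATE DISCOUNT_PCT [INPUT_SHARE]
#   TOKENS_M      total tokens per month, in millions
#   IN_RATE       USD per million input tokens  (assumed: 3 for Sonnet, 15 for Opus)
#   OUT_RATE      USD per million output tokens (assumed: 15 for Sonnet, 75 for Opus)
#   DISCOUNT_PCT  discount in percent, e.g. 0, 50, or 70
#   INPUT_SHARE   fraction of tokens that are input (assumed default: one third)
estimate_cost() {
  awk -v t="$1" -v in_rate="$2" -v out_rate="$3" -v d="$4" -v share="${5:-}" '
    BEGIN {
      if (share == "") share = 1 / 3
      blended = share * in_rate + (1 - share) * out_rate   # USD per million tokens
      printf "%.2f\n", t * blended * (1 - d / 100)
    }'
}

estimate_cost 10 3 15 0    # moderate Sonnet usage at list price
estimate_cost 10 3 15 70   # the same usage with the 70% discount
```

Pass a different fifth argument if your sessions are more input-heavy, for example `estimate_cost 10 3 15 70 0.8`; the result changes considerably because output tokens cost five times as much as input tokens.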
How to Set ANTHROPIC_BASE_URL for Claude Code
Claude Code reads its API configuration from environment variables. You need to set exactly two variables.
For Bash Users
Open ~/.bashrc in your editor and add these lines at the end:
# Claude Code — route through claudeapi.cheap for discounted rates
export ANTHROPIC_API_KEY="sk-cc-your-api-key-here"
export ANTHROPIC_BASE_URL="https://api.claudeapi.cheap"
Apply the changes:
source ~/.bashrc
For Zsh Users
Open ~/.zshrc and add the same lines:
# Claude Code — route through claudeapi.cheap for discounted rates
export ANTHROPIC_API_KEY="sk-cc-your-api-key-here"
export ANTHROPIC_BASE_URL="https://api.claudeapi.cheap"
Apply the changes:
source ~/.zshrc
For Fish Shell Users
Open ~/.config/fish/config.fish and add:
# Claude Code — route through claudeapi.cheap for discounted rates
set -gx ANTHROPIC_API_KEY "sk-cc-your-api-key-here"
set -gx ANTHROPIC_BASE_URL "https://api.claudeapi.cheap"
For Windows PowerShell Users
Add to your PowerShell profile ($PROFILE):
# Claude Code — route through claudeapi.cheap for discounted rates
$env:ANTHROPIC_API_KEY = "sk-cc-your-api-key-here"
$env:ANTHROPIC_BASE_URL = "https://api.claudeapi.cheap"
Verify It Works
Launch Claude Code and check that it connects:
claude
If the session starts normally and Claude responds to your prompts, the proxy is working. Every API call from this point forward is billed at your discounted rate.
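Before launching, it can help to confirm that the shell actually exported both variables. The following is a minimal sketch for Bash or Zsh. `check_claude_env` is a hypothetical helper name, not part of Claude Code, and it only checks that the variables are present, not that the proxy accepts the key.

```shell
#!/usr/bin/env bash
# Report whether the two variables from this guide are present in the environment.
# The key's value is never printed, only whether it is set.
check_claude_env() {
  local rc=0
  if [ -n "${ANTHROPIC_BASE_URL:-}" ]; then
    echo "ANTHROPIC_BASE_URL=${ANTHROPIC_BASE_URL}"
  else
    echo "ANTHROPIC_BASE_URL is not set" >&2
    rc=1
  fi
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is set"
  else
    echo "ANTHROPIC_API_KEY is not set" >&2
    rc=1
  fi
  return "$rc"
}

if check_claude_env; then
  echo "Environment looks ready; launch claude next."
fi
```

If either line reports "not set", the shell config was probably not reloaded in the current terminal.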
Cost Comparison: Before vs After
Let's look at a concrete example. Say you are a moderate Claude Code user who spends about 2 hours a day coding with Sonnet, and occasionally switches to Opus for complex tasks.
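Monthly usage estimate: about 8M tokens on Sonnet 4.6 and 2M tokens on Opus 4.6, split roughly one third input and two thirds output.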
Before (Official Anthropic Pricing)
| Model | Tokens | Input Cost | Output Cost | Total |
|-------|--------|-----------|------------|-------|
| Sonnet 4.6 | 8M (2.7M in, 5.3M out) | $8.10 | $79.50 | $87.60 |
| Opus 4.6 | 2M (0.7M in, 1.3M out) | $10.50 | $97.50 | $108.00 |
| Total | | | | $195.60 |
After — Basic Tier (50% Off, Free)
| Model | Total Before | 50% Off | Savings |
|-------|--------------|---------|---------|
| Sonnet 4.6 | $87.60 | $43.80 | $43.80 |
| Opus 4.6 | $108.00 | $54.00 | $54.00 |
| Total | $195.60 | $97.80 | $97.80 |
After — Enterprise Tier (70% Off, $49/year)
| Model | Total Before | 70% Off | Savings |
|-------|--------------|---------|---------|
| Sonnet 4.6 | $87.60 | $26.28 | $61.32 |
| Opus 4.6 | $108.00 | $32.40 | $75.60 |
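| Total | $195.60 | $58.68 + $4.08 fee = $62.76 | $132.84 |
The Enterprise tier saves $132.84 per month in this scenario after the prorated subscription fee ($49 / 12 ≈ $4.08), or about $1,594 per year. Measured against official pricing, the $49 annual subscription pays for itself in under two weeks. Measured against the free Basic tier, the extra 20 percentage points of discount are worth $39.12 per month here, so the fee is covered in a little over five weeks.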
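The tier comparison above can be reproduced for any monthly bill with a few lines of shell. This is a sketch under the assumptions used in this guide (a flat percentage discount plus an annual fee spread evenly over twelve months); `tier_summary` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Net monthly cost and saving for a discount tier with an annual fee.
# Usage: tier_summary LIST_COST DISCOUNT_PCT ANNUAL_FEE
# Prints: net monthly cost, monthly saving, yearly saving (USD).
tier_summary() {
  awk -v list="$1" -v d="$2" -v fee="$3" '
    BEGIN {
      net = list * (1 - d / 100) + fee / 12     # discounted usage plus prorated fee
      saving = list - net
      printf "net=%.2f saving=%.2f yearly=%.2f\n", net, saving, saving * 12
    }'
}

tier_summary 195.60 50 0    # Basic tier: 50% off, no fee
tier_summary 195.60 70 49   # Enterprise tier: 70% off, $49/year
```

Plugging in your own list-price bill shows where the paid tier stops being worthwhile: at low volumes the prorated fee can exceed the extra discount.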
Tips to Reduce Claude Code Costs Even Further
The proxy discount is the fastest win, but there are several other strategies that stack on top.
1. Use Sonnet as Your Default Model
Sonnet 4.6 handles the vast majority of coding tasks: writing functions, debugging, refactoring, generating tests, and writing documentation. It costs about one-fifth as much as Opus per token.
Reserve Opus for situations where you genuinely need deeper reasoning: complex multi-file refactors, architectural decisions, or algorithmic challenges. You can switch models mid-session using the /model command in Claude Code.
2. Use Compact Mode
Claude Code's /compact command summarizes the conversation history, reducing the number of tokens sent with each subsequent request. This is especially useful in long sessions where the context window fills up.
A long session without compaction might send 50K+ tokens of context with every request. After compacting, that drops to 5-10K tokens.
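To put a dollar figure on that difference, here is a back-of-the-envelope sketch. The request count of 100 is a hypothetical session length, the $3 per million input rate is the Sonnet rate assumed elsewhere in this guide, and the calculation deliberately ignores prompt caching, which would lower both numbers.

```shell
#!/usr/bin/env bash
# Input cost of resending conversation context on every request.
# Usage: context_cost CONTEXT_TOKENS REQUESTS IN_RATE
#   CONTEXT_TOKENS  tokens of history sent with each request
#   REQUESTS        number of requests in the session (hypothetical figure)
#   IN_RATE         USD per million input tokens (assumed: 3 for Sonnet)
context_cost() {
  awk -v ctx="$1" -v n="$2" -v rate="$3" '
    BEGIN { printf "%.2f\n", ctx * n * rate / 1000000 }'
}

context_cost 50000 100 3   # uncompacted: 50K tokens on each of 100 requests
context_cost 10000 100 3   # after /compact: 10K tokens on each of 100 requests
```

The ratio between the two results matters more than the absolute figures: context size multiplies across every request in the session.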
3. Be Specific in Your Prompts
Vague prompts generate vague responses, which lead to follow-up clarifications and more token usage. Instead of:
> "Fix this file"
Try:
> "The calculateTotal function on line 45 doesn't handle the case where items is an empty array. Add a guard clause that returns 0."
Specific prompts get the right answer on the first try, saving the tokens you would have spent on back-and-forth.
4. Start New Sessions for New Tasks
Context accumulates over a session. If you finish one task and start a completely different one, the old context is still being sent with every request. Starting a fresh session for a new task keeps your context lean.
5. Use Haiku for Simple Tasks
For quick lookups, generating boilerplate, formatting code, or simple questions, Haiku 4.5 is fast and very cheap. At $1/$5 per million tokens (or $0.30/$1.50 with the 70% discount), it costs almost nothing.
6. Leverage Prompt Caching
If you have a long system prompt or project instructions (like a CLAUDE.md file), prompt caching reduces the cost of resending that context on every request by up to 90%. Claude Code handles this automatically when the conditions are met.
Multiple Environments Setup
If you want to use the proxy for personal projects but go direct for work, you can set up per-project overrides.
Option 1: Directory-Specific Environment
Create a .envrc file in your project directory (requires direnv; run direnv allow in that directory once so the file is loaded):
# ~/personal-project/.envrc
export ANTHROPIC_API_KEY="sk-cc-your-discounted-key"
export ANTHROPIC_BASE_URL="https://api.claudeapi.cheap"
Option 2: Shell Aliases
Add aliases to your shell config:
# Use discounted API
alias claude-cheap='ANTHROPIC_API_KEY="sk-cc-your-key" ANTHROPIC_BASE_URL="https://api.claudeapi.cheap" claude'
# Use official API (clears ANTHROPIC_BASE_URL in case it is exported globally)
alias claude-direct='env -u ANTHROPIC_BASE_URL ANTHROPIC_API_KEY="sk-ant-your-key" claude'
Then launch with claude-cheap or claude-direct depending on the project.
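A shell function is an alternative to the aliases that is easier to check, because it can run any command under a given profile. The sketch below is one possible shape, assuming Bash or Zsh: `claude_with` is a hypothetical name and the keys are placeholders. The direct profile clears ANTHROPIC_BASE_URL so that a globally exported proxy URL does not leak into it.

```shell
#!/usr/bin/env bash
# Run a command with either the proxy settings or the direct settings.
# Usage: claude_with cheap|direct COMMAND [ARGS...]
claude_with() {
  local profile="$1"
  shift
  case "$profile" in
    cheap)
      ANTHROPIC_API_KEY="sk-cc-your-key" \
      ANTHROPIC_BASE_URL="https://api.claudeapi.cheap" "$@"
      ;;
    direct)
      # env -u removes the variable for this one command only
      env -u ANTHROPIC_BASE_URL ANTHROPIC_API_KEY="sk-ant-your-key" "$@"
      ;;
    *)
      echo "usage: claude_with cheap|direct COMMAND [ARGS...]" >&2
      return 2
      ;;
  esac
}

# Inspect what a profile would pass to the child process
claude_with cheap printenv ANTHROPIC_BASE_URL
```

For real use, run `claude_with cheap claude` or `claude_with direct claude`.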
Frequently Asked Questions
Is this the same Claude Code from Anthropic?
Yes. You are running the exact same claude binary. The only change is which server receives the API calls.
Does the proxy affect response quality?
No. Your requests are forwarded to the real Claude API without modification. The model, reasoning, and output are identical.
Does streaming work?
Yes. Full streaming support is maintained through the proxy. You see tokens appear in real time, same as direct.
Can I still use extended thinking?
Yes. All features — extended thinking, tool use, vision, prompt caching — work through the proxy.
What about rate limits?
Rate limits depend on your tier. Basic tier uses standard limits. Pro and Enterprise tiers have higher rate limits.
How do I switch back?
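Remove or comment out the two environment variables in your shell config, then open a new terminal. Re-sourcing the file alone is not enough, because variables that are already exported stay set in the current session. That is it.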
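To switch back in a terminal that is already open, the variables can also be cleared by hand. A minimal sketch for Bash or Zsh; `claude_reset_env` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Remove the proxy settings from the current shell session only.
# The shell config file is not modified.
claude_reset_env() {
  unset ANTHROPIC_API_KEY ANTHROPIC_BASE_URL
}

claude_reset_env

# printenv exits non-zero for a variable that is not set
if ! printenv ANTHROPIC_BASE_URL >/dev/null; then
  echo "ANTHROPIC_BASE_URL is no longer set in this session"
fi
```

After this, Claude Code falls back to whatever credentials and endpoint it would use without the two variables.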
Start Coding for Less
Claude Code is one of the best investments a developer can make in productivity. But there is no reason to pay full price when you can get 50% off for free, or 70% off for less than the cost of a coffee per month.
Two environment variables. Two minutes. Half the cost.
Get started at claudeapi.cheap.