4 min read · claude-code · tutorial · anthropic

Claude Code: How to Cut Your $200/Month Bill to $40 with Two Env Vars

Claude Code is Anthropic's official terminal coding agent — and at full speed it costs $200-1000/month. Here's how to keep every feature and pay 80% less by exporting two environment variables.

Claude Code is the gold standard. The bill is the gold standard, too.

Anthropic's Claude Code is the most capable coding agent shipping today — autonomous planning, parallel subagents, MCP integration, full repo awareness. A real all-day session burns 5-10M tokens, which on Anthropic Direct is $50-150 per day. Easy to hit $1000/month if you use it for serious work.

If you have a Pro or Max subscription with Anthropic, that absorbs some of it. If you don't, every token is metered. Either way, pointing Claude Code at a cheaper Anthropic-compatible proxy cuts the bill 70-80% without changing Claude Code's behavior.

This post walks through the two env vars, what stays the same, and what to expect on your usage.

The two env vars

Claude Code reads ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN (or ANTHROPIC_API_KEY) from the environment. Set them and Claude Code routes through any Anthropic-compatible endpoint.

export ANTHROPIC_BASE_URL="https://claudeapi.cheap/api/proxy"
export ANTHROPIC_AUTH_TOKEN="sk-cc-your-key-here"

# Then launch as usual
claude

That's the entire setup. Claude Code's planning, tool use, subagent fan-out, MCP servers, file edits — everything works identically.

For permanent setup, add the exports to ~/.zshrc (zsh) or ~/.bashrc (bash). Or use a tool like direnv to scope them to specific projects.
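
For the direnv route, a minimal sketch might look like the following. The helper name write_envrc and the example project path are hypothetical; the two variable names and placeholder values are the ones from the setup above. It assumes direnv is already installed and hooked into your shell.

```shell
# Hypothetical helper: write a project-scoped .envrc for direnv.
# Arg 1: project directory (defaults to the current directory).
write_envrc() {
  project_dir="${1:-.}"
  # Quoted heredoc delimiter so nothing is expanded while writing the file.
  cat > "$project_dir/.envrc" <<'EOF'
export ANTHROPIC_BASE_URL="https://claudeapi.cheap/api/proxy"
export ANTHROPIC_AUTH_TOKEN="sk-cc-your-key-here"
EOF
}

# Usage (path is an example):
#   write_envrc ~/code/my-project
#   cd ~/code/my-project && direnv allow
```

Because the file holds a key, it is probably worth adding .envrc to the project's .gitignore.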

What stays the same

A short list of features that explicitly keep working:

  • All built-in tools — Bash, Read, Edit, Write, WebFetch, Grep, Glob, Agent (subagents), TodoWrite, etc.
  • MCP servers — they run client-side. The proxy never sees them.
  • Subagents — when Claude Code launches a parallel subagent, that subagent's API calls also route through claudeapi.cheap.
  • Streaming responses — fully forwarded.
  • Tool/function calling — the entire tool-use protocol works end-to-end.
  • Prompt caching — cache writes bill at 1.25× the input price and cache reads at 0.1×; both multipliers are preserved through us.
  • Session resumption — claude --resume works because session state is local.
  • Hooks — PreToolUse, PostToolUse, Stop hooks execute on your machine.
  • Plan mode — runs identically.
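
The prompt-caching multipliers in the list above (1.25× for cache writes, 0.1× for cache reads) can be turned into a rough input-cost estimate. This is only a sketch: the helper name cached_input_cost is hypothetical, and the price and cache fractions you pass in are your own assumptions, not figures from this post.

```shell
# Hypothetical helper: estimate the input-side cost of a session.
# Args: 1) total input tokens, in millions
#       2) base input price per 1M tokens, in USD (your assumption)
#       3) fraction of tokens served as cache reads (0-1)
#       4) fraction of tokens billed as cache writes (0-1)
# Whatever is left over is billed at the plain input price.
cached_input_cost() {
  awk -v t="$1" -v p="$2" -v r="$3" -v w="$4" 'BEGIN {
    plain = 1 - r - w
    printf "%.2f\n", t * p * (plain + 1.25 * w + 0.1 * r)
  }'
}

# Example with made-up numbers: 1M tokens at $10 per 1M, all cache reads.
#   cached_input_cost 1 10 1 0    # prints "1.00"
```

The more of a session that lands in cache reads, the closer the input bill gets to a tenth of the uncached price, which is why long sessions over the same files stay cheaper than the raw token count suggests.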

What does NOT change

We forward requests as-is, with one exception: vendor-fingerprint headers (billing IDs, internal trace IDs) are stripped to keep the upstream identity hidden. Those headers don't affect Claude Code's behavior.

Response quality, latency, model temperature, and context window are all identical to Anthropic Direct.

Cost math: real Claude Code session

From a real morning session of mine:

  • 47 minutes of active work
  • 1 main agent + 3 subagent fan-outs
  • ~4.2M input tokens (mostly cached file reads + system prompts)
  • ~85k output tokens
  • Used Sonnet 4.6 for main, Haiku 4.5 for subagents
  • Anthropic direct cost: ~$8 (~$10/hour pace)

Through claudeapi.cheap Pro (80% off): ~$1.60 (~$2/hour pace)

Savings on this 47-minute session: $6.40. At 8 hours/day × 20 days/month (160 hours), a ~$10/hour pace is ~$1,600/month direct → ~$320/month through us. The Pro plan ($19 lifetime) pays for itself in under three hours at this pace.
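
To redo this projection with your own numbers, a small sketch like the one below may help. The helper name project_monthly is hypothetical. It assumes every working hour burns tokens at the sample session's pace, so treat the result as a straight-line upper bound rather than a forecast. It also uses the unrounded pace ($8 over 47 minutes is about $10.21/hour), so it lands a little above a $10/hour estimate.

```shell
# Hypothetical helper: straight-line monthly projection from one sample session.
# Args: 1) session cost in USD   2) session length in minutes
#       3) discount as a fraction (0.80 = 80% off)
#       4) hours per day         5) working days per month
# Prints "<direct> <discounted>" in whole dollars.
project_monthly() {
  awk -v c="$1" -v m="$2" -v d="$3" -v h="$4" -v n="$5" 'BEGIN {
    hourly = c * 60 / m
    direct = hourly * h * n
    printf "%.0f %.0f\n", direct, direct * (1 - d)
  }'
}

# The session above: $8 in 47 minutes, 80% off, 8 hours/day, 20 days/month.
#   project_monthly 8 47 0.80 8 20    # prints "1634 327"
```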

Best models for Claude Code

  • Claude Sonnet 4.6 — main agent default. Best speed/quality.
  • Claude Opus 4.7 — for hard architecture work, multi-system refactors. Use sparingly — even at 80% off it's the most expensive in the catalog.
  • Claude Haiku 4.5 — subagent fan-out. Great for parallel grep/scan tasks where speed matters more than depth. $0.12/$0.60 per 1M input/output tokens.

Claude Code accepts a model override per launch: claude --model claude-sonnet-4-6, or set the model in ~/.claude/settings.json.
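
One way to make the per-task choice less fiddly is a tiny wrapper that maps the kind of work to a model ID. This is a sketch only: the function name pick_model and the task labels are hypothetical, and the Opus and Haiku IDs are assumptions patterned on the claude-sonnet-4-6 ID above, so check them against your provider's model list before relying on them.

```shell
# Hypothetical helper: map a task label to a model ID.
# Only claude-sonnet-4-6 appears in this post; the other two IDs are
# assumed to follow the same naming pattern.
pick_model() {
  case "$1" in
    hard) echo "claude-opus-4-7" ;;    # architecture work, multi-system refactors
    scan) echo "claude-haiku-4-5" ;;   # fast grep/scan style tasks
    *)    echo "claude-sonnet-4-6" ;;  # default for everything else
  esac
}

# Usage:
#   claude --model "$(pick_model hard)"
```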

Verifying it worked

After exporting the env vars, launch Claude Code and give it any small task (run claude, then ask "list files in this dir"). The request should appear in the usage logs on your claudeapi.cheap dashboard within ~5 seconds. The token count should match what Claude Code reports when you run the /cost slash command.

If the dashboard shows nothing, the env vars didn't take effect. Open a fresh shell and re-export them, and run echo $ANTHROPIC_BASE_URL to confirm.
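
The same check can be scripted so it runs before every launch. The sketch below is an illustration, not part of Claude Code: the function name check_claude_env is hypothetical, and it only inspects the variables named earlier in this post.

```shell
# Hypothetical pre-flight check: confirm the routing variables are set
# in the current shell before launching Claude Code.
check_claude_env() {
  if [ -z "${ANTHROPIC_BASE_URL:-}" ]; then
    echo "ANTHROPIC_BASE_URL is not set" >&2
    return 1
  fi
  # Either token variable is enough.
  if [ -z "${ANTHROPIC_AUTH_TOKEN:-}" ] && [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "neither ANTHROPIC_AUTH_TOKEN nor ANTHROPIC_API_KEY is set" >&2
    return 1
  fi
  echo "routing via $ANTHROPIC_BASE_URL"
}

# Usage:
#   check_claude_env && claude
```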

Bonus: same key for Cline, OpenClaw, Aider, Cursor

The sk-cc-... key works for every Anthropic-compatible tool. If you also use Cline, OpenClaw, Aider, Cursor, OpenCode, OpenHands, Continue, or Goose, you don't need separate billing. See the tools page for one-page guides.

GPT-5 family and Gemini work through the same key on the OpenAI-compat path (/v1/chat/completions).

Get started

The free Basic plan gives 70% off and 200 RPM. Pro at $19 lifetime unlocks 80% off and 500 RPM. With Claude Code at full pace, Pro pays for itself in the first session.

Best agent. Same models. 1/5 the bill.