Gemini 3 Flash: balanced speed and quality, newer than the 2.5 series.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://claudeapi.cheap/api/proxy/v1",
    api_key="sk-cc-your-key-here",
)

resp = client.chat.completions.create(
    model="gemini-3-flash-preview",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```

```shell
curl https://claudeapi.cheap/api/proxy/v1/chat/completions \
  -H "Authorization: Bearer sk-cc-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

On the Pro plan ($19 lifetime), Gemini 3 Flash is $0.24 per 1M input tokens and $1.44 per 1M output tokens. That's 80% off the official price ($1.20 / $7.20). The Basic plan is free forever at 70% off ($0.36 / $2.16).
Yes. The proxy forwards requests to the same underlying model with the same context window (1M tokens) and capabilities. Only vendor-fingerprint headers are stripped. Behavior, output quality, and reasoning are identical.
Change one line — the base URL — in your existing Anthropic or OpenAI SDK initialization. Use https://claudeapi.cheap/api/proxy for Anthropic-format calls or https://claudeapi.cheap/api/proxy/v1 for OpenAI-format. Use your sk-cc-... key as the API key. No code changes beyond that.
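To make the "one line" concrete without an SDK in the way, here is a stdlib-only sketch that builds (but does not send) an OpenAI-format request against the proxy. The base URL and key prefix come from the paragraph above; the payload values are placeholders, and the `/chat/completions` path is the standard OpenAI-format route:

```python
import json
import urllib.request

# The only vendor-specific choice: OpenAI-format base URL.
# Anthropic-format callers would use https://claudeapi.cheap/api/proxy instead.
BASE_URL = "https://claudeapi.cheap/api/proxy/v1"

payload = {
    "model": "gemini-3-flash-preview",
    "messages": [{"role": "user", "content": "Hello!"}],
}
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer sk-cc-your-key-here",  # your sk-cc-... key
        "Content-Type": "application/json",
    },
    method="POST",
)
print(req.full_url)  # → https://claudeapi.cheap/api/proxy/v1/chat/completions
# urllib.request.urlopen(req) would actually send it.
```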
Pro plan caps at 500 requests/min and 2M tokens/min globally across all models. Basic plan is 200 RPM / 1M TPM. Newer models may have lower upstream caps that float — see /status for live availability.
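When a cap is hit, the usual client-side pattern is exponential backoff and retry. A minimal sketch, assuming the proxy signals the limit with HTTP 429 (standard practice, not confirmed above), with a stubbed send function standing in for a real API call:

```python
import time

class RateLimited(Exception):
    """Stand-in for an HTTP 429 response from the proxy."""

def send_with_backoff(send, max_retries=5, base_delay=1.0):
    """Retry send() with exponential backoff while it raises RateLimited."""
    for attempt in range(max_retries):
        try:
            return send()
        except RateLimited:
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise RuntimeError("still rate-limited after retries")

# Stub: raises 429 twice, then succeeds.
calls = {"n": 0}
def fake_send():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimited()
    return "ok"

print(send_with_backoff(fake_send, base_delay=0.01))  # → ok
```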
Streaming for Gemini is on the roadmap — non-streaming text generation is fully supported today. Tool calls and multimodal inputs are partially supported. Anthropic and OpenAI vendors have full streaming + tool-call support.
Free Basic plan, $19 lifetime Pro. Crypto only. No subscription.
Get an API key — free