Skip to content
All models
Fastest · Google

Gemini 2.5 Flash

Fast & cheap Gemini. High-volume, latency-sensitive apps.

Context
1000k
Max output
66k
Pro input
$0.144/M
Pro output
$1.2/M

Pricing — every plan

Plan
Input / 1M
Output / 1M
Official direct
$0.72
$6.00
Basic (free, 70% off)
$0.22
$1.80
Pro ($19 lifetime, 80% off)
$0.14
$1.20

Code samples

OpenAI SDK (Python) — works for all 3 vendors
from openai import OpenAI

client = OpenAI(
    base_url="https://claudeapi.cheap/api/proxy/v1",
    api_key="sk-cc-your-key-here",
)
resp = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
cURL
curl https://claudeapi.cheap/api/proxy/v1/chat/completions \
  -H "Authorization: Bearer sk-cc-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Tools that work great with Gemini 2.5 Flash

FAQ

What does Gemini 2.5 Flash cost via ClaudeAPI.cheap?

On the Pro plan ($19 lifetime), Gemini 2.5 Flash is $0.14 per 1M input tokens and $1.20 per 1M output tokens. That's 80% off the official price ($0.72/$6). The Basic plan is free forever at 70% off.

Is Gemini 2.5 Flash the same model as direct API access?

Yes. The proxy forwards requests to the same underlying model with the same context window (1000k tokens) and capabilities. Only vendor-fingerprint headers are stripped. Behavior, output quality, and reasoning are identical.

How do I switch my existing code to Gemini 2.5 Flash via ClaudeAPI.cheap?

Change one line — the base URL — in your existing Anthropic or OpenAI SDK initialization. Use https://claudeapi.cheap/api/proxy for Anthropic-format calls or https://claudeapi.cheap/api/proxy/v1 for OpenAI-format. Use your sk-cc-... key as the API key. No code changes beyond that.

What's the rate limit on Gemini 2.5 Flash?

Pro plan caps at 500 requests/min and 2M tokens/min globally across all models. Basic plan is 200 RPM / 1M TPM. Newer models may have lower upstream caps that float — see /status for live availability.

Does Gemini 2.5 Flash support streaming and tools?

Streaming for Gemini is on the roadmap — non-streaming text generation is fully supported today. Tool calls and multimodal inputs are partially supported. Anthropic and OpenAI vendors have full streaming + tool-call support.

Ready to use Gemini 2.5 Flash cheaper?

Free Basic plan, $19 lifetime Pro. Crypto only. No subscription.

Get an API key — free