How much does Gemini Flash cost?

Gemini Flash is billed per generation rather than per token. See the pricing page for the current rate.

What is the context window of Gemini Flash?

Gemini Flash supports a context window of up to 1M tokens. It can return up to 66K tokens in a single response.

What can Gemini Flash do?

Gemini Flash supports vision, tool calling, reasoning, documents, and prompt caching.

Is Gemini Flash free to use?

Gemini Flash is a paid, pay-as-you-go model: there is no subscription, and you are charged only for usage.

How do I use Gemini Flash via the API?

Gemini Flash is OpenAI-compatible. Point any OpenAI SDK at https://api.airforce/v1 and pass the model ID gemini-flash with your Api.Airforce API key.

Who makes Gemini Flash?

Gemini Flash is Google's chat model, served through the unified Api.Airforce gateway alongside 65+ other models.

Google · Paid · Operational

Gemini Flash

API model name: gemini-flash

Gemini Flash is Google's chat model, served on the Api.Airforce unified API. It has a 1M-token context window. Beyond text, it accepts image, video, file, and audio input. Capabilities include vision, tool calling, reasoning, documents, and prompt caching. Knowledge cutoff: 2025-01-01. Access it through the OpenAI-compatible API with one key, alongside 65+ other models on Api.Airforce.

Pricing

Input / 1M tokens: —
Output / 1M tokens: —

Specifications

Provider: Google
Type: chat model
Context window: 1M tokens
Max output: 66K tokens
Knowledge cutoff: 2025-01-01
Input: text, image, video, file, audio
Output: text
Prompt caching: Supported

Capabilities

Vision, Tool calling, Reasoning, Documents, Prompt caching, Streaming

What is Gemini Flash used for?

Chatbots & assistants — conversational AI, drafting, summarizing and Q&A.
Image understanding — analyze photos, screenshots, charts and scanned documents.
Agents & automation — function calling and tool use for multi-step workflows.
Complex reasoning — math, coding and step-by-step problem solving.
Document analysis — summarize and answer questions across long files.
Long-context tasks — process entire documents or codebases in a single prompt.
Real-time experiences — stream tokens for responsive chat and apps.
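The image-understanding and function-calling use cases above can be expressed as request bodies in the OpenAI chat-completions format. The following is a minimal sketch, assuming the gateway accepts OpenAI-style `image_url` content parts and `tools` definitions for `gemini-flash`; this page states OpenAI compatibility but does not show these payloads. The `get_weather` tool, the helper names, and the example image URL are hypothetical.

```python
import json
from typing import Optional

MODEL = "gemini-flash"  # model ID from this page


def vision_message(prompt: str, image_url: str) -> dict:
    """Build a user message pairing text with an image (OpenAI content-part format)."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


def weather_tool() -> dict:
    """Hypothetical tool definition in the OpenAI function-calling schema."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }


def build_request(messages: list, tools: Optional[list] = None) -> dict:
    """Assemble the JSON body for POST /v1/chat/completions."""
    body = {"model": MODEL, "messages": messages}
    if tools:  # omit the key entirely when no tools are offered
        body["tools"] = tools
    return body


if __name__ == "__main__":
    body = build_request(
        [vision_message("What is in this chart?", "https://example.com/chart.png")],
        tools=[weather_tool()],
    )
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the JSON body of a request to `/v1/chat/completions`, or its `messages` and `tools` entries passed to an OpenAI SDK client pointed at the gateway.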

Gemini Flash vs. similar models

Model	Context	Input / 1M	Output / 1M
Gemini Flash	1M	—	—
Gemini 2.5 Flash	1M	$0.40	$2.50
Gemini 2.5 Pro	1M	$0.70	$2.20
Gemini 3 Flash	1M	$0.15	$0.80

Prices are Api.Airforce pay-as-you-go rates per 1M tokens. Context is the maximum input length.
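To make the per-1M-token rates concrete, here is a rough cost estimate for a single request. The rates are copied from the table above; the token counts (200K input, 4K output) are purely illustrative, and Gemini Flash itself is left out because the table lists no per-token rate for it.

```python
# Per-1M-token rates in USD as (input, output), copied from the comparison table.
RATES = {
    "Gemini 2.5 Flash": (0.40, 2.50),
    "Gemini 2.5 Pro": (0.70, 2.20),
    "Gemini 3 Flash": (0.15, 0.80),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    in_rate, out_rate = RATES[model]
    return input_tokens / 1_000_000 * in_rate + output_tokens / 1_000_000 * out_rate


if __name__ == "__main__":
    # Illustrative request size: 200K input tokens, 4K output tokens.
    for name in RATES:
        print(f"{name}: ${estimate_cost(name, 200_000, 4_000):.4f}")
```

For that request size, Gemini 2.5 Flash comes to about $0.09: 0.2 × $0.40 for input plus 0.004 × $2.50 for output.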

Use Gemini Flash via the API

The API is OpenAI-compatible: point any OpenAI SDK at https://api.airforce/v1 and pass gemini-flash as the model.

cURL

curl https://api.airforce/v1/chat/completions \
  -H "Authorization: Bearer $AIRFORCE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-flash",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'

Python

import os

from openai import OpenAI

# Read the API key from the AIRFORCE_API_KEY environment variable.
client = OpenAI(base_url="https://api.airforce/v1", api_key=os.environ["AIRFORCE_API_KEY"])
r = client.chat.completions.create(
    model="gemini-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(r.choices[0].message.content)
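Streaming is listed among the model's capabilities. The following is a minimal sketch of consuming a streamed reply, assuming the gateway honours the OpenAI SDK's `stream=True` flag for `gemini-flash`. The helper names `collect_stream` and `stream_reply` are illustrative, and the offline demo uses stand-in objects shaped like the SDK's streaming chunks rather than real responses.

```python
import os
from types import SimpleNamespace


def collect_stream(chunks) -> str:
    """Print each text delta as it arrives and return the full reply."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None, e.g. on role or finish chunks
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)


def stream_reply(prompt: str) -> str:
    """Call the gateway with stream=True (needs network access and AIRFORCE_API_KEY)."""
    from openai import OpenAI  # imported here so collect_stream works without the SDK

    client = OpenAI(base_url="https://api.airforce/v1", api_key=os.environ["AIRFORCE_API_KEY"])
    chunks = client.chat.completions.create(
        model="gemini-flash",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return collect_stream(chunks)


if __name__ == "__main__":
    # Offline demo: stand-in chunks, no network call is made.
    fake = [
        SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])
        for text in ("Hel", None, "lo!")
    ]
    collect_stream(fake)
```

With a valid key set, `stream_reply("Hello!")` would print tokens as they arrive and return the assembled text.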

JavaScript

import OpenAI from "openai";
const client = new OpenAI({ baseURL: "https://api.airforce/v1", apiKey: process.env.AIRFORCE_API_KEY });
const r = await client.chat.completions.create({
  model: "gemini-flash",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(r.choices[0].message.content);