How much does Gemini 3.1 Flash Lite cost?

Gemini 3.1 Flash Lite is billed pay-as-you-go at $0.14 per 1M input tokens and $0.75 per 1M output tokens. There is no subscription — you only pay for what you use.

What is the context window of Gemini 3.1 Flash Lite?

Gemini 3.1 Flash Lite supports a context window of up to 1M tokens. It can return up to 33K tokens in a single response.

What can Gemini 3.1 Flash Lite do?

Gemini 3.1 Flash Lite supports Vision, Tool calling, Reasoning, Documents, Prompt caching.

Is Gemini 3.1 Flash Lite free to use?

Gemini 3.1 Flash Lite is a paid, pay-as-you-go model — no subscription, you are only charged for usage.

How do I use Gemini 3.1 Flash Lite via the API?

Gemini 3.1 Flash Lite is OpenAI-compatible. Point any OpenAI SDK at https://api.airforce/v1 and pass the model ID gemini-3.1-flash-lite with your Api.Airforce API key.

Who makes Gemini 3.1 Flash Lite?

Gemini 3.1 Flash Lite is Google's chat model, served through the unified Api.Airforce gateway alongside 100+ other models.

GooglePaidOperational

Gemini 3.1 Flash Lite

API model name: gemini-3.1-flash-lite

Gemini 3.1 Flash Lite is Google's chat model, served on the Api.Airforce unified API. It has a 1M-token context window. Beyond text, it accepts image, audio, video, document as input. Capabilities include Vision, Tool calling, Reasoning, Documents, Prompt caching. It is priced at $0.14 per million input tokens and $0.75 per million output tokens. That is below the provider's $0.25 official input rate. Knowledge cutoff: 2026-03. Access it through the OpenAI-compatible API with one key, alongside 100+ other models on Api.Airforce.

Get an API key View pricing

Pricing

Input / 1M tokens

$0.14

Output / 1M tokens

$0.75

Cache read / 1M tokens

$0.02

Official input rate

$0.25

Official output rate

$1.50

Api.Airforce price vs. the provider's official rate.

Specifications

Provider: Google
Type: chat model
Context window: 1M tokens
Max output: 33K tokens
Knowledge cutoff: 2026-03
Input: text, image, audio, video, document
Output: text
Prompt caching: Supported

Capabilities

VisionTool callingReasoningDocumentsPrompt cachingStreaming

Benchmarks

Independent evaluations and measured speed from Artificial Analysis.

Intelligence Index

25.0/100

Coding Index

34.7/100

GPQA Diamond82%

Humanity's Last Exam16%

Output speed311.0 tok/s

Time to first token5.03 s

Source: Benchmark data by Artificial Analysis (artificialanalysis.ai)

What is Gemini 3.1 Flash Lite used for?

Chatbots & assistants — conversational AI, drafting, summarizing and Q&A.
Image understanding — analyze photos, screenshots, charts and scanned documents.
Agents & automation — function calling and tool use for multi-step workflows.
Complex reasoning — math, coding and step-by-step problem solving.
Document analysis — summarize and answer questions across long files.
Long-context tasks — process entire documents or codebases in a single prompt.
Real-time experiences — stream tokens for responsive chat and apps.