GooglePaidOperational

Gemini 3.1 Flash Lite

API model name: gemini-3.1-flash-lite

Gemini 3.1 Flash Lite is Google's chat model, served on the Api.Airforce unified API. It has a 1M-token context window. Beyond text, it accepts image, audio, video, document as input. Capabilities include Vision, Tool calling, Documents, Prompt caching. It is priced at $0.14 per million input tokens and $0.75 per million output tokens. That is below the provider's $0.25 official input rate. Knowledge cutoff: 2026-03. Access it through the OpenAI-compatible API with one key, alongside 100+ other models on Api.Airforce.

Pricing

Input / 1M tokens
$0.14
Output / 1M tokens
$0.75
Cache read / 1M tokens
$0.02
Official input rate
$0.25
Official output rate
$1.50

Api.Airforce price vs. the provider's official rate.

Specifications

Provider
Google
Type
chat model
Context window
1M tokens
Max output
33K tokens
Knowledge cutoff
2026-03
Input
text, image, audio, video, document
Output
text
Prompt caching
Supported

Capabilities

VisionTool callingDocumentsPrompt cachingStreaming

Benchmarks

Independent evaluations and measured speed from Artificial Analysis.

Intelligence Index
33.5/100
Coding Index
30.1/100
GPQA Diamond82%
Humanity's Last Exam16%
Output speed325.2 tok/s
Time to first token5.23 s

Source: Benchmark data by Artificial Analysis (artificialanalysis.ai)

What is Gemini 3.1 Flash Lite used for?

  • Chatbots & assistants — conversational AI, drafting, summarizing and Q&A.
  • Image understanding — analyze photos, screenshots, charts and scanned documents.
  • Agents & automation — function calling and tool use for multi-step workflows.
  • Document analysis — summarize and answer questions across long files.
  • Long-context tasks — process entire documents or codebases in a single prompt.
  • Real-time experiences — stream tokens for responsive chat and apps.

Gemini 3.1 Flash Lite vs. similar models

ModelIntelligenceContextInput / 1MOutput / 1M
Gemini 3.1 Flash Lite33.51M$0.14$0.75
Gemini 2.5 Flash20.61M$0.40$2.50
Gemini 2.5 Pro34.62M$0.70$2.20
Gemini 3 Flash35.01M$0.40$2.40

Prices are Api.Airforce pay-as-you-go rates per 1M tokens. Context is the maximum input length.

Related models

Gemini 3.1 Flash Lite — frequently asked questions

How much does Gemini 3.1 Flash Lite cost?
Gemini 3.1 Flash Lite is billed pay-as-you-go at $0.14 per 1M input tokens and $0.75 per 1M output tokens. There is no subscription — you only pay for what you use.
What is the context window of Gemini 3.1 Flash Lite?
Gemini 3.1 Flash Lite supports a context window of up to 1M tokens. It can return up to 33K tokens in a single response.
What can Gemini 3.1 Flash Lite do?
Gemini 3.1 Flash Lite supports Vision, Tool calling, Documents, Prompt caching.
Is Gemini 3.1 Flash Lite free to use?
Gemini 3.1 Flash Lite is a paid, pay-as-you-go model — no subscription, you are only charged for usage.
How do I use Gemini 3.1 Flash Lite via the API?
Gemini 3.1 Flash Lite is OpenAI-compatible. Point any OpenAI SDK at https://api.airforce/v1 and pass the model ID gemini-3.1-flash-lite with your Api.Airforce API key.
Who makes Gemini 3.1 Flash Lite?
Gemini 3.1 Flash Lite is Google's chat model, served through the unified Api.Airforce gateway alongside 100+ other models.