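Chat Completions
Generate chat responses across 100+ models through one API. Compatible with OpenAI Chat Completions, Anthropic Messages, and OpenAI Responses.
Airforce supports both the OpenAI Chat Completions and Anthropic Messages wire formats on the same set of models. Pick the SDK you already use and change only the base URL; non-Claude models are forwarded transparently behind either interface.
This page covers authentication, the request and response structure of both interfaces, streaming, tool calling, vision, reasoning, and prompt caching. New here? Start with the basic example below to get a single call working, then add streaming, tools, or caching step by step.
Authentication
Every request needs a Bearer token (your Airforce API key). The Anthropic x-api-key header is also accepted on /v1/messages for SDK compatibility.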
Authorization: Bearer sk-air-YOUR_API_KEY
# alt for /v1/messages:
x-api-key: sk-air-YOUR_API_KEY
POST /v1/chat/completions
OpenAI-compatible chat completions. Works with the official openai SDK when you override base_url to https://api.airforce/v1.
https://api.airforce/v1/chat/completions
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
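| model | string | Required | Model ID. Use GET /v1/models to discover available IDs. |
| messages | array | Required | Conversation history. Each entry has { role: "system" \| "user" \| "assistant" \| "tool", content }. content is a string or an array of content blocks (for vision, see below). |
| max_tokens | integer | Optional | Maximum number of tokens to generate. Capped at the model's max_output_tokens. |
| temperature | float | Optional | Sampling temperature, 0–2. Lower values are more deterministic. The default depends on the upstream provider. |
| top_p | float | Optional | Nucleus sampling. Use temperature or top_p, not both. |
| stream | boolean | Optional | When true, the response is a stream of server-sent events. See "Streaming" below. |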
| models | array | Optional | Fallback models (max 3), e.g. ["deepseek-v3.2", "gpt-4o-mini"]. If every channel of the primary model fails, each candidate is tried in order. You are billed for — and response.model reports — the model that actually answered. Unknown or plan-gated candidates are skipped. With the OpenAI SDK pass it via extra_body. |
| transforms | array | Optional | Prompt transforms. Supported: ["middle-out"] — when the conversation overflows the model's context window, whole messages are dropped from the middle (system prompts, the first message and the most recent turns are kept), so long roleplay or agent histories keep working instead of erroring. Opt-in; off by default. |
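| stream_options | object | Optional | { include_usage: boolean }. Usage is always included in the final stream chunk; this field is accepted for OpenAI compatibility but cannot turn usage off. |
| stop | string \| array | Optional | Up to 4 stop sequences. Generation stops as soon as one of them is produced. |
| tools | array | Optional | Function definitions the model may call. See "Tool calling" below. |
| tool_choice | string \| object | Optional | "auto" (default), "none", or { type: "function", function: { name } } to force a specific call. |
| response_format | object | Optional | { type: "json_object" } forces the model to emit valid JSON. Ignored on models that do not support it. See "Structured outputs" for the json_schema mode. |
| reasoning_effort | string | Optional | OpenAI o1/o3-style reasoning depth: "low" \| "medium" \| "high". See "Reasoning & thinking". |
| thinking | string \| object | Optional | Cross-provider thinking switch. "on" \| "off" \| "auto", or the Anthropic structure { type: "enabled", budget_tokens: N }. See "Reasoning & thinking". |
| thinking_budget | integer | Optional | Token cap for the model's reasoning trace (when the provider exposes one). |
| ignore_defaults | boolean | Optional | For this request, skip the per-model default parameters the user saved in the dashboard. |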
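The parameters in the table above can be assembled into a request with nothing but the Python standard library. This is a minimal sketch, not an official client: the helper name `build_chat_request` is hypothetical, and the only Airforce-specific facts it relies on are the base URL, the Bearer header, and the 3-item limit on the `models` fallback list stated on this page.

```python
import json
import urllib.request

BASE_URL = "https://api.airforce/v1"  # base URL given on this page


def build_chat_request(api_key, model, messages, fallbacks=None, **params):
    """Return an unsent urllib Request for POST /v1/chat/completions.

    build_chat_request is a hypothetical helper name, not part of any SDK.
    Extra keyword arguments (max_tokens, temperature, ...) go into the body as-is.
    """
    body = {"model": model, "messages": messages, **params}
    if fallbacks:
        if len(fallbacks) > 3:
            raise ValueError("at most 3 fallback models are allowed")
        body["models"] = list(fallbacks)  # tried in order if the primary fails
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request(
    "sk-air-YOUR_API_KEY",
    "gpt-5.1-chat",
    [{"role": "user", "content": "What is the capital of France?"}],
    fallbacks=["deepseek-v3.2", "gpt-4o-mini"],
    max_tokens=200,
    temperature=0.7,
)
# Sending is left to the caller, for example urllib.request.urlopen(req).
```

Keeping request construction separate from sending makes the body easy to inspect or log before any tokens are billed.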
Basic example
curl https://api.airforce/v1/chat/completions \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.1-chat",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is the capital of France?"}
],
"max_tokens": 200,
"temperature": 0.7
}'
Response shape
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Optional | Stable completion ID, e.g. "chatcmpl-abc123". |
| object | string | Optional | "chat.completion" for non-streamed responses, "chat.completion.chunk" for streamed ones. |
| created | integer | Optional | Unix timestamp (seconds). |
| model | string | Optional | ID of the model that answered. Normally this echoes the requested model ID; when a fallback from models answered, it reports that model instead. |
| choices | array | Optional | Array of completion candidates: [{ index, message: { role, content, tool_calls? }, finish_reason }]. |
| choices[].finish_reason | string | Optional | "stop" \| "length" \| "tool_calls" \| "content_filter". |
| usage | object | Optional | { prompt_tokens, completion_tokens, total_tokens, completion_tokens_details?, prompt_tokens_details?, cache_creation_input_tokens?, cache_creation? }. completion_tokens_details.reasoning_tokens is set when the model produces a reasoning trace. Cache fields appear when the upstream returns prompt-cache information: prompt_tokens_details.cached_tokens reports cache reads (the OpenAI standard), cache_creation_input_tokens aggregates writes, and cache_creation.ephemeral_5m_input_tokens / ephemeral_1h_input_tokens give the per-TTL split. |
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1710000000,
"model": "gpt-5.1-chat",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "The capital of France is Paris."
},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 20,
"completion_tokens": 8,
"total_tokens": 28
}
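}
Reasoning & thinking
Models that support extended reasoning expose a thinking trace alongside their regular output. Airforce normalizes three different upstream conventions into one set of canonical parameters that works everywhere.
Check for supports_reasoning: true on a model in GET /v1/models to see which IDs accept these parameters.
Canonical parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| reasoning_effort | string | Optional | "low" \| "medium" \| "high". Applies to OpenAI o1/o3, GPT-5 reasoning models, and any router that maps to them. |
| thinking | string \| object | Optional | "on" \| "off" \| "auto" as a quick switch, or { type: "enabled", budget_tokens: N } for the Anthropic-native structure. Maps to Claude extended thinking, Gemini thinking, and DeepSeek reasoning. |
| thinking_budget | integer | Optional | Maximum tokens the model may spend reasoning before it emits visible output. Mirrors budget_tokens. |
Reasoning effort (OpenAI style)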
curl https://api.airforce/v1/chat/completions \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "o3-mini",
"messages": [{"role": "user", "content": "Prove the Pythagorean theorem."}],
"reasoning_effort": "high"
}'
Extended thinking (Anthropic style)
curl https://api.airforce/v1/chat/completions \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4.6",
"messages": [{"role": "user", "content": "Plan a 7-day Italy trip."}],
"thinking": {"type": "enabled", "budget_tokens": 4000}
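}'
The reasoning trace itself appears in choices[0].message.reasoning (OpenAI shape) or as a thinking block in content (Anthropic format). Reasoning tokens are billed and reported in usage.completion_tokens_details.reasoning_tokens.
The completion_tokens_details.reasoning_tokens breakdown appears only when the upstream provider reports it. In streamed responses, the trace arrives chunk by chunk via delta.reasoning_content.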
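Reading the trace on the OpenAI-shaped interface might look like the sketch below. The field names (message.reasoning, delta.reasoning_content, completion_tokens_details.reasoning_tokens) come from this page; the helper names `split_reasoning` and `join_stream_reasoning` are hypothetical, and every field is treated as optional because it depends on the model and the upstream provider.

```python
def split_reasoning(response):
    """Return (answer, reasoning, reasoning_tokens) from a chat completion dict."""
    message = response["choices"][0]["message"]
    details = response.get("usage", {}).get("completion_tokens_details") or {}
    return (
        message.get("content"),
        message.get("reasoning"),         # None on models without a trace
        details.get("reasoning_tokens"),  # None unless the upstream reports it
    )


def join_stream_reasoning(chunks):
    """Concatenate delta.reasoning_content across already-parsed stream chunks."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("reasoning_content")
            if piece:
                parts.append(piece)
    return "".join(parts)
```

Returning None for missing fields lets the same code path handle both reasoning and non-reasoning models.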
Vision & image input
Models with supports_vision: true accept images embedded as content blocks. Both public URLs and Base64 data URLs work; size limits depend on the upstream model.
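For local files, the Base64 data URL variant can be built as in this sketch. The content-block layout matches the image_url shape used on this page; the helper names `image_block` and `vision_message` and the default MIME type are illustrative assumptions.

```python
import base64


def image_block(image_bytes, mime_type="image/jpeg"):
    """Wrap raw image bytes in an image_url content block using a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def vision_message(prompt, image_bytes, mime_type="image/jpeg"):
    """Build a user message that pairs a text prompt with one embedded image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            image_block(image_bytes, mime_type),
        ],
    }
```

In practice the bytes would come from something like `open("cat.jpg", "rb").read()`; since size limits depend on the upstream model, large images may need resizing first.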
curl https://api.airforce/v1/chat/completions \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.1-chat",
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "What is in this image?"},
{"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}}
]
}]
}'
Tool calling
Models with supports_tools: true can call functions you define. The model returns a tool_calls array; you run each call, then pass the result back in a tool message.
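The "run the call, pass the result back" step could be sketched as follows. The message shapes (tool_calls with JSON-string arguments, role "tool" with tool_call_id) are the ones used on this page; the `TOOLS` registry, the stand-in `get_weather` function, and the helper name `run_tool_calls` are hypothetical.

```python
import json


def get_weather(location):
    """Stand-in tool; a real one would query a weather service."""
    return {"location": location, "temp_c": 14, "sky": "cloudy"}


TOOLS = {"get_weather": get_weather}  # hypothetical registry: tool name -> callable


def run_tool_calls(assistant_message):
    """Execute each requested call and return the matching role="tool" messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to the originating call
            "content": json.dumps(fn(**args)),
        })
    return results
```

The follow-up request then sends the original messages, the assistant message containing tool_calls, and the returned tool messages, in that order.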
Request
curl https://api.airforce/v1/chat/completions \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.1-chat",
"messages": [{"role": "user", "content": "What is the weather in Paris?"}],
"tools": [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string", "description": "City name"}
},
"required": ["location"]
}
}
}],
"tool_choice": "auto"
}'
Response with a tool call
{
"id": "chatcmpl-abc123",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": null,
"tool_calls": [{
"id": "call_1",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\":\"Paris\"}"
}
}]
},
"finish_reason": "tool_calls"
}]
}
Follow-up with the tool result
{
"model": "gpt-5.1-chat",
"messages": [
{"role": "user", "content": "What is the weather in Paris?"},
{
"role": "assistant",
"content": null,
"tool_calls": [{
"id": "call_1",
"type": "function",
"function": {"name": "get_weather", "arguments": "{\"location\":\"Paris\"}"}
}]
},
{"role": "tool", "tool_call_id": "call_1", "content": "{\"temp_c\": 14, \"sky\": \"cloudy\"}"}
]
}
Structured outputs
Set response_format to make the model return JSON. Two modes are supported:
- { "type": "json_object" }: the response is a single valid JSON value.
- { "type": "json_schema", "json_schema": { "name", "schema", "strict" } }: the model is steered to produce JSON that matches your JSON Schema.
curl https://api.airforce/v1/chat/completions \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.1-chat",
"messages": [{"role": "user", "content": "Extract the city and country: I live in Paris, France."}],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "location",
"schema": {
"type": "object",
"properties": { "city": {"type": "string"}, "country": {"type": "string"} },
"required": ["city", "country"]
}
}
}
}'
Reliability: even when a model wraps its answer in prose or a markdown code fence, Airforce extracts the JSON payload, so you receive parseable content whenever valid JSON can be recovered. If none can be recovered, the original text is returned unchanged, so the guarantee never makes a response worse. This applies to non-streamed responses; streamed responses are passed through unchanged.
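Because streamed responses are passed through unchanged, a client that streams may want a similar recovery step of its own. The sketch below is a client-side approximation under stated assumptions, not Airforce's actual extraction logic: it tries the full text, then any fenced block, then the outermost brace or bracket span, and otherwise returns the text untouched. The name `extract_json` is hypothetical.

```python
import json
import re

TICKS = "`" * 3  # a markdown code-fence marker, built without typing it literally
FENCE = re.compile(TICKS + r"(?:json)?\s*(.*?)" + TICKS, re.DOTALL)


def extract_json(text):
    """Return parsed JSON found in text, or the original text if none is found."""
    candidates = [text] + FENCE.findall(text)
    # Fall back to the outermost {...} or [...] span inside surrounding prose.
    for opener, closer in (("{", "}"), ("[", "]")):
        start, end = text.find(opener), text.rfind(closer)
        if 0 <= start < end:
            candidates.append(text[start:end + 1])
    for candidate in candidates:
        try:
            return json.loads(candidate.strip())
        except json.JSONDecodeError:
            continue
    return text  # nothing recoverable: hand back the text unchanged
```

Run it on the fully accumulated stream text rather than on individual chunks, since a partial chunk is rarely valid JSON.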
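Streaming
Set stream: true to receive partial completions as server-sent events. Each event is a JSON chunk with the same shape as the non-streamed response, except that message is replaced by delta. The stream ends with data: [DONE].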
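Consuming such a stream comes down to folding the delta objects back into one message. The sketch below works on an iterable of already-decoded text lines, so it stays independent of the HTTP client; the helper name `collect_stream` is hypothetical.

```python
import json


def collect_stream(lines):
    """Fold OpenAI-style SSE lines into (text, finish_reason, usage)."""
    text, finish_reason, usage = [], None, None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not data
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        usage = chunk.get("usage") or usage  # reported on the final chunk
        for choice in chunk.get("choices", []):
            text.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(text), finish_reason, usage
```

This version only tracks a single choice and plain text content; tool-call deltas would need their own accumulation.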
curl https://api.airforce/v1/chat/completions \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5.1-chat",
"messages": [{"role": "user", "content": "Write a haiku about Berlin."}],
"stream": true,
"stream_options": {"include_usage": true}
}'
Wire format
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1710000000,"model":"gpt-5.1-chat","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1710000000,"model":"gpt-5.1-chat","choices":[{"index":0,"delta":{"content":"Cold "},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1710000000,"model":"gpt-5.1-chat","choices":[{"index":0,"delta":{"content":"stone "},"finish_reason":null}]}
…
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1710000000,"model":"gpt-5.1-chat","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":17,"total_tokens":29}}
data: [DONE]
POST /v1/messages
Anthropic-compatible Messages API. Works with the official @anthropic-ai/sdk when you set baseURL to https://api.airforce. Non-Claude models are forwarded transparently to OpenAI, Google, and other providers.
https://api.airforce/v1/messages
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Model ID (Anthropic format or a routing alias). |
| messages | array | Required | Each entry: { role: "user" \| "assistant", content: string \| array }. |
| max_tokens | integer | Required | Required by Anthropic. Token cap for the response. |
| system | string \| array | Optional | System prompt. Pass an array of { type: "text", text, cache_control? } blocks to mark cached prefix segments. See "Prompt caching". |
| temperature | float | Optional | Sampling temperature, 0–1. |
| top_p | float | Optional | Nucleus sampling. |
| top_k | integer | Optional | Limits the sampling pool to the top K tokens. |
| stop_sequences | array | Optional | Up to 4 stop sequences. |
| stream | boolean | Optional | When true, emits an Anthropic-style SSE event stream (see "Streaming events"). |
| fallbacks | array | Optional | Fallback models (max 3) in Anthropic form: [{"model": "gpt-4o-mini"}]. If every channel of the primary model fails, each candidate is tried in order; you are billed for — and the response model field reports — the model that actually answered. A plain models string array is accepted too. |
| tools | array | Optional | Anthropic tool definitions: { name, description, input_schema }. The response may contain tool_use content blocks. |
| tool_choice | object | Optional | { type: "auto" \| "any" \| "tool", name? }. |
| thinking | object | Optional | Anthropic extended thinking: { type: "enabled", budget_tokens: N }. |
Example
curl https://api.airforce/v1/messages \
-H "x-api-key: sk-air-YOUR_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4.6",
"max_tokens": 256,
"system": "You are a helpful assistant.",
"messages": [
{"role": "user", "content": "Hello, Claude!"}
]
}'
Response shape
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Optional | Message ID, e.g. "msg_01ABCxyz". |
| type | string | Optional | Always "message". |
| role | string | Optional | Always "assistant". |
| content | array | Optional | Array of content blocks: { type: "text" \| "tool_use" \| "thinking", … }. |
| model | string | Optional | ID of the model that answered. Normally this echoes the requested model; when a fallback answered, it reports that model instead. |
| stop_reason | string | Optional | "end_turn" \| "max_tokens" \| "stop_sequence" \| "tool_use". |
| usage | object | Optional | { input_tokens, output_tokens, cache_read_input_tokens?, cache_creation_input_tokens?, cache_creation? }. Cache fields appear when prompt caching was used. cache_creation.ephemeral_5m_input_tokens and ephemeral_1h_input_tokens give the per-TTL write split. |
Streaming events
Anthropic SSE uses named events rather than uniform JSON chunks. Each event has an event: name and a data: JSON payload.
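A consumer therefore has to track the most recent event: name and interpret each data: payload accordingly. The sketch below handles only text deltas and the final stop information; the helper name `collect_anthropic_stream` is hypothetical, and thinking or tool_use blocks are deliberately ignored.

```python
import json


def collect_anthropic_stream(lines):
    """Fold Anthropic-style SSE lines into (text, stop_reason, output_tokens)."""
    text, stop_reason, output_tokens = [], None, None
    event = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            event = line[len("event:"):].strip()  # remember the name for the next data line
        elif line.startswith("data:"):
            data = json.loads(line[len("data:"):].strip())
            if event == "content_block_delta" and data["delta"].get("type") == "text_delta":
                text.append(data["delta"]["text"])
            elif event == "message_delta":
                stop_reason = data["delta"].get("stop_reason", stop_reason)
                output_tokens = data.get("usage", {}).get("output_tokens", output_tokens)
            elif event == "message_stop":
                break
    return "".join(text), stop_reason, output_tokens
```

The output_tokens value taken from message_delta is the running total at the end of the message; input_tokens is only available on message_start.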
event: message_start
data: {"type":"message_start","message":{"id":"msg_01","role":"assistant","content":[],"model":"claude-sonnet-4.6","stop_reason":null,"usage":{"input_tokens":12,"output_tokens":1}}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}
event: content_block_stop
data: {"type":"content_block_stop","index":0}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":17}}
event: message_stop
data: {"type":"message_stop"}
POST /v1/messages/count_tokens
Anthropic-compatible token counting. Send the same system / messages / tools you would pass to /v1/messages and get an input-token estimate back without running the model — nothing is billed.
https://api.airforce/v1/messages/count_tokens
curl https://api.airforce/v1/messages/count_tokens \
-H "x-api-key: sk-air-YOUR_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4.6",
"system": "You are a helpful assistant.",
"messages": [{"role": "user", "content": "Hello, Claude!"}]
}'
# → {"input_tokens": 34}
The count is a fast character-based estimate (about 4 characters per token) over system, messages and tools. It is close enough for context-budget checks but is not an exact tokenizer run.
Prompt caching
On /v1/messages with Claude models, mark a prefix for caching by passing system as an array of blocks in which the segment to cache carries cache_control: { type: "ephemeral" }. Later requests that start with the same prefix are billed at the cheaper cache-read rate. Models with supports_caching: true in /v1/models support this.
{
"model": "claude-sonnet-4.6",
"max_tokens": 1024,
"system": [
{"type": "text", "text": "You are a senior staff engineer at Airforce."},
{
"type": "text",
"text": "<repository-snapshot>...</repository-snapshot>",
"cache_control": {"type": "ephemeral"}
}
],
"messages": [
{"role": "user", "content": "Where is rate limiting enforced?"}
]
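}
How cache counts are reported in the response
Cache token counts are delivered in each format's native shape, so the SDKs (openai, @anthropic-ai/sdk, @google/genai) can read them without custom code. Fields are omitted when their value is zero, which keeps uncached responses lean.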
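Code that talks to more than one interface can fold the three native shapes into one pair of numbers. This is a sketch under the field names listed on this page; the helper name `cache_counts` is hypothetical, and missing fields are read as zero because zero-valued fields may be omitted.

```python
def cache_counts(usage):
    """Return (read_tokens, write_tokens) from an OpenAI, Anthropic, or Gemini usage dict."""
    if "cachedContentTokenCount" in usage or "promptTokenCount" in usage:
        # Gemini usageMetadata: only cache reads are reported.
        return usage.get("cachedContentTokenCount", 0), 0
    write = usage.get("cache_creation_input_tokens", 0)
    if "prompt_tokens" in usage:
        # OpenAI shape: reads live under prompt_tokens_details.
        read = (usage.get("prompt_tokens_details") or {}).get("cached_tokens", 0)
    else:
        # Anthropic shape: reads are a flat field.
        read = usage.get("cache_read_input_tokens", 0)
    return read, write
```

A read count of zero on a request that should have hit the cache usually means the prefix changed or the TTL expired.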
/v1/chat/completions (OpenAI format)
"usage": {
"prompt_tokens": 2104,
"completion_tokens": 147,
"total_tokens": 2251,
"prompt_tokens_details": { "cached_tokens": 1980 },
"cache_creation_input_tokens": 124,
"cache_creation": {
"ephemeral_5m_input_tokens": 124,
"ephemeral_1h_input_tokens": 0
}
}
/v1/messages (Anthropic format)
"usage": {
"input_tokens": 2104,
"output_tokens": 147,
"cache_read_input_tokens": 1980,
"cache_creation_input_tokens": 124,
"cache_creation": {
"ephemeral_5m_input_tokens": 124,
"ephemeral_1h_input_tokens": 0
}
}
/v1beta/.../generateContent (Gemini format)
"usageMetadata": {
"promptTokenCount": 2104,
"candidatesTokenCount": 147,
"totalTokenCount": 2251,
"cachedContentTokenCount": 1980
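}
When caching applies
For Claude models, explicit cache_control markers take effect on both /v1/messages and /v1/chat/completions; place them on system or message content blocks. Many other providers (the OpenAI family, DeepSeek, Gemini) cache automatically: you send no markers, and cached_tokens shows up in the response once you reuse a long enough prefix.
Cache duration: 5 minutes or 1 hour
A cached prefix lives for 5 minutes by default, and every hit refreshes the timer. For a longer-lived prefix, add ttl: "1h" to the marker. The response reports each TTL separately under cache_creation.
"cache_control": { "type": "ephemeral", "ttl": "1h" }
Example: write first, then read
Send the exact same request twice (the caching example above). The first call to see the prefix pays a one-time cache write; identical calls within the TTL pay the much cheaper cache read.
First call, cache write (usage excerpt):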
"usage": {
"input_tokens": 2104,
"output_tokens": 12,
"cache_creation_input_tokens": 1980,
"cache_read_input_tokens": 0
}
Second identical call within the TTL, cache read:
"usage": {
"input_tokens": 2104,
"output_tokens": 12,
"cache_creation_input_tokens": 0,
"cache_read_input_tokens": 1980
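}
Limits and costs
- Claude requires a minimum cacheable prefix (about 1024 tokens; larger for some models). Shorter prefixes are not cached at all.
- Each request may carry at most 4 cache breakpoints, and the cached prefix must be byte-for-byte identical across calls; changing even one character misses the cache.
- Cache writes cost more than regular input (5m ≈ 1.25×, 1h ≈ 2×); reads are much cheaper (≈ 0.1×). See the pricing page for per-model cache prices.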
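To see what those multipliers mean for the write-then-read example, the sketch below expresses input cost in units of one regular input token. It is rough arithmetic, not a billing formula: the multipliers are the approximate values from the list above, the helper name `input_cost_units` is hypothetical, and it assumes that input_tokens includes the cached tokens, which is what the usage samples on this page suggest (1980 + 124 = 2104).

```python
# Approximate multipliers from the list above, relative to one regular input token.
WRITE_5M, WRITE_1H, READ = 1.25, 2.0, 0.1


def input_cost_units(usage):
    """Estimate input cost in regular-input-token units for an Anthropic-shape usage dict."""
    read = usage.get("cache_read_input_tokens", 0)
    split = usage.get("cache_creation") or {}
    write_1h = split.get("ephemeral_1h_input_tokens", 0)
    # Without a TTL split, treat every written token as a 5-minute write.
    write_5m = usage.get("cache_creation_input_tokens", 0) - write_1h
    # ASSUMPTION: input_tokens includes cached tokens, as the samples on this page suggest.
    uncached = usage["input_tokens"] - read - write_5m - write_1h
    return uncached + write_5m * WRITE_5M + write_1h * WRITE_1H + read * READ


first = input_cost_units(
    {"input_tokens": 2104, "cache_creation_input_tokens": 1980, "cache_read_input_tokens": 0}
)
second = input_cost_units(
    {"input_tokens": 2104, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 1980}
)
print(round(first), round(second))  # prints: 2599 322
```

Against an uncached baseline of 2104 units per call, the first call costs roughly 24% more and each later hit roughly 85% less, so under these assumptions the write pays for itself on the first reuse.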
POST /v1/responses
The OpenAI Responses API surface for stateful conversations. It uses the same Bearer / x-api-key authentication. Cache counts appear as input_tokens_details.cached_tokens (reads) plus the flat cache_creation_input_tokens and cache_creation.ephemeral_* fields (writes), on par with /v1/chat/completions.
https://api.airforce/v1/responses
POST /v1beta/models/{model}:generateContent
Google Gemini-compatible endpoint. Works with the official @google/genai SDK and the Gemini CLI by pointing the base URL at https://api.airforce/v1beta. Any routed model works — requests are translated to and from the native Gemini shape, and the model is taken from the URL path (not the body).
https://api.airforce/v1beta/models/{model}:generateContent
Authentication
Pass your Airforce API key any of the three ways Google clients use:
# 1) query parameter (Google default)
?key=sk-air-YOUR_API_KEY
# 2) header
x-goog-api-key: sk-air-YOUR_API_KEY
# 3) bearer token
Authorization: Bearer sk-air-YOUR_API_KEY
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| contents | array | Required | Conversation turns. Each: { role: "user" \| "model", parts: [...] }. A part is { text }, { functionCall: { name, args } }, or { functionResponse: { name, response } }. "model" is Gemini's term for the assistant role. |
| systemInstruction | object | Optional | System prompt: { parts: [{ text }] }. |
| generationConfig | object | Optional | { temperature, maxOutputTokens, topP, stopSequences }, mapped to the canonical sampling parameters. |
| tools | array | Optional | Tool definitions: [{ functionDeclarations: [{ name, description, parameters }] }]. functionDeclarations are flattened across entries. |
| toolConfig | object | Optional | Tool-choice control: { functionCallingConfig: { mode: "AUTO" \| "ANY" \| "NONE" } }. ANY forces a call, NONE disables tools. |
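For callers that already hold an OpenAI-style message list, the mapping onto the fields above could be sketched like this. It covers plain-text system, user, and assistant turns only; the helper name `to_gemini_body` is hypothetical, and tool turns are rejected rather than guessed at.

```python
def to_gemini_body(messages, temperature=None, max_output_tokens=None):
    """Convert OpenAI-style text messages into a generateContent request body."""
    system_parts, contents = [], []
    for message in messages:
        role, part = message["role"], {"text": message["content"]}
        if role == "system":
            system_parts.append(part)  # system prompts move to systemInstruction
        elif role in ("user", "assistant"):
            # "model" is Gemini's term for the assistant role.
            contents.append({"role": "model" if role == "assistant" else "user", "parts": [part]})
        else:
            raise ValueError(f"unsupported role in this sketch: {role}")
    body = {"contents": contents}
    if system_parts:
        body["systemInstruction"] = {"parts": system_parts}
    config = {}
    if temperature is not None:
        config["temperature"] = temperature
    if max_output_tokens is not None:
        config["maxOutputTokens"] = max_output_tokens
    if config:
        body["generationConfig"] = config
    return body
```

Note that the model ID is not part of the body; it goes into the URL path.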
Example
curl "https://api.airforce/v1beta/models/gemini-3.1-pro:generateContent" \
-H "x-goog-api-key: sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [
{"role": "user", "parts": [{"text": "What is the capital of France?"}]}
],
"systemInstruction": {"parts": [{"text": "You are a helpful assistant."}]},
"generationConfig": {"temperature": 0.7, "maxOutputTokens": 256}
}'
Response shape
| Parameter | Type | Required | Description |
|---|---|---|---|
| candidates | array | Optional | Generated turns: [{ content: { role: "model", parts }, finishReason, index }]. Only the first candidate is populated. |
| candidates[].finishReason | string | Optional | "STOP" \| "MAX_TOKENS" \| "SAFETY" \| "OTHER". |
| usageMetadata | object | Optional | { promptTokenCount, candidatesTokenCount, totalTokenCount, cachedContentTokenCount? }. cachedContentTokenCount appears when the upstream reported a cache read. |
| modelVersion | string | Optional | Echo of the requested model. |
{
"candidates": [{
"content": {
"role": "model",
"parts": [{"text": "The capital of France is Paris."}]
},
"finishReason": "STOP",
"index": 0
}],
"usageMetadata": {
"promptTokenCount": 16,
"candidatesTokenCount": 8,
"totalTokenCount": 24
},
"modelVersion": "gemini-3.1-pro"
}
POST /v1beta/models/{model}:streamGenerateContent
Streaming uses the :streamGenerateContent action and returns Server-Sent Events. Each data: line is a full Gemini-shaped chunk (not a delta object); the final chunk carries usageMetadata.
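A client can rebuild the full reply by joining the text parts of each chunk in order, as in this sketch. It assumes each chunk carries only the newly generated text, as in the wire sample in this section; the helper name `collect_gemini_stream` is hypothetical and non-text parts are ignored.

```python
import json


def collect_gemini_stream(lines):
    """Fold Gemini-style SSE lines into (text, finish_reason, usage_metadata)."""
    text, finish_reason, usage = [], None, None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        chunk = json.loads(line[len("data:"):].strip())
        usage = chunk.get("usageMetadata") or usage  # carried by the final chunk
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                text.append(part.get("text", ""))
            finish_reason = candidate.get("finishReason") or finish_reason
    return "".join(text), finish_reason, usage
```

Unlike the OpenAI wire format, there is no [DONE] sentinel to watch for here; the loop simply ends when the connection closes.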
data: {"candidates":[{"content":{"role":"model","parts":[{"text":"The capital"}]},"index":0}],"modelVersion":"gemini-3.1-pro"}
data: {"candidates":[{"content":{"role":"model","parts":[{"text":" is Paris."}]},"index":0}],"modelVersion":"gemini-3.1-pro"}
data: {"candidates":[{"content":{"role":"model","parts":[]},"finishReason":"STOP","index":0}],"usageMetadata":{"promptTokenCount":16,"candidatesTokenCount":8,"totalTokenCount":24}}
List models
The catalog is also exposed in Gemini Model-resource shape so Google clients can enumerate models.
curl https://api.airforce/v1beta/models
Notes: the base URL is https://api.airforce/v1beta (or /v1), not Google's host. The model name comes from the URL path, not the request body. Only the first candidate is returned, and only a subset of Gemini fields is translated; safetySettings and cachedContent are currently ignored. Billing, rate limits and smart routing apply exactly as on /v1/chat/completions.
Errors
Airforce returns standard HTTP status codes and a unified error envelope for both endpoints.
| Status | Type | Description |
|---|---|---|
| 400 | invalid_request_error | Malformed JSON, a missing required field, or an unknown model. |
| 401 | invalid_request_error / auth_required | The API key is missing or invalid. |
| 402 | insufficient_quota | This model requires an active subscription or a positive Pay-as-you-Go balance. |
| 403 | model_access_denied / insufficient_scope | The plan or the per-key permissions deny this request. |
| 404 | model_not_found | The requested model does not exist or you do not have access to it. |
| 429 | rate_limit_error | The request rate or the daily token cap was exceeded. |
| 503 | api_error / moderation_unavailable | All upstream keys for the requested provider failed. |
{
"error": {
"message": "The requested model does not exist or you do not have access to it.",
"type": "model_not_found",
"param": null,
"code": "404"
}
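}
The descriptive slug lives in type. code is the HTTP status as a string (for example "404"), and param is null except for parameter range validation errors, where it names the offending parameter.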
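On the client side, the envelope can be turned into an exception along these lines. The field handling follows the description above; the class name `AirforceError`, the helper `parse_error`, and the choice to treat 429 and 503 as retryable are illustrative assumptions, not documented behavior.

```python
import json

RETRYABLE = {429, 503}  # assumption: rate limits and upstream failures are worth retrying


class AirforceError(Exception):
    """Hypothetical exception type carrying the fields of the error envelope."""

    def __init__(self, status, type_, message, param=None):
        super().__init__(f"{status} {type_}: {message}")
        self.status, self.type, self.param = status, type_, param

    @property
    def retryable(self):
        return self.status in RETRYABLE


def parse_error(body):
    """Build an AirforceError from a JSON error-envelope string."""
    err = json.loads(body)["error"]
    # code is the HTTP status as a string, so convert it for numeric comparisons.
    return AirforceError(int(err["code"]), err["type"], err["message"], err.get("param"))
```

Branching on type rather than on the message text keeps the handling stable if the wording of a message changes.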
Discover models
See the full list of model IDs and their capability flags (vision, tools, reasoning, caching, context length, and more): /docs/api/models.
curl https://api.airforce/v1/models \
-H "Authorization: Bearer sk-air-YOUR_API_KEY"