Zai · Paid · Operational

Glm 5.1

API model name: glm-5.1

Glm 5.1 is Zai's chat model, served through the Api.Airforce unified API. It has a 200K-token context window and supports tool calling, reasoning, prompt caching, and streaming. It is priced at $0.80 per million input tokens and $2.40 per million output tokens; the input price is below the provider's official input rate of $1.40. Access it through the OpenAI-compatible API with one key, alongside 65+ other models on Api.Airforce.

Pricing

Input / 1M tokens: $0.80
Output / 1M tokens: $2.40
Official input rate: $1.40

Api.Airforce price vs. the provider's official rate.
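
As a worked example of the pricing above, the following is a minimal sketch that estimates the cost of a single request from its token counts. The helper name `estimate_cost` is hypothetical, and the sketch assumes every token is billed at the listed rates; it ignores any prompt-caching discount, which this page does not quantify.

```python
from decimal import Decimal

# Prices quoted on this page, in USD per one million tokens.
INPUT_PRICE_PER_M = Decimal("0.80")
OUTPUT_PRICE_PER_M = Decimal("2.40")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# 50,000 input tokens and 2,000 output tokens:
# 0.05 * 0.80 + 0.002 * 2.40 = 0.04 + 0.0048 = 0.0448 USD
print(estimate_cost(50_000, 2_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.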

Specifications

Provider: Zai
Type: chat model
Context window: 200K tokens
Input: text
Output: text

Capabilities

Tool calling, Reasoning, Prompt caching, Streaming
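
To illustrate the tool-calling capability, here is a minimal offline sketch of the request body and of how a returned tool call might be handled. It assumes the endpoint follows the OpenAI chat-completions `tools` schema, which this page implies by describing the API as OpenAI-compatible but does not spell out. The `get_weather` tool, the helper names, and the example tool call are all invented for illustration and are not real model output.

```python
import json


# Hypothetical local tool, invented for illustration.
def get_weather(city: str) -> dict:
    return {"city": city, "forecast": "sunny"}


# Tool description in the OpenAI chat-completions "tools" format (assumed to apply here).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return a canned forecast for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

LOCAL_TOOLS = {"get_weather": get_weather}


def build_request(user_text: str) -> dict:
    """Build the JSON body for a chat-completions request that offers the tool."""
    return {
        "model": "glm-5.1",
        "messages": [{"role": "user", "content": user_text}],
        "tools": TOOLS,
    }


def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call from a response and build the "tool" reply message."""
    fn = tool_call["function"]
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    result = LOCAL_TOOLS[fn["name"]](**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }


# Hand-written stand-in for a tool call in the assumed response shape.
example_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}
print(run_tool_call(example_call))
```

In a real exchange, the message returned by `run_tool_call` would be appended to `messages` after the assistant's tool-call message and sent back in a second request so the model can produce its final answer.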

Use Glm 5.1 via the API

The API is OpenAI-compatible: point any OpenAI SDK at https://api.airforce/v1 and pass glm-5.1 as the model.

cURL
curl https://api.airforce/v1/chat/completions \
  -H "Authorization: Bearer $AIRFORCE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-5.1",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'
Python
import os

from openai import OpenAI

# Read the key from the environment; Python does not expand "$AIRFORCE_API_KEY" inside a string.
client = OpenAI(base_url="https://api.airforce/v1", api_key=os.environ["AIRFORCE_API_KEY"])
r = client.chat.completions.create(
    model="glm-5.1",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(r.choices[0].message.content)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({ baseURL: "https://api.airforce/v1", apiKey: process.env.AIRFORCE_API_KEY });
const r = await client.chat.completions.create({
  model: "glm-5.1",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(r.choices[0].message.content);
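
Streaming is listed among the model's capabilities. The following is a minimal sketch of how a streamed reply could be consumed with the OpenAI Python SDK, assuming the endpoint honors `stream=True` in the same way as the OpenAI API. The helper names `join_deltas`, `stream_reply`, and `fake_chunk` are hypothetical; the last one only builds stand-in chunks so the joining logic can be checked offline.

```python
import os
from types import SimpleNamespace


def join_deltas(chunks) -> str:
    """Concatenate the text deltas from an iterable of streamed chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None or empty
            parts.append(delta)
    return "".join(parts)


def stream_reply(prompt: str) -> str:
    """Request a streamed completion (needs the openai package and an API key)."""
    from openai import OpenAI  # imported here so join_deltas works without it

    client = OpenAI(
        base_url="https://api.airforce/v1",
        api_key=os.environ["AIRFORCE_API_KEY"],
    )
    stream = client.chat.completions.create(
        model="glm-5.1",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return join_deltas(stream)


# Offline check with hand-built stand-ins for streamed chunks.
def fake_chunk(text):
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


print(join_deltas([fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo!")]))  # prints Hello!
```

An interactive client would usually print each delta as it arrives instead of joining them at the end; the joining version is shown here because it is easy to verify without a network call.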


Related models