NVIDIA · Free · Operational

Nemotron Nano 12b V2 Vl

API model name: nemotron-nano-12b-v2-vl

Nemotron Nano 12b V2 Vl is a chat model from NVIDIA, served on the Api.Airforce unified API. It has a 128K-token context window. Beyond text, it accepts images and video as input. Capabilities include vision, tool calling, reasoning, and streaming. It is available on the free tier at no per-token cost. Access it through the OpenAI-compatible API with one key, alongside 65+ other models on Api.Airforce.

Pricing

Input / 1M tokens: Free
Output / 1M tokens: Free

Specifications

Provider: NVIDIA
Type: chat model
Context window: 128K tokens
Max output: 128K tokens
Input: image, text, video
Output: text
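
Because the model lists image input, a request can carry an image next to the text prompt. The following is a minimal sketch, assuming the endpoint accepts OpenAI-style image_url content parts with base64 data URLs; the page itself does not confirm that format. The helper names image_message and describe_image are hypothetical.

```python
import base64
import os


def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build one user message holding a text part and an inline image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # Assumption: the endpoint accepts OpenAI-style image_url parts with data URLs.
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


def describe_image(path: str) -> str:
    """Send a local image to the model and return its reply (needs network and a key)."""
    # Imported lazily so image_message works even without the SDK installed.
    from openai import OpenAI

    client = OpenAI(base_url="https://api.airforce/v1", api_key=os.environ["AIRFORCE_API_KEY"])
    with open(path, "rb") as f:
        message = image_message("Describe this image.", f.read())
    r = client.chat.completions.create(model="nemotron-nano-12b-v2-vl", messages=[message])
    return r.choices[0].message.content
```

Calling describe_image("photo.png") would send the file inline. Video input is listed as well, but its request format is not described here, so it is left out of the sketch.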

Capabilities

Vision, Tool calling, Reasoning, Streaming
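
Tool calling generally works by sending function schemas with the request and running whichever function the model asks for. The following is a sketch under the assumption that the endpoint follows the OpenAI tools format; get_weather, TOOLS, REGISTRY, run_tool_call, and tool_result_message are hypothetical names used only for illustration.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; returns canned text for illustration."""
    return f"Sunny in {city}"


# Schema sent to the model so it knows which function exists and what it takes.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Decode the JSON arguments the model produced and call the matching function."""
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    return REGISTRY[name](**json.loads(arguments_json))


def tool_result_message(tool_call_id: str, result: str) -> dict:
    """Wrap a tool result as the follow-up message sent back to the model."""
    return {"role": "tool", "tool_call_id": tool_call_id, "content": result}
```

With the OpenAI SDK, TOOLS would be passed as tools=TOOLS to client.chat.completions.create. Each entry in r.choices[0].message.tool_calls then supplies call.function.name and call.function.arguments for run_tool_call, and call.id for tool_result_message.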

Use Nemotron Nano 12b V2 Vl via the API

The API is OpenAI-compatible: point any OpenAI SDK at https://api.airforce/v1 and pass nemotron-nano-12b-v2-vl as the model.

cURL
curl https://api.airforce/v1/chat/completions \
  -H "Authorization: Bearer $AIRFORCE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nemotron-nano-12b-v2-vl",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'
Python
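import os
from openai import OpenAI
# Read the key from the environment; Python does not expand a literal "$AIRFORCE_API_KEY" string.
client = OpenAI(base_url="https://api.airforce/v1", api_key=os.environ["AIRFORCE_API_KEY"])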
r = client.chat.completions.create(
    model="nemotron-nano-12b-v2-vl",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(r.choices[0].message.content)
JavaScript
import OpenAI from "openai";
const client = new OpenAI({ baseURL: "https://api.airforce/v1", apiKey: process.env.AIRFORCE_API_KEY });
const r = await client.chat.completions.create({
  model: "nemotron-nano-12b-v2-vl",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(r.choices[0].message.content);
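
Streaming is listed among the capabilities, so a reply can be read incrementally instead of waiting for the full completion. A minimal sketch follows, assuming the endpoint honors the OpenAI stream=True parameter and emits standard chat-completion chunks; collect_stream and stream_reply are hypothetical helper names.

```python
import os


def collect_stream(chunks) -> str:
    """Join the text deltas from an iterable of chat-completion chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks carry no choices at all
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # role-only and final chunks have no text
            parts.append(delta)
    return "".join(parts)


def stream_reply(prompt: str) -> str:
    """Request a streamed completion and return the assembled text (needs network and a key)."""
    # Imported lazily so collect_stream can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(base_url="https://api.airforce/v1", api_key=os.environ["AIRFORCE_API_KEY"])
    stream = client.chat.completions.create(
        model="nemotron-nano-12b-v2-vl",
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # assumption: the endpoint supports OpenAI-style streaming
    )
    return collect_stream(stream)
```

To display text as it arrives, print each delta inside the loop rather than joining the parts at the end.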

