Audio
Text-to-speech, speech-to-text, music, sound effects, voice changing, dubbing and voice cloning: one API key, every provider.
Endpoints in this section: /v1/audio/speech, /music, /sound-effects, /transcriptions, /audio-isolation, /voice-changer, /dubbing, /voices, plus /v1/voices/* for cloning.
Text-to-speech
Synthesise speech from text. Returns raw audio bytes with the matching content type (for example audio/mpeg). PCM and µ-law formats include a WAV header, so they can be played in any browser.
https://api.airforce/v1/audio/speech
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | TTS model ID. See /v1/models for IDs with input_modalities containing "text" and output_modalities containing "audio". |
| input | string | Required | Text to synthesise. Long inputs are chunked automatically. |
| voice | string | Required | Voice ID. Use GET /v1/audio/voices to list options. Cloned voices appear here too. |
| response_format | string | Optional | "mp3" (default), "mp3_44100_128", "mp3_44100_192", "pcm_22050", "pcm_24000", "pcm_44100", "ulaw_8000". |
| speed | float | Optional | 0.25 – 4.0. OpenAI-compatible. Some upstream providers ignore this. |
| voice_settings | object | Optional | ElevenLabs-shape: { stability: 0–1, similarity_boost: 0–1, style: 0–1, use_speaker_boost: bool }. |
| language_code | string | Optional | ISO-639-1 hint, e.g. "de", "en", "ja". Improves prosody for multilingual models. |
| seed | integer | Optional | Reproducibility seed where supported. |
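The parameters above can be assembled and checked before a request is sent. The following is a minimal sketch using only the Python standard library; the helper names `build_tts_payload` and `synthesize` are hypothetical, and the range checks simply mirror the limits listed in the table rather than any documented server-side validation.

```python
import json
import urllib.request

API_URL = "https://api.airforce/v1/audio/speech"


def build_tts_payload(model, text, voice, response_format="mp3",
                      speed=None, voice_settings=None):
    """Assemble the JSON body, checking the ranges given in the table."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    if speed is not None:
        if not 0.25 <= speed <= 4.0:
            raise ValueError("speed must be between 0.25 and 4.0")
        payload["speed"] = speed
    if voice_settings is not None:
        # stability, similarity_boost and style are documented as 0-1.
        for key in ("stability", "similarity_boost", "style"):
            value = voice_settings.get(key)
            if value is not None and not 0 <= value <= 1:
                raise ValueError(f"{key} must be between 0 and 1")
        payload["voice_settings"] = voice_settings
    return payload


def synthesize(api_key, payload):
    """POST the payload and return (audio_bytes, content_type)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read(), response.headers.get("Content-Type")
```

Calling `synthesize` performs a real, billable request, so it is kept separate from the payload builder, which can be exercised offline.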
Example
curl https://api.airforce/v1/audio/speech \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
--output speech.mp3 \
-d '{
"model": "elevenlabs-multilingual-v2",
"input": "Willkommen bei Airforce.",
"voice": "21m00Tcm4TlvDq8ikWAM",
"response_format": "mp3_44100_128",
"voice_settings": {"stability": 0.6, "similarity_boost": 0.8}
}'
List voices
Returns every voice you can pass as the voice parameter on TTS, voice-over and audiobook calls. Cloned voices are also returned here once their status is active.
https://api.airforce/v1/audio/voices
curl https://api.airforce/v1/audio/voices \
-H "Authorization: Bearer sk-air-YOUR_API_KEY"
Response structure
| Parameter | Type | Required | Description |
|---|---|---|---|
| voices[] | array | Optional | List of voice descriptors. |
| voices[].id | string | Optional | Provider-native voice identifier. Pass this as "voice". |
| voices[].name | string | Optional | Human-readable name. |
| voices[].category | string | Optional | "premade", "cloned" or "professional". |
| voices[].preview_url | string | Optional | Short audio sample, when the upstream exposes one. |
| voices[].labels | object | Optional | Free-form metadata: gender, language, accent, age, use case. |
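Since the listing mixes premade, cloned and professional voices, client code usually narrows it down before choosing an ID. The sketch below filters a parsed response by category and labels. The function `find_voices` and the sample data are hypothetical, and because labels are free-form, the `language` key used here is an assumption that may not be present for every voice.

```python
def find_voices(listing, category=None, **labels):
    """Return voice descriptors matching a category and label values."""
    matches = []
    for voice in listing.get("voices", []):
        if category is not None and voice.get("category") != category:
            continue
        voice_labels = voice.get("labels") or {}
        if all(voice_labels.get(key) == value for key, value in labels.items()):
            matches.append(voice)
    return matches


# Hypothetical response body, shaped like the table above.
sample = {
    "voices": [
        {"id": "voice-a", "name": "Example A", "category": "premade",
         "labels": {"language": "en"}},
        {"id": "voice-b", "name": "Example B", "category": "cloned",
         "labels": {"language": "de"}},
    ]
}

cloned_ids = [voice["id"] for voice in find_voices(sample, category="cloned")]
print(cloned_ids)  # ['voice-b']
```

Each returned `id` is the value to pass as `voice`.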
Music generation
Generate full music tracks from a text prompt. Returns binary audio.
https://api.airforce/v1/audio/music
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Music model ID, e.g. "music-v1". |
| prompt | string | Required | Style / mood / structure description. |
| duration_seconds | integer | Optional | Track length. Range depends on the model (typically 15–120 s). |
| response_format | string | Optional | "mp3" (default) or provider-native. |
| instrumental | boolean | Optional | When true, suppresses vocals. |
| style | string | Optional | Optional genre tag list, e.g. "EDM, bass, dark". |
curl https://api.airforce/v1/audio/music \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
--output track.mp3 \
-d '{
"model": "music-v1",
"prompt": "Lofi hip-hop beat with soft piano and rain",
"duration_seconds": 60,
"instrumental": true
}'
Sound effects
Short SFX from a text prompt. Same shape as music, just a shorter duration.
https://api.airforce/v1/audio/sound-effects
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | SFX model ID. |
| prompt | string | Required | Effect description, e.g. "thunder rumble fading into rain". |
| duration_seconds | integer | Optional | Length, typically 0.5–22 s. |
| response_format | string | Optional | "mp3" (default). |
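Because the music and sound-effects endpoints take the same core fields, one helper can serve both. This is a small sketch under that assumption; `build_generation_request` and the `kind` keys are hypothetical names, and the function only prepares the URL and JSON body without sending anything.

```python
BASE_URL = "https://api.airforce/v1/audio"
ENDPOINTS = {"music": "/music", "sfx": "/sound-effects"}


def build_generation_request(kind, model, prompt, duration_seconds=None, **extra):
    """Return (url, body) for a music or sound-effects request."""
    if kind not in ENDPOINTS:
        raise ValueError(f"kind must be one of {sorted(ENDPOINTS)}")
    body = {"model": model, "prompt": prompt}
    if duration_seconds is not None:
        body["duration_seconds"] = duration_seconds
    # Endpoint-specific options such as instrumental or style pass through.
    body.update(extra)
    return BASE_URL + ENDPOINTS[kind], body


url, body = build_generation_request(
    "sfx", "sfx-v1", "Distant thunder rolling, then rain", duration_seconds=8
)
```

The body can then be serialised with `json.dumps` and posted with the same headers as the other JSON endpoints.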
curl https://api.airforce/v1/audio/sound-effects \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
--output thunder.mp3 \
-d '{
"model": "sfx-v1",
"prompt": "Distant thunder rolling, then rain",
"duration_seconds": 8
}'
Transcriptions (speech-to-text)
Multipart upload of an audio file. Returns the transcribed text.
https://api.airforce/v1/audio/transcriptions
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Transcription model ID, e.g. "whisper-1". |
| file | binary | Required | Audio file. Supports mp3, wav, m4a, flac, ogg, webm. |
| language | string | Optional | ISO-639-1 language hint. |
| prompt | string | Optional | Optional bias prompt to steer vocabulary (jargon, names). |
| response_format | string | Optional | "json" (default), "text", "srt", "vtt", "verbose_json". |
| temperature | float | Optional | 0–1. Higher allows more creative interpretation of unclear audio. |
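For clients without a multipart-capable HTTP library, the upload can be encoded by hand. The sketch below builds a multipart/form-data body with the Python standard library and posts it to the transcriptions endpoint. The helper names are hypothetical, and treating the default response as a JSON object is an assumption based on the "json" default in the table.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

TRANSCRIPTIONS_URL = "https://api.airforce/v1/audio/transcriptions"


def encode_multipart(fields, files):
    """Encode form fields and files as multipart/form-data.

    fields: list of (name, value) pairs; a name may repeat.
    files: list of (name, filename, content_bytes, mime_type) tuples.
    Returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields:
        parts.append((
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'
        ).encode("utf-8"))
    for name, filename, content, mime_type in files:
        header = (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f'Content-Type: {mime_type}\r\n\r\n'
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(api_key, path, model, **options):
    """Upload an audio file and return the parsed JSON response."""
    with open(path, "rb") as handle:
        content = handle.read()
    mime_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    fields = [("model", model)] + [(k, str(v)) for k, v in options.items()]
    body, content_type = encode_multipart(
        fields, [("file", os.path.basename(path), content, mime_type)]
    )
    request = urllib.request.Request(
        TRANSCRIPTIONS_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

Because `fields` is a list of pairs rather than a dict, the same encoder also covers endpoints that accept a repeated form field.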
curl https://api.airforce/v1/audio/transcriptions \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-F "file=@YOUR_AUDIO_FILE.mp3" \
-F "model=whisper-1" \
-F "language=de" \
-F "response_format=verbose_json"
Response (verbose_json)
{
"text": "Willkommen zum Meeting...",
"language": "de",
"duration": 412.5,
"segments": [
{"id": 0, "start": 0.0, "end": 4.2, "text": "Willkommen zum Meeting", "speaker": "S1"},
{"id": 1, "start": 4.2, "end": 9.8, "text": "...", "speaker": "S2"}
]
}
Audio isolation
Remove background noise from a clip while preserving the foreground voice. Multipart upload; returns audio.
https://api.airforce/v1/audio/audio-isolation
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Isolation model ID. |
| file | binary | Required | Input audio. |
curl https://api.airforce/v1/audio/audio-isolation \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-F "model=isolation-v1" \
-F "file=@YOUR_AUDIO_FILE.mp3" \
--output clean.mp3
Voice changer (speech-to-speech)
Takes input speech and re-renders it in a different voice, preserving timing and inflection.
https://api.airforce/v1/audio/voice-changer
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Voice-change model ID. |
| voice | string | Required | Target voice ID. Same catalog as TTS. |
| file | binary | Required | Input audio. |
| voice_settings | object | Optional | Optional ElevenLabs-shape settings (stability, similarity_boost, …). |
curl https://api.airforce/v1/audio/voice-changer \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-F "model=voice-changer-v1" \
-F "voice=21m00Tcm4TlvDq8ikWAM" \
-F "file=@YOUR_AUDIO_FILE.mp3" \
--output transformed.mp3
Dubbing
Asynchronous dubbing into multiple languages. Returns a task ID immediately; poll for status, then fetch the dubbed audio per target language.
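The create, poll and download flow can be wrapped in a small waiting loop. The sketch below is one way to do it: `wait_for_dubbing` and `fetch_status` are hypothetical helpers, the use of GET on the status URL is an assumption, and so is the "failed" status value, since only "queued" and "completed" appear in the examples in this section.

```python
import json
import time
import urllib.request

DUBBING_URL = "https://api.airforce/v1/audio/dubbing"


def fetch_status(api_key, task_id):
    """Fetch the job document for one task (assumes a plain GET)."""
    request = urllib.request.Request(
        f"{DUBBING_URL}/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def wait_for_dubbing(get_job, task_id, interval=5.0, timeout=600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_job(task_id) until the job completes or the timeout passes."""
    deadline = clock() + timeout
    while True:
        job = get_job(task_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":  # assumed failure value, not documented here
            raise RuntimeError(f"dubbing job {task_id} failed")
        if clock() >= deadline:
            raise TimeoutError(f"dubbing job {task_id} still {status!r}")
        sleep(interval)
```

In real use, `get_job` would be something like `lambda task_id: fetch_status(api_key, task_id)`; passing it in as a parameter keeps the loop testable without network access. Once the job completes, each entry in `available_languages` can be downloaded from the per-language audio URL.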
1. Create job
https://api.airforce/v1/audio/dubbing
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Dubbing model ID. |
| file | binary | Required | Source audio (mp3, wav, m4a). For mp4 video, the audio track is extracted automatically. |
| target_lang | string | Required | Target language code, ISO-639-1. Pass multiple via repeated form fields for multi-language output. |
| source_lang | string | Optional | Source language. Auto-detected when omitted. |
| num_speakers | integer | Optional | Hint for diarization. Auto when omitted. |
| watermark | boolean | Optional | Add an audible watermark to the output. |
curl https://api.airforce/v1/audio/dubbing \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-F "model=dubbing-v1" \
-F "file=@YOUR_AUDIO_FILE.mp3" \
-F "target_lang=de" \
-F "target_lang=es" \
-F "source_lang=en"
{
"task_id": "dub_01HXY...",
"status": "queued"
}
2. Poll status
https://api.airforce/v1/audio/dubbing/:task_id
{
"task_id": "dub_01HXY...",
"status": "completed",
"progress": 100,
"available_languages": ["de", "es"]
}
3. Download per language
https://api.airforce/v1/audio/dubbing/:task_id/audio/:lang
curl https://api.airforce/v1/audio/dubbing/dub_01HXY.../audio/de \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
--output german.mp3
Voice cloning
Clone a voice from short audio samples and reuse it on any speech endpoint. Voice cloning requires explicit consent: fetch the current consent text, hash it, and send the hash with your samples.
1. Fetch consent text
https://api.airforce/v1/voices/consent-text
{
"text": "I confirm that the voice samples I am uploading are either my own voice or a voice I have explicit permission to clone…",
"hash": "9f4b0c8d2e…"
}
2. Create the clone
https://api.airforce/v1/voices/clone
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | Required | Public voice name shown in the library. |
| description | string | Optional | Optional free-text description. |
| consent_hash | string | Required | SHA-256 of the consent paragraph. Fetch the current text via GET /v1/voices/consent-text and pass its hash field. |
| files | binary | Required | 1–25 audio samples. Repeat the form field per file. Total ≤ 200 MB. 30 s – 3 min per clip works best. |
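The consent_hash value can be taken directly from the consent-text response or computed locally. The sketch below shows both options. It assumes the hash is the lowercase hex SHA-256 of the UTF-8 encoded text with no extra normalisation, which is not confirmed here, so the served `hash` field is preferred whenever it is present. The function names are hypothetical.

```python
import hashlib


def sha256_hex(text):
    """Lowercase hex SHA-256 of the UTF-8 encoding of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def pick_consent_hash(consent):
    """Choose the value to send as consent_hash.

    consent is the parsed JSON from GET /v1/voices/consent-text.
    The served hash wins; the local digest is only a fallback, because
    the exact encoding the server hashes is an assumption.
    """
    served = consent.get("hash")
    if served:
        return served
    return sha256_hex(consent["text"])
```

Fetching the consent text shortly before each clone request avoids sending a hash of an outdated paragraph.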
curl https://api.airforce/v1/voices/clone \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-F "name=My voice" \
-F "description=Calm, conversational" \
-F "consent_hash=9f4b0c8d2e..." \
-F "files=@YOUR_SAMPLE_1.mp3" \
-F "files=@YOUR_SAMPLE_2.mp3"
{
"provider_voice_id": "voice_01HXY...",
"name": "My voice",
"description": "Calm, conversational",
"created_at": "2026-05-06T22:30:00Z",
"status": "active",
"provider": "elevenlabs"
}
3. List your library
https://api.airforce/v1/voices/library
curl https://api.airforce/v1/voices/library \
-H "Authorization: Bearer sk-air-YOUR_API_KEY"
| Parameter | Type | Required | Description |
|---|---|---|---|
| voices[].provider_voice_id | string | Optional | Pass as "voice" on TTS / voice-changer endpoints. |
| voices[].status | string | Optional | "active", "errored" or "deleting". |
| voices[].provider | string | Optional | Upstream that hosts the clone. |
| voices[].last_error | string | Optional | Set when status is "errored". |
4. Update / delete
PATCH https://api.airforce/v1/voices/clone/:id
DELETE https://api.airforce/v1/voices/clone/:id
PATCH accepts name and description in a JSON body. DELETE removes the voice both locally and at the upstream provider.
Notes
- Audio responses are returned as raw bytes with the right Content-Type. PCM / µ-law formats are wrapped in a minimal WAV header so they're browser-playable as-is.
- Multipart endpoints (transcriptions, isolation, voice-changer, dubbing, cloning) accept up to 200 MB per request.
- Voice IDs work across providers: a cloned ElevenLabs voice can be passed straight to /v1/audio/voice-changer.
- Cost is metered per character (TTS), per second (music / SFX / dubbing / changer), or per audio minute (transcription). Check X-Cost-Cents on the response.
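Since PCM responses arrive wrapped in a WAV header, they can be inspected with Python's standard `wave` module. The sketch below reads the header of such a response; `describe_wav` is a hypothetical helper. The `wave` module only reads PCM data, so a µ-law response such as "ulaw_8000" would be expected to raise `wave.Error`. The duration is computed from the frame count in the header, which may be unreliable if the header is written before the full length is known.

```python
import io
import wave


def describe_wav(audio_bytes):
    """Read sample rate, channels and duration from WAV-wrapped PCM bytes."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }


# Build one second of 24 kHz mono silence locally to try the helper
# without calling the API.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(24000)
    writer.writeframes(b"\x00\x00" * 24000)

info = describe_wav(buffer.getvalue())
```

For a real response, pass the bytes returned by a speech request made with a PCM response_format such as "pcm_24000".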