Audio
Text-to-speech, speech-to-text, music, sound effects, voice changing, dubbing, and voice cloning: one API key, every provider.
Endpoints in this section: /v1/audio/speech, /music, /sound-effects, /transcriptions, /audio-isolation, /voice-changer, /dubbing, /voices, plus /v1/voices/* for cloning.
Text-to-speech
Synthesises speech from text. Returns raw audio bytes with a matching Content-Type (for example, audio/mpeg). PCM and µ-law formats include a WAV header, so they can be played in any browser.
https://api.airforce/v1/audio/speech
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | TTS model ID. See /v1/models for IDs with input_modalities containing "text" and output_modalities containing "audio". |
| input | string | Required | Text to synthesise. Long inputs are chunked automatically. |
| voice | string | Required | Voice ID. Use GET /v1/audio/voices to list options. Cloned voices appear here too. |
| response_format | string | Optional | "mp3" (default), "mp3_44100_128", "mp3_44100_192", "pcm_22050", "pcm_24000", "pcm_44100", "ulaw_8000". |
| speed | float | Optional | 0.25 – 4.0. OpenAI-compatible. Some upstream providers ignore this. |
| voice_settings | object | Optional | ElevenLabs-shape: { stability: 0–1, similarity_boost: 0–1, style: 0–1, use_speaker_boost: bool }. |
| language_code | string | Optional | ISO-639-1 hint, e.g. "de", "en", "ja". Improves prosody for multilingual models. |
| seed | integer | Optional | Reproducibility seed where supported. |
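As a rough illustration of how the parameters above fit together, the following Python sketch builds a request body and checks the documented ranges before sending. The helper names `build_tts_payload` and `synthesize` are hypothetical, and the client-side validation is an assumption for convenience; the API may enforce these limits differently.

```python
import json
import urllib.request

API_URL = "https://api.airforce/v1/audio/speech"

# Values copied from the response_format row of the parameter table.
RESPONSE_FORMATS = {
    "mp3", "mp3_44100_128", "mp3_44100_192",
    "pcm_22050", "pcm_24000", "pcm_44100", "ulaw_8000",
}


def build_tts_payload(model, text, voice, response_format="mp3", speed=None,
                      voice_settings=None, language_code=None, seed=None):
    """Check the documented ranges and return a JSON-serialisable body."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if speed is not None and not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    for key in ("stability", "similarity_boost", "style"):
        value = (voice_settings or {}).get(key)
        if value is not None and not 0 <= value <= 1:
            raise ValueError(f"voice_settings.{key} must be between 0 and 1")

    payload = {"model": model, "input": text, "voice": voice,
               "response_format": response_format}
    optional = {"speed": speed, "voice_settings": voice_settings,
                "language_code": language_code, "seed": seed}
    # Leave out anything the caller did not set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def synthesize(api_key, payload):
    """POST the payload and return the raw audio bytes (network call)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Keeping payload construction separate from the network call makes the validation easy to exercise without an API key.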
Example
curl https://api.airforce/v1/audio/speech \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
--output speech.mp3 \
-d '{
"model": "elevenlabs-multilingual-v2",
"input": "Willkommen bei Airforce.",
"voice": "21m00Tcm4TlvDq8ikWAM",
"response_format": "mp3_44100_128",
"voice_settings": {"stability": 0.6, "similarity_boost": 0.8}
}'
Voice list
Returns every voice you can pass as the "voice" parameter on TTS / dubbing / audiobook calls. Cloned voices also appear here once their status is active.
https://api.airforce/v1/audio/voices
curl https://api.airforce/v1/audio/voices \
-H "Authorization: Bearer sk-air-YOUR_API_KEY"
Response structure
| Parameter | Type | Required | Description |
|---|---|---|---|
| voices[] | array | Optional | List of voice descriptors. |
| voices[].id | string | Optional | Provider-native voice identifier. Pass this as "voice". |
| voices[].name | string | Optional | Human-readable name. |
| voices[].category | string | Optional | "premade" | "cloned" | "professional". |
| voices[].preview_url | string | Optional | Short audio sample, when the upstream exposes one. |
| voices[].labels | object | Optional | Free-form metadata: gender, language, accent, age, use case. |
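A small Python sketch of how a client might pick voices out of this response. It assumes the JSON body has already been parsed into a dict; `find_voices` is a hypothetical helper, and the label keys and values in the usage note are illustrative, since labels are free-form.

```python
def find_voices(listing, category=None, **labels):
    """Return voice descriptors matching the category and every given label."""
    matches = []
    for voice in listing.get("voices", []):
        if category is not None and voice.get("category") != category:
            continue
        # labels is free-form and may be absent, so fall back to an empty dict.
        voice_labels = voice.get("labels") or {}
        if all(voice_labels.get(key) == value for key, value in labels.items()):
            matches.append(voice)
    return matches
```

For example, `find_voices(listing, category="cloned", language="de")` would return only cloned voices whose labels carry `"language": "de"`; each result's `id` can then be passed as "voice".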
Music generation
Generates a complete music track from a text prompt. Returns binary audio.
https://api.airforce/v1/audio/music
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Music model ID, e.g. "music-v1". |
| prompt | string | Required | Style / mood / structure description. |
| duration_seconds | integer | Optional | Track length. Range depends on the model (typically 15–120 s). |
| response_format | string | Optional | "mp3" (default) or provider-native. |
| instrumental | boolean | Optional | When true, suppresses vocals. |
| style | string | Optional | Genre tag list, e.g. "EDM, bass, dark". |
curl https://api.airforce/v1/audio/music \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
--output track.mp3 \
-d '{
"model": "music-v1",
"prompt": "Lofi hip-hop beat with soft piano and rain",
"duration_seconds": 60,
"instrumental": true
}'
Sound effects
Short SFX from a text prompt. Same request shape as music, only with shorter durations.
https://api.airforce/v1/audio/sound-effects
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | SFX model ID. |
| prompt | string | Required | Effect description, e.g. "thunder rumble fading into rain". |
| duration_seconds | integer | Optional | Length, typically 0.5–22 s. |
| response_format | string | Optional | "mp3" (default). |
curl https://api.airforce/v1/audio/sound-effects \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-H "Content-Type: application/json" \
--output thunder.mp3 \
-d '{
"model": "sfx-v1",
"prompt": "Distant thunder rolling, then rain",
"duration_seconds": 8
}'
Transcription (speech-to-text)
Multipart upload of an audio file. Returns the transcribed text.
https://api.airforce/v1/audio/transcriptions
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Transcription model ID, e.g. "whisper-1". |
| file | binary | Required | Audio file. Supports mp3, wav, m4a, flac, ogg, webm. |
| language | string | Optional | ISO-639-1 language hint. |
| prompt | string | Optional | Bias prompt to steer vocabulary (jargon, names). |
| response_format | string | Optional | "json" (default), "text", "srt", "vtt", "verbose_json". |
| temperature | float | Optional | 0–1. Higher allows more creative interpretation of unclear audio. |
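When response_format is "verbose_json", the response carries timed segments, as in the sample response in this section. The Python sketch below merges consecutive segments from the same speaker into turns. `speaker_turns` is a hypothetical helper, and it assumes each segment has the `start`, `end`, `text`, and `speaker` fields shown in that sample; other models may omit `speaker`.

```python
def speaker_turns(transcription):
    """Merge consecutive same-speaker segments into speaker turns."""
    turns = []
    for segment in transcription.get("segments", []):
        speaker = segment.get("speaker")  # may be None if not diarized
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["end"] = segment["end"]
            turns[-1]["text"] += " " + segment["text"]
        else:
            turns.append({
                "speaker": speaker,
                "start": segment["start"],
                "end": segment["end"],
                "text": segment["text"],
            })
    return turns
```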
curl https://api.airforce/v1/audio/transcriptions \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-F "file=@audio.mp3" \
-F "model=whisper-1" \
-F "language=de" \
-F "response_format=verbose_json"
Response (verbose_json)
{
"text": "Willkommen zum Meeting...",
"language": "de",
"duration": 412.5,
"segments": [
{"id": 0, "start": 0.0, "end": 4.2, "text": "Willkommen zum Meeting", "speaker": "S1"},
{"id": 1, "start": 4.2, "end": 9.8, "text": "...", "speaker": "S2"}
]
}
Audio isolation
Removes background noise from a clip while preserving the foreground voice. Multipart upload; returns audio.
https://api.airforce/v1/audio/audio-isolation
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Isolation model ID. |
| file | binary | Required | Input audio. |
curl https://api.airforce/v1/audio/audio-isolation \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-F "model=isolation-v1" \
-F "file=@noisy.mp3" \
--output clean.mp3
Voice changer (speech-to-speech)
Takes speech input and re-renders it in a different voice while preserving timing and inflection.
https://api.airforce/v1/audio/voice-changer
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Voice-change model ID. |
| voice | string | Required | Target voice ID. Same catalog as TTS. |
| file | binary | Required | Input audio. |
| voice_settings | object | Optional | ElevenLabs-shape settings (stability, similarity_boost, …). |
curl https://api.airforce/v1/audio/voice-changer \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-F "model=voice-changer-v1" \
-F "voice=21m00Tcm4TlvDq8ikWAM" \
-F "file=@input.mp3" \
--output transformed.mp3
Dubbing
Asynchronous multi-language dubbing. Returns a task ID immediately; poll for status, then fetch the dubbed audio for each target language.
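The polling step of this flow could be sketched in Python as follows. The status values "queued" and "completed" come from the sample responses in this section; the failure status names, the polling interval, and the timeout are assumptions, and `fetch_status` and `wait_for_dub` are hypothetical helper names.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.airforce/v1/audio/dubbing"

# Assumption: the docs only show "queued" and "completed"; these failure
# names are guesses and may not match what the API actually returns.
FAILURE_STATES = {"failed", "errored"}


def fetch_status(api_key, task_id):
    """GET the job status and return the parsed JSON (network call)."""
    request = urllib.request.Request(
        f"{BASE_URL}/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_dub(get_status, interval=5.0, timeout=900.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until the job completes, fails, or times out."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        state = status.get("status")
        if state == "completed":
            return status
        if state in FAILURE_STATES:
            raise RuntimeError(f"dubbing job ended with status {state!r}")
        if clock() >= deadline:
            raise TimeoutError("dubbing job did not complete in time")
        sleep(interval)
```

A caller would typically run `wait_for_dub(lambda: fetch_status(api_key, task_id))` and then download one file per entry in `available_languages`. Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.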
1. Create job
https://api.airforce/v1/audio/dubbing
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Dubbing model ID. |
| file | binary | Required | Source audio (mp3, wav, m4a). For mp4 video, the audio track is extracted automatically. |
| target_lang | string | Required | Target language code, ISO-639-1. Pass multiple via repeated form fields for multi-language output. |
| source_lang | string | Optional | Source language. Auto-detected when omitted. |
| num_speakers | integer | Optional | Hint for diarization. Auto when omitted. |
| watermark | boolean | Optional | Add an audible watermark to the output. |
curl https://api.airforce/v1/audio/dubbing \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-F "model=dubbing-v1" \
-F "file=@source.mp3" \
-F "target_lang=de" \
-F "target_lang=es" \
-F "source_lang=en"
{
"task_id": "dub_01HXY...",
"status": "queued"
}
2. Poll status
https://api.airforce/v1/audio/dubbing/:task_id
{
"task_id": "dub_01HXY...",
"status": "completed",
"progress": 100,
"available_languages": ["de", "es"]
}
3. Download per language
https://api.airforce/v1/audio/dubbing/:task_id/audio/:lang
curl https://api.airforce/v1/audio/dubbing/dub_01HXY.../audio/de \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
--output german.mp3
Voice cloning
Clone a voice from short audio samples and reuse it on every speech endpoint. Voice cloning requires explicit consent: fetch the current consent text, hash it, and submit the hash with your samples.
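A minimal Python sketch of the hashing step. It assumes the consent response has already been parsed into a dict with `text` and `hash` fields, as in the sample response in this section. It also assumes the hash is a lowercase hex SHA-256 digest of the UTF-8 text with no extra normalisation, which the page does not confirm. `sha256_hex` and `resolve_consent_hash` are hypothetical helpers.

```python
import hashlib


def sha256_hex(text):
    """Hex SHA-256 digest of the text, encoded as UTF-8 (an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def resolve_consent_hash(consent):
    """Return (hash_to_submit, matches_local_digest).

    The served hash is what gets submitted; the boolean only reports
    whether a locally computed digest agrees with it.
    """
    served = consent["hash"]
    return served, sha256_hex(consent["text"]) == served.lower()
```

If the local digest does not match, the served `hash` field is the safer value to send as consent_hash, since the exact canonical form of the text is not specified here.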
1. Fetch consent text
https://api.airforce/v1/voices/consent-text
{
"text": "I confirm that the voice samples I am uploading are either my own voice or a voice I have explicit permission to clone…",
"hash": "9f4b0c8d2e…"
}
2. Create the clone
https://api.airforce/v1/voices/clone
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | Required | Public voice name shown in the library. |
| description | string | Optional | Free-text description. |
| consent_hash | string | Required | SHA-256 of the consent paragraph. Fetch the current text via GET /v1/voices/consent-text and pass its hash field. |
| files | binary | Required | 1–25 audio samples. Repeat the form field per file. Total ≤ 200 MB. 30 s – 3 min per clip works best. |
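The file limits above can be checked before uploading. The Python sketch below is one way to do that. `check_clone_samples` is a hypothetical helper, and reading "200 MB" as 200 × 10^6 bytes is an assumption, chosen because it is the more conservative interpretation.

```python
import os

MAX_SAMPLES = 25
# Assumption: "200 MB" is read as 200 * 10**6 bytes (conservative reading).
MAX_TOTAL_BYTES = 200 * 10**6


def check_clone_samples(paths, getsize=os.path.getsize):
    """Return the total size in bytes, or raise ValueError on a limit breach."""
    paths = list(paths)
    if not 1 <= len(paths) <= MAX_SAMPLES:
        raise ValueError(f"expected 1-{MAX_SAMPLES} samples, got {len(paths)}")
    total = sum(getsize(path) for path in paths)
    if total > MAX_TOTAL_BYTES:
        raise ValueError(f"samples total {total} bytes, limit is {MAX_TOTAL_BYTES}")
    return total
```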
curl https://api.airforce/v1/voices/clone \
-H "Authorization: Bearer sk-air-YOUR_API_KEY" \
-F "name=My voice" \
-F "description=Calm, conversational" \
-F "consent_hash=9f4b0c8d2e..." \
-F "files=@sample1.mp3" \
-F "files=@sample2.mp3"
{
"provider_voice_id": "voice_01HXY...",
"name": "My voice",
"description": "Calm, conversational",
"created_at": "2026-05-06T22:30:00Z",
"status": "active",
"provider": "elevenlabs"
}
3. List your library
https://api.airforce/v1/voices/library
curl https://api.airforce/v1/voices/library \
-H "Authorization: Bearer sk-air-YOUR_API_KEY"
| Parameter | Type | Required | Description |
|---|---|---|---|
| voices[].provider_voice_id | string | Optional | Pass as "voice" on TTS / voice-changer endpoints. |
| voices[].status | string | Optional | "active" | "errored" | "deleting". |
| voices[].provider | string | Optional | Upstream that hosts the clone. |
| voices[].last_error | string | Optional | Set when status is "errored". |
4. Update / delete
PATCH https://api.airforce/v1/voices/clone/:id
DELETE https://api.airforce/v1/voices/clone/:id
PATCH accepts name and description in a JSON body. DELETE removes the voice both locally and at the upstream provider.
Notes
- Audio responses are returned as raw bytes with the right Content-Type. PCM / µ-law formats are wrapped in a minimal WAV header so they're browser-playable as-is.
- Multipart endpoints (transcriptions, isolation, voice-changer, dubbing, cloning) accept up to 200 MB per request.
- Voice IDs work across providers: a cloned ElevenLabs voice can be passed straight to /v1/audio/voice-changer.
- Cost is metered per character (TTS), per second (music / SFX / dubbing / changer), or per audio minute (transcription). Check X-Cost-Cents on the response.
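A small Python sketch for reading the cost header mentioned in the last note. It assumes the header value is a plain decimal number of cents, which this page does not specify. `cost_cents` is a hypothetical helper that accepts any mapping of response headers.

```python
from decimal import Decimal, InvalidOperation


def cost_cents(headers):
    """Return the X-Cost-Cents value as a Decimal, or None if absent/unparseable."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "x-cost-cents":
            try:
                return Decimal(value.strip())
            except InvalidOperation:
                return None
    return None
```

Using `Decimal` rather than `float` avoids rounding drift when many per-request costs are summed.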