openai-whisper-api
OpenAI Audio Transcriptions API via curl; gpt-4o-transcribe, mini, diarize, or whisper-1.
---
name: openai-whisper-api
description: "OpenAI Audio Transcriptions API via curl; gpt-4o-transcribe, mini, diarize, or whisper-1."
homepage: https://platform.openai.com/docs/guides/speech-to-text
metadata:
  {
    "openclaw":
      {
        "emoji": "🌐",
        "requires": { "bins": ["curl", "node"], "env": ["OPENAI_API_KEY"] },
        "primaryEnv": "OPENAI_API_KEY",
        "install":
          [
            {
              "id": "brew",
              "kind": "brew",
              "formula": "curl",
              "bins": ["curl"],
              "label": "Install curl (brew)",
            },
          ],
      },
  }
---
# OpenAI transcriptions API
Transcribe audio through `/v1/audio/transcriptions`. Set `OPENAI_BASE_URL` for an OpenAI-compatible proxy or local gateway.
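For orientation, the request behind the script can be sketched with plain curl. This is a minimal sketch, not the contents of `transcribe.sh`: the helper names `transcriptions_url` and `transcribe_raw` are hypothetical, and it assumes that `OPENAI_BASE_URL`, when set, already ends in `/v1`.

```bash
#!/usr/bin/env bash
# Sketch only: transcriptions_url and transcribe_raw are hypothetical helpers,
# not functions taken from transcribe.sh.

# Build the endpoint URL.
# Assumption: OPENAI_BASE_URL, when set, already includes the /v1 prefix.
transcriptions_url() {
  local base="${OPENAI_BASE_URL:-https://api.openai.com/v1}"
  # Drop one trailing slash so the joined URL has no double slash.
  printf '%s/audio/transcriptions\n' "${base%/}"
}

# POST one audio file as multipart form data and print the plain-text transcript.
# Aborts with an error message if OPENAI_API_KEY is unset or empty.
transcribe_raw() {
  local file="$1" model="${2:-gpt-4o-transcribe}"
  curl -sS "$(transcriptions_url)" \
    -H "Authorization: Bearer ${OPENAI_API_KEY:?OPENAI_API_KEY is not set}" \
    -F "file=@${file}" \
    -F "model=${model}" \
    -F "response_format=text"
}
```

With the functions sourced, `transcribe_raw /path/to/audio.m4a` sends the file to the hosted API, and `OPENAI_BASE_URL=http://localhost:8080/v1 transcribe_raw /path/to/audio.m4a` would target a local gateway at that (hypothetical) address.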
## Quick start
```bash
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a
```
Defaults:
- Model: `gpt-4o-transcribe`
- Output: `<input>.txt`
## Useful flags
```bash
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model gpt-4o-transcribe --out /tmp/transcript.txt
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model gpt-4o-mini-transcribe
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model gpt-4o-transcribe-diarize --json
{baseDir}/scripts/transcribe.sh /path/to/audio.ogg --model whisper-1
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --language en
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --prompt "Speaker names: Peter, Daniel"
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a --json --out /tmp/transcript.json
```
Notes:
- Supported upload formats include `mp3`, `mp4`, `mpeg`, `mpga`, `m4a`, `wav`, `webm`.
- 25 MB upload limit on the hosted API.
- Use `gpt-4o-transcribe-diarize` for speaker labels; with that model the script sends `chunking_strategy=auto` and rejects `--prompt`.
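The size limit and the diarize restriction in the notes above can be checked before uploading. The following is a rough sketch of such pre-flight checks, not the script's actual logic: `MAX_UPLOAD_BYTES`, `check_upload_size`, and `check_prompt_allowed` are hypothetical names, and it assumes "25 MB" means 25 × 1024 × 1024 bytes.

```bash
#!/usr/bin/env bash
# Sketch only: these names are hypothetical and not taken from transcribe.sh.

# Assumption: the 25 MB limit is interpreted as 25 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES=$((25 * 1024 * 1024))

# Succeeds when the file exists and does not exceed the upload limit.
check_upload_size() {
  local file="$1" size
  [ -f "$file" ] || { echo "not a file: $file" >&2; return 1; }
  # Strip the padding some wc implementations add around the byte count.
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -gt "$MAX_UPLOAD_BYTES" ]; then
    echo "too large: $size bytes (limit $MAX_UPLOAD_BYTES)" >&2
    return 1
  fi
}

# Fails when a non-empty prompt is combined with the diarize model.
check_prompt_allowed() {
  local model="$1" prompt="$2"
  if [ "$model" = "gpt-4o-transcribe-diarize" ] && [ -n "$prompt" ]; then
    echo "--prompt is not supported with $model" >&2
    return 1
  fi
}
```

A caller might run `check_upload_size "$file" && check_prompt_allowed "$model" "$prompt"` before invoking the transcription request, so that oversized files and unsupported flag combinations fail locally instead of after an upload.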
## API key
Set `OPENAI_API_KEY`, or configure it in the active OpenClaw config file (`$OPENCLAW_CONFIG_PATH`, default `~/.openclaw/openclaw.json`). Optionally set `OPENAI_BASE_URL`:
```json5
{
  skills: {
    "ope…
```