
Whisper V3 Turbo

accounts/fireworks/models/whisper-v3-turbo

Serverless · Audio

Whisper large-v3-turbo is a fine-tuned version of a pruned Whisper large-v3. In other words, it is the same model, except that the number of decoding layers has been reduced from 32 to 4. As a result, the model is significantly faster, at the expense of a minor quality degradation.

Serverless API

Whisper V3 Turbo is available via Fireworks' Speech-to-Text APIs, where you are billed based on the duration of the transcribed audio. The API supports multiple languages and additional features, including forced alignment.

You can call the Fireworks Speech-to-Text API using HTTP requests from any language; see the API reference for the full set of endpoints and parameters.


API Examples

Generate a transcription with whisper-v3-turbo using the speech-to-text endpoint. See the API reference for all supported parameters.

import requests

# Upload the audio file as multipart/form-data to the transcription endpoint.
with open("audio.mp3", "rb") as f:
    response = requests.post(
        "https://audio-turbo.us-virginia-1.direct.fireworks.ai/v1/audio/transcriptions",
        headers={"Authorization": "Bearer <YOUR_API_KEY>"},
        files={"file": f},
        data={
            "model": "whisper-v3-turbo",  # model to use for transcription
            "temperature": "0",           # deterministic decoding
            "vad_model": "silero"         # voice-activity-detection model
        },
    )

if response.status_code == 200:
    print(response.json())  # transcription result as JSON
else:
    print(f"Error: {response.status_code}", response.text)