OpenAI /
Whisper V3 Turbo
accounts/fireworks/models/whisper-v3-turbo
Serverless
Audio
Whisper large-v3-turbo is a fine-tuned version of a pruned Whisper large-v3. In other words, it is the same model, except that the number of decoding layers has been reduced from 32 to 4. As a result, the model is significantly faster, at the cost of a minor quality degradation.
Whisper V3 Turbo is available via Fireworks' Speech-to-Text APIs, where you are billed based on the duration of the transcribed audio. The API supports multiple languages and additional features, including forced alignment.
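Because billing is based on the duration of the transcribed audio, you can estimate cost before sending a request. A minimal sketch; the per-minute rate below is a placeholder, not Fireworks' actual price (check the Fireworks pricing page for current rates):

```python
def estimate_transcription_cost(duration_seconds: float, rate_per_minute: float) -> float:
    """Estimate the cost of transcribing a clip billed by audio duration.

    rate_per_minute is a hypothetical placeholder -- substitute the
    current price from the Fireworks pricing page.
    """
    return (duration_seconds / 60.0) * rate_per_minute

# A 90-second clip at a hypothetical $0.001/minute:
cost = estimate_transcription_cost(90, 0.001)  # -> 0.0015
```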
You can call the Fireworks Speech-to-Text API over HTTP from any language; see the API reference below:
Generate a transcription using the speech-transcription endpoint of whisper-v3-turbo (API reference).
```python
import requests

# Upload the audio file to the transcription endpoint as multipart form data.
with open("audio.mp3", "rb") as f:
    response = requests.post(
        "https://audio-turbo.us-virginia-1.direct.fireworks.ai/v1/audio/transcriptions",
        headers={"Authorization": "Bearer <YOUR_API_KEY>"},
        files={"file": f},
        data={
            "model": "accounts/fireworks/models/whisper-v3-turbo",
            "temperature": "0",
            "vad_model": "silero",
        },
    )

if response.status_code == 200:
    print(response.json())
else:
    print(f"Error: {response.status_code}", response.text)
```
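A successful response is JSON. Assuming an OpenAI-compatible response shape with a top-level `text` field (verify against the Fireworks API reference), a minimal sketch for pulling out the transcript:

```python
def extract_transcript(payload: dict) -> str:
    # Assumes an OpenAI-compatible transcription response with a
    # top-level "text" field; raises KeyError if the shape differs.
    return payload["text"]

# Example with a stub payload mirroring the assumed shape:
sample = {"text": "Hello world.", "language": "en"}
print(extract_transcript(sample))  # -> Hello world.
```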