Whisper large-v3-turbo is a fine-tuned version of a pruned Whisper large-v3. In other words, it is the same model, except that the number of decoder layers has been reduced from 32 to 4. As a result, the model is significantly faster, at the expense of a minor degradation in quality.
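For reference, below is a minimal sketch of running the model locally with the Hugging Face transformers pipeline; the audio file name is a placeholder, and the usual transformers/torch dependencies (plus ffmpeg for decoding) are assumed.

```python
# Minimal sketch: transcribe an audio file locally with the
# openai/whisper-large-v3-turbo checkpoint. Weights download on first use.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",
)

# "sample.wav" is a placeholder path to any local audio file.
result = asr("sample.wav")
print(result["text"])
```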
Run the model immediately on pre-configured GPUs and pay per token.
On-demand deployments give you dedicated GPUs for Whisper V3 Turbo using Fireworks' reliable, high-performance system with no rate limits.
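As a rough illustration of the pay-per-token flow, the sketch below posts an audio file to an OpenAI-style transcriptions endpoint. The endpoint URL, model identifier, and FIREWORKS_API_KEY environment variable are assumptions here, not documented values; consult the Fireworks API docs for the actual ones.

```python
# Hedged sketch of a serverless transcription request. The endpoint URL,
# model name, and env var below are assumptions, not documented values.
import os
import requests

API_KEY = os.environ["FIREWORKS_API_KEY"]  # assumed env var name
URL = "https://api.fireworks.ai/inference/v1/audio/transcriptions"  # assumed endpoint

with open("sample.wav", "rb") as f:  # placeholder audio file
    resp = requests.post(
        URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},
        data={"model": "whisper-v3-turbo"},  # assumed model identifier
    )

resp.raise_for_status()
print(resp.json()["text"])  # OpenAI-style responses return a "text" field
```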
Provider: OpenAI
Status: Available
Price: $0.0009