

Whisper V3 Large

Ready
fireworks/whisper-v3

    Whisper is a state-of-the-art model for automatic speech recognition (ASR) and speech translation, proposed in the paper "Robust Speech Recognition via Large-Scale Weak Supervision" by Alec Radford et al. at OpenAI. Trained on more than 5 million hours of labeled and pseudo-labeled audio, Whisper generalizes well across many datasets and domains in a zero-shot setting.

    Whisper V3 Large API Features

    Serverless

    Docs

    Immediately run the model on pre-configured GPUs and pay per token.
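The serverless endpoint above can be called with a short script. This is a minimal sketch: the endpoint path, model id, and field names follow the OpenAI-style audio API that Fireworks exposes, but treat them as assumptions and confirm them against the Fireworks docs before use.

```python
# Minimal sketch of a serverless transcription request to Whisper V3 Large.
# The URL and field names below are assumptions based on the OpenAI-style
# audio API -- verify them in the Fireworks documentation.

API_KEY = "YOUR_FIREWORKS_API_KEY"  # placeholder; substitute your real key
URL = "https://api.fireworks.ai/inference/v1/audio/transcriptions"  # assumed path

def build_request(model: str = "fireworks/whisper-v3") -> tuple[dict, dict]:
    """Assemble the headers and form fields for a transcription request.

    The audio file itself is attached as a multipart upload at send time,
    so this helper stays free of any file I/O.
    """
    headers = {"Authorization": f"Bearer {API_KEY}"}
    data = {"model": model}
    return headers, data

# To actually send (requires the `requests` package):
#   headers, data = build_request()
#   with open("meeting.wav", "rb") as f:
#       resp = requests.post(URL, headers=headers, data=data, files={"file": f})
#   print(resp.json()["text"])
```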

    On-demand Deployment

    Docs

    On-demand deployments give you dedicated GPUs for Whisper V3 Large using Fireworks' reliable, high-performance system with no rate limits.


    Whisper V3 Large FAQs

    What is Whisper V3 Large and who developed it?

    Whisper V3 Large is a multilingual, Transformer-based automatic-speech-recognition (ASR) and speech-translation model created by OpenAI and hosted on Fireworks AI.

    What applications and use cases does Whisper V3 Large excel at?

    Whisper V3 Large is best suited for:

    • High-accuracy speech transcription
    • Zero-shot speech-to-English translation across 99 languages

    What is the maximum context length for Whisper V3 Large?

    The model's receptive field is 30 seconds of audio per inference window.

    What is the usable context window?

    Fireworks recommends chunking longer audio into 30-second segments (with optional overlap) for stable performance.
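The chunking strategy above can be sketched as a small helper that computes 30-second windows with a configurable overlap. The 2-second default overlap is an illustrative choice, not a Fireworks-prescribed value; the function works on timestamps, so it is independent of any audio library.

```python
# Sketch: split a long recording into 30-second windows with a small overlap,
# as recommended above for stable performance on long audio.

def chunk_spans(total_seconds: float, window: float = 30.0, overlap: float = 2.0):
    """Return (start, end) spans in seconds; consecutive windows share `overlap` seconds."""
    step = window - overlap
    spans = []
    start = 0.0
    while start < total_seconds:
        end = min(start + window, total_seconds)
        spans.append((start, end))
        if end >= total_seconds:
            break
        start += step
    return spans

# Example: a 70-second clip with the default 2-second overlap
# chunk_spans(70) -> [(0.0, 30.0), (28.0, 58.0), (56.0, 70.0)]
```

Each span can then be cut from the source audio and sent as its own transcription request, with the overlap used to stitch transcripts at segment boundaries.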

    Does Whisper V3 Large support quantized formats (4-bit/8-bit)?

    Yes. Sixteen quantized variants of Whisper V3 Large, including 4-bit and 8-bit formats, are supported.
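A rough sense of what those precisions mean for weight storage, using the ~1.54 billion parameter count stated in this FAQ: the figures below are back-of-envelope lower bounds, since real quantized checkpoints keep some layers at higher precision and carry metadata.

```python
# Back-of-envelope weight sizes for ~1.54B parameters at common precisions.
# Treat these as rough lower bounds, not exact file sizes.

PARAMS = 1.54e9  # parameter count from the FAQ above

def approx_size_gb(bits_per_param: float) -> float:
    """Approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{approx_size_gb(bits):.2f} GB")
# fp16: ~3.08 GB, int8: ~1.54 GB, int4: ~0.77 GB
```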

    What are known failure modes of Whisper V3 Large?

    Known limitations of Whisper V3 Large include:

    • Possible hallucinated text
    • Uneven accuracy on low-resource languages or certain accents
    • Occasional repetitive outputs

    How many parameters does Whisper V3 Large have?

    Whisper V3 Large has approximately 1.54 billion parameters.

    What rate limits apply on the shared endpoint?

    The shared serverless endpoint is subject to Fireworks' standard rate limits; on-demand deployments of Whisper V3 Large run on dedicated GPUs with no rate limits.

    Metadata

    State
    Ready
    Created on
    N/A
    Kind
    Unknown
    Provider
    OpenAI

    Specification

    Calibrated
    No
    Mixture-of-Experts
    No
    Parameters
    ~1.54B

    Supported Functionality

    Fine-tuning
    Not supported
    Serverless
    Supported
    Serverless LoRA
    Not supported
    Context Length
    N/A
    Function Calling
    Not supported
    Embeddings
    Not supported
    Rerankers
    Not supported
    Support image input
    Not supported