
Mistral Small 24B Instruct 2501

fireworks/mistral-small-24b-instruct-2501

    Mistral Small 3 (2501) sets a new benchmark in the "small" large language model category (below 70B parameters), packing 24B parameters while delivering state-of-the-art capabilities comparable to those of larger models.

    Mistral Small 24B Instruct 2501 API Features

    Fine-tuning

    Docs

    Mistral Small 24B Instruct 2501 can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.

    On-demand Deployment

    Docs

    On-demand deployments give you dedicated GPUs for Mistral Small 24B Instruct 2501 using Fireworks' reliable, high-performance system with no rate limits.
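    Once a deployment is running, the model is queried through Fireworks' OpenAI-compatible chat completions API. The sketch below is a minimal illustration; the endpoint URL is the standard Fireworks inference URL, and the full model path (`accounts/fireworks/models/...`) is an assumption you should check against your own deployment.

```python
# Minimal sketch of querying Mistral Small 24B Instruct 2501 on Fireworks.
# The model path below is assumed; verify it against your deployment before use.
import os
import requests

url = "https://api.fireworks.ai/inference/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}",
    "Content-Type": "application/json",
}
payload = {
    "model": "accounts/fireworks/models/mistral-small-24b-instruct-2501",  # assumed full path
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the Apache 2.0 license in one sentence."},
    ],
    "max_tokens": 256,
    "temperature": 0.2,
}

resp = requests.post(url, headers=headers, json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```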

    Mistral Small 24B Instruct 2501 FAQs

    What is Mistral Small 24B Instruct 2501 and who developed it?

    Mistral Small 24B Instruct 2501 is an instruction-tuned version of the base Mistral Small 24B model, developed by Mistral AI. It is designed as a high-performance, "small" LLM (under 70B parameters) that competes with much larger models. It supports multilingual tasks and is well-suited for chat, reasoning, and structured output generation.

    What applications and use cases does Mistral Small 24B Instruct 2501 excel at?
    • Conversational AI
    • Code assistance
    • Agentic systems
    • Search and Enterprise RAG
    • Tool calling (via vLLM; see the sketch below)
    • Multilingual tasks

    Its performance is validated across generalist, reasoning, and coding benchmarks.
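    For the tool-calling use case, the model is served through a vLLM endpoint rather than the Fireworks-native API (see the failure-modes FAQ below). The following is a hedged sketch, assuming a local vLLM server exposing the OpenAI-compatible API on port 8000, with a hypothetical get_weather tool defined purely for illustration.

```python
# Sketch of tool calling against a vLLM server hosting the model.
# The server URL, served model name, and get_weather tool are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-24B-Instruct-2501",  # whatever name the vLLM server registered
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```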

    What is the maximum context length for Mistral Small 24B Instruct 2501?

    The model supports a context window of 32,768 tokens.
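    To stay within that limit, prompt length can be checked before sending a request. Below is a rough sketch that counts tokens with the Hugging Face tokenizer; the repo id is inferred from the Hugging Face entry in the metadata and should be adjusted if your copy differs.

```python
# Rough sketch: count prompt tokens and leave headroom for the completion.
# Assumes the tokenizer published under mistralai/Mistral-Small-24B-Instruct-2501.
from transformers import AutoTokenizer

CONTEXT_WINDOW = 32_768
MAX_COMPLETION = 1_024  # illustrative completion budget

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Small-24B-Instruct-2501")

def fits_in_context(prompt: str) -> bool:
    """Return True if the prompt plus the completion budget fits in the window."""
    n_prompt_tokens = len(tokenizer.encode(prompt))
    return n_prompt_tokens + MAX_COMPLETION <= CONTEXT_WINDOW

print(fits_in_context("Summarize the following report: ..."))
```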

    What is the usable context window for Mistral Small 24B Instruct 2501?

    The full 32.8K token context window is available on Fireworks' on-demand deployments with no rate limits.

    What are known failure modes of Mistral Small 24B Instruct 2501?
    • No image input, embeddings, or reranker support
    • Function calling is supported, but only in vLLM-compatible setups, not in Fireworks-native API
    • Requires 55–60 GB GPU RAM for FP16 inference
    • Does not support streaming responses

    How many parameters does Mistral Small 24B Instruct 2501 have?

    The model has 23.6 billion parameters.

    Is fine-tuning supported for Mistral Small 24B Instruct 2501?

    Yes. Fireworks supports LoRA-based fine-tuning through its RFT infrastructure.
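    As a minimal sketch of what a chat-style training file might look like, the snippet below writes a JSONL file in the common "messages" format; confirm the exact schema expected by the Fireworks fine-tuning docs before uploading.

```python
# Sketch: write a small chat-format JSONL training file for LoRA fine-tuning.
# The "messages" schema here is a common convention; confirm the exact format
# required by the Fireworks fine-tuning docs before uploading.
import json

examples = [
    {
        "messages": [
            {"role": "user", "content": "What is our refund window?"},
            {"role": "assistant", "content": "Refunds are accepted within 30 days of purchase."},
        ]
    },
    {
        "messages": [
            {"role": "user", "content": "Do you ship internationally?"},
            {"role": "assistant", "content": "Yes, we ship to most countries; see the shipping page for rates."},
        ]
    },
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example, ensure_ascii=False) + "\n")
```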

    How are tokens counted (prompt vs completion)?

    Fireworks charges based on combined input + output token usage.
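    A quick sketch of what that means for per-request cost, using a hypothetical per-million-token price (check the Fireworks pricing page for the actual rate):

```python
# Sketch: estimate cost for a single request when input and output tokens
# are billed together. PRICE_PER_1M_TOKENS is a hypothetical placeholder.
PRICE_PER_1M_TOKENS = 0.90  # USD, illustrative only

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Combined input + output tokens, priced per million."""
    total = prompt_tokens + completion_tokens
    return total / 1_000_000 * PRICE_PER_1M_TOKENS

print(f"${request_cost(1_200, 400):.6f}")
```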

    What rate limits apply on the shared endpoint?
    • Serverless: Not supported
    • On-demand: Available with no rate limits using dedicated GPUs

    What license governs commercial use of Mistral Small 24B Instruct 2501?

    The model is released under the Apache 2.0 license, which permits commercial use.

    Metadata

    State: Ready
    Created on: 1/30/2025
    Kind: Base model
    Provider: Mistral
    Hugging Face: Mistral-Small-24B-Instruct-2501

    Specification

    Calibrated: No
    Mixture-of-Experts: No
    Parameters: 23.6B

    Supported Functionality

    Fine-tuning: Supported
    Serverless: Not supported
    Serverless LoRA: Supported
    Context Length: 32.8k tokens
    Function Calling: Not supported
    Embeddings: Not supported
    Rerankers: Not supported
    Image input: Not supported