

DeepSeek R1 (Fast)

fireworks/deepseek-r1

    DeepSeek R1 (Fast) is the speed-optimized serverless deployment of DeepSeek-R1. Compared to the DeepSeek R1 (Basic) endpoint, R1 (Fast) delivers faster speeds at a higher per-token price; see https://fireworks.ai/pricing for details. Both endpoints serve the identical model, so there are no quality or quantization differences. DeepSeek-R1 is a state-of-the-art large language model optimized with reinforcement learning and cold-start data for exceptional reasoning, math, and code performance, and it is identical to the model uploaded by DeepSeek on Hugging Face. Note that fine-tuning for this model is available only by contacting Fireworks at https://fireworks.ai/company/contact-us.
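    As a rough sketch, a request to the serverless endpoint can be built as an OpenAI-compatible chat-completions payload. The model identifier and endpoint path below are assumptions based on Fireworks' public naming conventions; confirm them against your account before use.

```python
import json

# Hypothetical payload for Fireworks' OpenAI-compatible chat completions
# endpoint (POST https://api.fireworks.ai/inference/v1/chat/completions).
payload = {
    "model": "accounts/fireworks/models/deepseek-r1",  # assumed serverless model id
    "messages": [
        {"role": "user", "content": "Prove that the square root of 2 is irrational."}
    ],
    "temperature": 0.6,   # recommended default for DeepSeek R1 (see FAQ)
    "max_tokens": 32768,  # model's maximum generation length
}
body = json.dumps(payload)
```

    The serialized `body` would be sent with an `Authorization: Bearer <API key>` header; the snippet stops short of the network call so the structure can be inspected offline.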

    DeepSeek R1 (Fast) API Features

    Fine-tuning

    Docs

    DeepSeek R1 (Fast) can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.

    On-demand Deployment

    Docs

    On-demand deployments give you dedicated GPUs for DeepSeek R1 (Fast) using Fireworks' reliable, high-performance system with no rate limits.

    DeepSeek R1 FAQs

    What is DeepSeek R1 (Fast) and who developed it?

    DeepSeek R1 (Fast) is a serverless, speed-optimized deployment of DeepSeek-R1 hosted by Fireworks AI. It serves the same model as DeepSeek R1 (Basic), with faster inference at a higher per-token cost. The underlying model, DeepSeek-R1, was developed by DeepSeek and is optimized for advanced reasoning, math, and code generation using a reinforcement-learning-first training approach.

    What applications and use cases does DeepSeek R1 excel at?

    DeepSeek R1 excels at:

    • Multi-step reasoning and logical inference
    • Mathematical problem-solving (e.g., 97.3% on MATH-500)
    • Advanced code generation (2,029 Elo on Codeforces-like tasks)
    • Scientific question answering
    • High-stakes decision-making workflows

    What is the maximum context length for DeepSeek R1?

    The maximum context length is 163,840 tokens.

    Does DeepSeek R1 support quantized formats (4-bit/8-bit)?

    Yes. DeepSeek R1 has multiple quantized variants, including 4-bit and 8-bit options.

    What is the default temperature of DeepSeek R1 on Fireworks AI?

    The recommended default sampling temperature for DeepSeek R1 is 0.6, as used in benchmark evaluations.

    What is the maximum output length for DeepSeek R1?

    The maximum generation length is 32,768 tokens.
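    Putting the two limits above together (163,840-token context window, 32,768-token generation cap), a minimal sketch of budgeting the completion size for a given prompt:

```python
MAX_CONTEXT = 163_840  # total context window, in tokens
MAX_OUTPUT = 32_768    # maximum generation length, in tokens

def max_completion_tokens(prompt_tokens: int) -> int:
    """Largest completion that fits both the context window and the output cap."""
    if prompt_tokens >= MAX_CONTEXT:
        raise ValueError("prompt exceeds the context window")
    return min(MAX_OUTPUT, MAX_CONTEXT - prompt_tokens)
```

    For short prompts the output cap of 32,768 tokens is the binding limit; only prompts longer than about 131k tokens start eating into the available completion budget.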

    What are known failure modes of DeepSeek R1?

    Known issues include:

    • Repetitive or incoherent output if temperature is too low or system prompt is misused
    • The model may skip chain-of-thought reasoning unless prompted with <think>, which can reduce performance on reasoning tasks
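    Assuming the model wraps its reasoning in <think>...</think> tags as described above, a minimal sketch of separating the chain-of-thought from the final answer in a response:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a response into (reasoning, answer) on <think>...</think> tags."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No reasoning block found: treat the whole response as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

sample = "<think>2 + 2 equals 4.</think>The answer is 4."
reasoning, answer = split_reasoning(sample)
```

    This kind of post-processing is useful when only the final answer should be shown to end users while the reasoning is logged separately.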
    How many parameters does DeepSeek R1 have?

    • Total parameters: 671 billion
    • Activated per forward pass: 37 billion

    DeepSeek R1 uses a Mixture of Experts (MoE) architecture to reduce active compute while maintaining model capacity.
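    As a quick check on the numbers above, the MoE routing activates only about 5.5% of the total parameters on each forward pass:

```python
TOTAL_PARAMS = 671e9   # total parameters across all experts
ACTIVE_PARAMS = 37e9   # parameters activated per forward pass

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"{active_fraction:.1%}")  # → 5.5%
```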

    Is fine-tuning supported for DeepSeek R1?

    Yes. Fireworks supports fine-tuning DeepSeek R1 using LoRA-based adapters. Contact Fireworks for access.

    What license governs commercial use of DeepSeek R1?

    DeepSeek R1 is licensed under the MIT License, which permits commercial use, modification, and redistribution.

    Metadata

    State
    Ready
    Created on
    1/20/2025
    Kind
    Base model
    Provider
    DeepSeek
    Hugging Face
    DeepSeek-R1

    Specification

    Calibrated
    Yes
    Mixture-of-Experts
    Yes
    Parameters
    671B

    Supported Functionality

    Fine-tuning
    Supported
    Serverless
    Not supported
    Serverless LoRA
    Not supported
    Context Length
    163.8k tokens
    Function Calling
    Not supported
    Embeddings
    Not supported
    Rerankers
    Not supported
    Support image input
    Not supported