
DeepSeek V3.1
fireworks/deepseek-v3p1

    DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long-context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases. The 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended by 3.3x to 209B tokens. Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats.

    DeepSeek V3.1 API Features

    Fine-tuning

    Docs

    DeepSeek V3.1 can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.
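
    A minimal sketch of preparing a fine-tuning dataset, assuming the common chat-style JSONL layout (one "messages" array per line); confirm the exact schema and upload flow in the Fireworks fine-tuning docs before training.

```python
import json

# Minimal sketch of a chat-style fine-tuning dataset. The JSONL layout
# below (one {"messages": [...]} object per line) is an assumption based
# on the common OpenAI-style chat format, not a confirmed Fireworks schema.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a concise support assistant."},
            {"role": "user", "content": "How do I rotate my API key?"},
            {"role": "assistant", "content": "Open Settings -> API Keys and click Rotate."},
        ]
    },
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```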

    Serverless

    Docs

    Immediately run the model on pre-configured GPUs and pay per token.
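
    A minimal serverless request sketch using the OpenAI-compatible Python client; the base URL and the fully qualified model ID are assumptions inferred from the model slug above, so verify both against the Serverless docs.

```python
# Minimal serverless request sketch via the OpenAI-compatible endpoint.
# The base URL and the full model ID are assumptions inferred from the
# model slug on this page; verify both in the Fireworks docs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # assumed endpoint
    api_key="YOUR_FIREWORKS_API_KEY",
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-v3p1",  # assumed full model ID
    messages=[
        {"role": "user", "content": "Summarize DeepSeek V3.1's long-context training in two sentences."}
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```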

    On-demand Deployment

    Docs

    On-demand deployments give you dedicated GPUs for DeepSeek V3.1 using Fireworks' reliable, high-performance system with no rate limits.

    Available Serverless

    Run queries immediately, pay only for usage

    $0.56 input / $1.68 output per 1M tokens
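
    At these rates, a workload of 200K input tokens and 50K output tokens would cost roughly 0.2 × $0.56 + 0.05 × $1.68 ≈ $0.112 + $0.084 ≈ $0.20.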

    DeepSeek V3.1 FAQs

    What is DeepSeek V3.1 and who developed it?

    DeepSeek V3.1 is a hybrid large language model (LLM) developed by DeepSeek AI. It is a post-trained variant of DeepSeek V3.1-Base, which itself builds on the original V3 base through a two-phase long context extension process.

    What applications and use cases does DeepSeek V3.1 excel at?

    DeepSeek V3.1 is optimized for:

    • Conversational AI
    • Code assistance
    • Agentic systems
    • Enterprise RAG (retrieval-augmented generation)
    • Multimodal workflows (though not natively multimodal)

    Its dual-mode architecture ("thinking" and "non-thinking" chat modes) enables high performance in both fast inference tasks and complex agentic behaviors.

    What is the maximum context length for DeepSeek V3.1?

    The maximum context length on Fireworks AI is 163,840 tokens.

    What is the usable context window for DeepSeek V3.1?

    The base model went through 32K and 128K long-context extension phases during training; on Fireworks, the full context window of up to 163,840 tokens is usable.

    Does DeepSeek V3.1 support quantized formats (4-bit/8-bit)?

    Yes. The model supports multiple quantizations, and its weights and activations are trained using the UE8M0 FP8 format.

    Does DeepSeek V3.1 support function-calling schemas?

    Function-calling is supported (see the example sketch after this list), including:

    • Custom tools
    • Code agents
    • Search agents
    • Multi-turn tool use
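
    A hedged sketch of a single-turn tool call through the OpenAI-compatible API; the "get_weather" tool is hypothetical, and the base URL and model ID are the same assumed values as in the serverless example above.

```python
# Hedged sketch of a single-turn tool call through the OpenAI-compatible
# API. The "get_weather" tool is hypothetical and for illustration only.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # assumed endpoint
    api_key="YOUR_FIREWORKS_API_KEY",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-v3p1",  # assumed full model ID
    messages=[{"role": "user", "content": "What's the weather in Oslo right now?"}],
    tools=tools,
)
# If the model decides to call the tool, the structured call arrives here.
print(response.choices[0].message.tool_calls)
```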

    How many parameters does DeepSeek V3.1 have?

    • Total parameters: 685 billion
    • Activated during inference: 37 billion

    Is fine-tuning supported for DeepSeek V3.1?

    Yes. Fireworks supports fine-tuning via LoRA for this model.

    What license governs commercial use of DeepSeek V3.1?

    DeepSeek V3.1 is licensed under the MIT License, which permits commercial use.

    Metadata

    State: Ready
    Created on: 8/21/2025
    Kind: Base model
    Provider: DeepSeek
    Hugging Face: DeepSeek-V3.1

    Specification

    Calibrated: Yes
    Mixture-of-Experts: Yes
    Parameters: 671B

    Supported Functionality

    Fine-tuning: Supported
    Serverless: Supported
    Serverless LoRA: Not supported
    Context Length: 163.8k tokens
    Function Calling: Supported
    Embeddings: Not supported
    Rerankers: Not supported
    Image input: Not supported