DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases. The 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended by 3.3x to 209B tokens. Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats.
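For readers unfamiliar with the format: under the OCP microscaling (MX) convention, a UE8M0 scale is an exponent-only byte that encodes a power-of-two scale factor, 2^(E - 127), shared by a block of FP8 values. The sketch below uses hypothetical helper names (it is not DeepSeek or Fireworks code) to show how such a scale decodes and applies:

```python
import numpy as np

E8M0_BIAS = 127          # OCP MX convention: scale value = 2**(E - bias)
E8M0_NAN = 0xFF          # the all-ones byte is reserved as NaN

def decode_ue8m0_scale(byte_value: int) -> float:
    """Decode one UE8M0 (unsigned, exponent-only) scale byte to a float.

    Hypothetical helper: returns the power-of-two scale 2**(E - 127).
    """
    if byte_value == E8M0_NAN:
        return float("nan")
    return float(2.0 ** (byte_value - E8M0_BIAS))

def dequantize_block(fp8_block: np.ndarray, scale_byte: int) -> np.ndarray:
    """Dequantize a block of FP8 values that share one UE8M0 scale.

    `fp8_block` is assumed to already hold the decoded element values;
    the shared block scale multiplies every element.
    """
    return fp8_block.astype(np.float32) * decode_ue8m0_scale(scale_byte)

if __name__ == "__main__":
    block = np.array([0.5, -1.0, 1.5, 2.0], dtype=np.float32)
    print(dequantize_block(block, 130))  # scale = 2**(130 - 127) = 8.0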
Fine-tuning: DeepSeek V3.1 can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.
Serverless: Immediately run the model on pre-configured GPUs and pay per token.
On-demand Deployment: On-demand deployments give you dedicated GPUs for DeepSeek V3.1 using Fireworks' reliable, high-performance system with no rate limits.
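As a quick serverless illustration, a request to the OpenAI-compatible chat completions endpoint might look like the sketch below; the model identifier is an assumption, so copy the exact id from the Fireworks model page:

```python
import os
import requests

# Minimal serverless request sketch. The model identifier below is an
# assumption -- use the exact id shown on the Fireworks model page.
MODEL_ID = "accounts/fireworks/models/deepseek-v3p1"

response = requests.post(
    "https://api.fireworks.ai/inference/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": "Summarize FP8 microscaling in one sentence."}],
        "max_tokens": 256,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```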
DeepSeek V3.1 is a hybrid large language model (LLM) developed by DeepSeek AI. It is a post-trained variant of DeepSeek V3.1-Base, which itself builds on the original V3 base through a two-phase long context extension process.
DeepSeek V3.1 is optimized for both fast inference and complex agentic workloads such as tool calling and multi-step reasoning: its dual-mode architecture ("thinking" and "non-thinking" chat modes) enables high performance in quick-turnaround tasks as well as complex agentic behaviors.
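For the open weights, mode selection happens through the chat template rather than through separate checkpoints. The sketch below assumes the published Hugging Face chat template accepts a `thinking` flag for selecting the mode; if your tokenizer version differs, inspect its chat template directly:

```python
from transformers import AutoTokenizer

# Sketch only: builds the prompt text for each mode without loading the model.
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")

messages = [{"role": "user", "content": "What is 17 * 24?"}]

# Thinking mode: the generation prefix opens a <think> block for reasoning.
thinking_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, thinking=True
)

# Non-thinking mode: the prefix closes the think block immediately,
# so the model answers directly.
direct_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, thinking=False
)

print(thinking_prompt)
print(direct_prompt)
```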
The maximum context length on Fireworks AI is 163,840 tokens.
During training, the base model went through 32K and 128K long-context extension phases; on Fireworks, requests can use up to 163,840 tokens.
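As a rough illustration of budgeting against that limit (the numbers are illustrative, not Fireworks-specific):

```python
# Rough sketch: keep prompt + completion inside the 163,840-token window.
MAX_CONTEXT = 163_840

def completion_budget(prompt_tokens: int, desired_completion: int) -> int:
    """Return a max_tokens value that fits within the serving context window."""
    remaining = MAX_CONTEXT - prompt_tokens
    if remaining <= 0:
        raise ValueError("Prompt already exceeds the context window; truncate it first.")
    return min(desired_completion, remaining)

print(completion_budget(prompt_tokens=150_000, desired_completion=32_000))  # -> 13840
```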
The model supports multiple quantizations, and it is trained with the UE8M0 FP8 scale data format for compatibility with microscaling data formats.
Function calling (tool use) is supported for this model on Fireworks, as sketched in the example below.
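A sketch of such a request through the OpenAI-compatible API follows; the model id and the `get_weather` tool are illustrative assumptions, not part of Fireworks' catalog:

```python
import json
import os
from openai import OpenAI

# OpenAI-compatible client pointed at Fireworks; the model id is an assumption.
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # illustrative tool, not a real service
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

resp = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-v3p1",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model decides to call the tool, the arguments arrive as a JSON string.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```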
Fireworks supports fine-tuning this model via LoRA.
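Training data for LoRA fine-tuning is typically supplied as JSONL chat records. The sketch below follows the common "messages" convention; verify the exact schema against the Fireworks fine-tuning docs before uploading:

```python
import json

# Each line is one training example as a chat transcript. This mirrors the
# common "messages" JSONL convention; confirm the exact schema in the
# Fireworks fine-tuning documentation before uploading.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a concise support assistant."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Open Settings > Security and choose 'Reset password'."},
        ]
    },
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")
```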
DeepSeek V3.1 is licensed under the MIT License, which permits commercial use.