GLM 5.2 is live! Opus-level intelligence at open-source rates. Pay per token on serverless. Try it today.

Model Library
/Deepseek/DeepSeek V3.1
Deepseek Logo Mark

DeepSeek V3.1

Ready
model path:accounts/fireworks/models/deepseek-v3p1

DeepSeek-V3.1 is post-trained on the top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases. The 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended by 3.3x to 209B tokens. Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats.

DeepSeek V3.1 API Features

Fine-tuning

Docs

DeepSeek V3.1 can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model

On-demand Deployment

Docs

On-demand deployments allow you to use DeepSeek V3.1 on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.

DeepSeek V3.1 FAQs

What is DeepSeek V3.1 and who developed it?

DeepSeek V3.1 is a hybrid large language model (LLM) developed by DeepSeek AI. It is a post-trained variant of DeepSeek V3.1-Base, which itself builds on the original V3 base through a two-phase long context extension process.

What applications and use cases does DeepSeek V3.1 excel at?

DeepSeek V3.1 is optimized for:

  • Conversational AI
  • Code assistance
  • Agentic systems
  • Enterprise RAG (retrieval-augmented generation)
  • Multimodal workflows (though not natively multimodal)

Its dual-mode architecture ("thinking" and "non-thinking" chat modes) enables high performance in both fast inference tasks and complex agentic behaviors.

What is the maximum context length for DeepSeek V3.1?

The maximum context length on Fireworks AI is 163,840 tokens.

What is the usable context window for DeepSeek V3.1?

The base model was trained on 32K and 128K token extensions. Fireworks allows up to 163,840 tokens.

Does DeepSeek V3.1 support quantized formats (4-bit/8-bit)?

Yes. The model supports multiple quantizations, and its weights and activations are trained using the UE8M0 FP8 format.

Does DeepSeek V3.1 support function-calling schemas?

Function-calling is supported, including:

  • Custom tools
  • Code agents
  • Search agents
  • Multi-turn tool use
How many parameters does DeepSeek V3.1 have?
  • Total parameters: 685 billion
  • Activated during inference: 37 billion
Is fine-tuning supported for DeepSeek V3.1?

Yes. Fireworks supports fine-tuning via LoRA for this model.

What license governs commercial use of DeepSeek V3.1?

DeepSeek V3.1 is licensed under the MIT License, which permits commercial use.

Metadata

State
Ready
Created on
8/21/2025
Kind
Base model
Provider
Deepseek

Specification

Calibrated
Yes
Mixture-of-Experts
Yes
Parameters
674B

Supported Functionality

Fine-tuning
Supported
Serverless
Not supported
Context Length
163k tokens
Function Calling
Supported
Embeddings
Not supported
Rerankers
Not supported
Support image input
Not supported