DeepSeek Coder V2 Instruct is a 236-billion-parameter open-source Mixture-of-Experts (MoE) code language model with 21 billion active parameters, developed by DeepSeek AI. Fine-tuned for instruction following, it achieves performance comparable to GPT-4 Turbo on code-specific tasks. Further pre-trained from DeepSeek-V2 on an additional 6 trillion tokens, it strengthens coding and mathematical reasoning, supports 338 programming languages, and extends the context length from 16K to 128K tokens while maintaining strong general language performance.
| Feature | Description |
| --- | --- |
| Fine-tuning (Docs) | DeepSeek Coder V2 Instruct can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
| On-demand deployment (Docs) | On-demand deployments give you dedicated GPUs for DeepSeek Coder V2 Instruct using Fireworks' reliable, high-performance system with no rate limits. |
DeepSeek Coder V2 Instruct is a 236B-parameter Mixture-of-Experts (MoE) instruction-tuned code model developed by DeepSeek AI. It is fine-tuned for instruction-following behavior and achieves performance comparable to GPT-4 Turbo on code and math tasks.
The model is optimized for coding and mathematical reasoning tasks. It supports 338 programming languages and handles long input sequences well.
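On Fireworks, the model is served behind an OpenAI-compatible chat completions API. The sketch below uses the `openai` Python client pointed at Fireworks' inference endpoint; the model identifier is an assumption, so verify it against the model page before use.

```python
# Minimal sketch of querying DeepSeek Coder V2 Instruct through Fireworks'
# OpenAI-compatible chat completions endpoint. The model id below is an
# assumption; confirm the exact identifier on the Fireworks model page.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",  # replace with your key
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-coder-v2-instruct",  # assumed id
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a linked list."}
    ],
    max_tokens=512,
    temperature=0.2,
)
print(response.choices[0].message.content)
```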
The model supports a context length of 131,072 tokens, and the full context window is usable on Fireworks AI infrastructure.
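When working near the 131,072-token limit, it can help to check prompt length before sending a request. This is a sketch only; it assumes the Hugging Face repo `deepseek-ai/DeepSeek-Coder-V2-Instruct` provides the matching tokenizer.

```python
# Sketch: check a prompt against the 131,072-token context limit before sending it.
# Assumes the Hugging Face repo id below hosts the matching tokenizer.
from transformers import AutoTokenizer

MAX_CONTEXT = 131_072

tokenizer = AutoTokenizer.from_pretrained(
    "deepseek-ai/DeepSeek-Coder-V2-Instruct", trust_remote_code=True
)

def fits_in_context(prompt: str, reserved_for_output: int = 1024) -> bool:
    """Return True if the prompt plus a reserved output budget fits the window."""
    n_tokens = len(tokenizer.encode(prompt))
    return n_tokens + reserved_for_output <= MAX_CONTEXT

print(fits_in_context("def quicksort(arr):"))
```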
The model supports quantized versions, including 4-bit and 8-bit variants.
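For self-hosted use, the 4-bit and 8-bit variants typically follow the standard `transformers` + `bitsandbytes` loading pattern. The sketch below illustrates that pattern under the assumption of the Hugging Face repo id above; even at 4-bit, a 236B MoE model still requires multiple high-memory GPUs.

```python
# Illustrative sketch of loading a 4-bit quantized checkpoint with
# transformers + bitsandbytes. This shows the pattern, not a sizing guide:
# even quantized, a 236B MoE model needs several high-memory GPUs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "deepseek-ai/DeepSeek-Coder-V2-Instruct"  # assumed Hugging Face repo id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # shard layers across available GPUs
    trust_remote_code=True,
)
```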
Function calling is not supported.
The model has 236 billion total parameters, with 21 billion active parameters in its MoE setup (8 experts active per forward pass).
Fireworks supports LoRA-based fine-tuning for this model.
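Fine-tuning jobs are usually driven by a chat-format JSONL dataset. The snippet below sketches how such a file might be prepared; the exact `messages` schema is an assumption, so confirm it against the Fireworks fine-tuning docs before uploading.

```python
# Sketch of preparing a chat-format JSONL dataset for LoRA fine-tuning.
# The "messages" field mirrors the chat completions format; treat the exact
# schema as an assumption and confirm it against the fine-tuning docs.
import json

examples = [
    {
        "messages": [
            {"role": "user", "content": "Write a function that checks if a string is a palindrome."},
            {"role": "assistant", "content": "def is_palindrome(s: str) -> bool:\n    return s == s[::-1]"},
        ]
    },
]

with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```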
The model weights are released under a custom DeepSeek Model License, which permits commercial use; the code is released under the MIT License.