
DeepSeek Coder V2 Instruct

Ready
fireworks/deepseek-coder-v2-instruct

    DeepSeek Coder V2 Instruct is a 236-billion-parameter open-source Mixture-of-Experts (MoE) code language model with 21 billion active parameters, developed by DeepSeek AI. Fine-tuned for instruction following, it achieves performance comparable to GPT-4 Turbo on code-specific tasks. Pre-trained on an additional 6 trillion tokens, it enhances coding and mathematical reasoning capabilities, supports 338 programming languages, and extends context length from 16K to 128K while maintaining strong general language performance.
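
    Because the checkpoint is also published openly, the sketch below (illustrative, not taken from this page) shows how the model's instruction chat format can be rendered locally with Hugging Face Transformers; the deepseek-ai/DeepSeek-Coder-V2-Instruct repo id is assumed from the public Hugging Face upload.

    ```python
    # Minimal sketch: render the instruct chat prompt with the model's own chat template.
    # The repo id is the public deepseek-ai upload (an assumption, not stated on this page).
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        "deepseek-ai/DeepSeek-Coder-V2-Instruct", trust_remote_code=True
    )

    messages = [
        {"role": "user", "content": "Write a Python function that checks whether a number is prime."}
    ]

    # apply_chat_template produces the prompt string the instruction-tuned model expects
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
    print(prompt)
    ```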

    DeepSeek Coder V2 Instruct API Features

    Fine-tuning

    Docs

    DeepSeek Coder V2 Instruct can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.
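
    As a hedged illustration of dataset preparation (the file name and rows are placeholders; see the fine-tuning Docs for the authoritative format), training data for LoRA fine-tuning can be written as a JSONL file with one chat conversation per line:

    ```python
    # Sketch: write a chat-style JSONL training file for LoRA fine-tuning.
    # The example rows are illustrative; real datasets should contain your own prompts/responses.
    import json

    examples = [
        {
            "messages": [
                {"role": "user", "content": "Rewrite this loop as a list comprehension: ..."},
                {"role": "assistant", "content": "result = [transform(x) for x in items]"},
            ]
        },
    ]

    with open("train.jsonl", "w") as f:
        for row in examples:
            f.write(json.dumps(row) + "\n")
    ```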

    On-demand Deployment

    Docs

    On-demand deployments give you dedicated GPUs for DeepSeek Coder V2 Instruct using Fireworks' reliable, high-performance system with no rate limits.
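
    A minimal inference sketch follows. It assumes an on-demand deployment of this model is already running and that FIREWORKS_API_KEY is set; Fireworks exposes an OpenAI-compatible endpoint, so the standard openai Python client can be pointed at it. The prompt and sampling parameters are placeholders.

    ```python
    # Sketch: query DeepSeek Coder V2 Instruct through Fireworks' OpenAI-compatible API.
    # Assumes an on-demand deployment is live and FIREWORKS_API_KEY is exported.
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.fireworks.ai/inference/v1",
        api_key=os.environ["FIREWORKS_API_KEY"],
    )

    resp = client.chat.completions.create(
        model="accounts/fireworks/models/deepseek-coder-v2-instruct",
        messages=[{"role": "user", "content": "Write a SQL query that finds duplicate email addresses."}],
        max_tokens=512,
        temperature=0.2,
    )
    print(resp.choices[0].message.content)
    ```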

    DeepSeek Coder V2 Instruct FAQs

    What is DeepSeek Coder V2 Instruct and who developed it?

    DeepSeek Coder V2 Instruct is a 236B-parameter Mixture-of-Experts (MoE) instruction-tuned code model developed by DeepSeek AI. It is fine-tuned for instruction-following behavior and achieves performance comparable to GPT-4 Turbo on code and math tasks.

    What applications and use cases does DeepSeek Coder V2 Instruct excel at?

    The model is optimized for:

    • Code generation and completion
    • Mathematical reasoning
    • Conversational AI for coding assistants
    • Enterprise and tool-integrated RAG systems

    It supports 338 programming languages and handles long input sequences (up to 128K tokens) well.
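
    For the conversational coding-assistant use case, streaming the response token by token keeps the interaction responsive. The sketch below reuses the same OpenAI-compatible client setup as the deployment example above; the system prompt and question are placeholders.

    ```python
    # Sketch: stream a coding-assistant reply (same client setup as the deployment example above).
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.fireworks.ai/inference/v1",
        api_key=os.environ["FIREWORKS_API_KEY"],
    )

    stream = client.chat.completions.create(
        model="accounts/fireworks/models/deepseek-coder-v2-instruct",
        messages=[
            {"role": "system", "content": "You are a concise pair-programming assistant."},
            {"role": "user", "content": "Explain and fix the bug in `items[len(items)]`."},
        ],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
    ```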

    What is the maximum context length for DeepSeek Coder V2 Instruct?

    The model supports a context length of 131,072 tokens on Fireworks.

    What is the usable context window for DeepSeek Coder V2 Instruct?

    The full 131.1K token context window is usable on Fireworks AI infrastructure.

    Does DeepSeek Coder V2 Instruct support quantized formats (4-bit/8-bit)?

    Yes, the model supports quantized versions including 4-bit and 8-bit variants.
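
    For local experimentation outside Fireworks, a 4-bit load of the open-source checkpoint can be sketched with Transformers and bitsandbytes as below. The Hugging Face repo id is assumed from the public deepseek-ai upload, and the 236B checkpoint still requires substantial GPU memory even when quantized.

    ```python
    # Sketch: load the open checkpoint in 4-bit with bitsandbytes (local use, not Fireworks hosting).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "deepseek-ai/DeepSeek-Coder-V2-Instruct"  # public repo id (assumption)

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",
        trust_remote_code=True,
    )
    ```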

    Does DeepSeek Coder V2 Instruct support function-calling schemas?

    No, function calling is not supported.

    How many parameters does DeepSeek Coder V2 Instruct have?

    The model has 236 billion total parameters, with 21 billion active parameters in its MoE setup (8 experts active per forward pass).

    Is fine-tuning supported for DeepSeek Coder V2 Instruct?

    Yes. Fireworks supports LoRA-based fine-tuning for this model.

    What rate limits apply on the shared endpoint?

    DeepSeek Coder V2 Instruct is not offered on the shared serverless endpoint. It is served via on-demand deployments, where dedicated GPUs mean no rate limits apply.

    What license governs commercial use of DeepSeek Coder V2 Instruct?

    The model weights are released under a custom DeepSeek Model License, which allows commercial use. The accompanying code is released under the MIT License.

    Metadata

    State: Ready
    Created on: 7/11/2024
    Kind: Base model
    Provider: Fireworks AI
    Hugging Face: deepseek-coder-v2-instruct

    Specification

    Calibrated: No
    Mixture-of-Experts: Yes
    Parameters: 235.7B

    Supported Functionality

    Fine-tuning: Supported
    Serverless: Not supported
    Serverless LoRA: Not supported
    Context Length: 32.8k tokens
    Function Calling: Not supported
    Embeddings: Not supported
    Rerankers: Not supported
    Image input: Not supported