
DeepSeek Coder 7B Base

Ready
fireworks/deepseek-coder-7b-base

    DeepSeek Coder is a series of code language models, each trained from scratch on 2 trillion tokens with a composition of 87% code and 13% natural language in English and Chinese. The 6.7B Base variant (listed here as 7B) uses Multi-Head Attention and was trained on those 2 trillion tokens with a 16K window size and an additional fill-in-the-blank (fill-in-the-middle) task.
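
    A quick way to try the model is a plain-text completion request. The sketch below targets Fireworks' OpenAI-compatible endpoint; the model identifier follows the usual accounts/fireworks/models/... pattern but is an assumption here, so verify the exact id in your console before relying on it.

```python
# Minimal sketch: raw text completion for a base (non-chat) model.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

# Base models expect plain text, so use the completions endpoint
# rather than chat.completions.
response = client.completions.create(
    model="accounts/fireworks/models/deepseek-coder-7b-base",  # assumed identifier
    prompt="# Python function that checks whether a number is prime\ndef is_prime(n):\n",
    max_tokens=128,
    temperature=0.2,
)
print(response.choices[0].text)
```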

    DeepSeek Coder 7B Base API Features

    Fine-tuning

    DeepSeek Coder 7B Base can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.
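
    Before launching a tuning job, training data typically needs to be packaged as JSONL. The sketch below assumes simple prompt/completion records; the exact schema the Fireworks fine-tuning service expects should be confirmed in its docs.

```python
# Sketch: writing a tiny JSONL dataset of prompt/completion pairs.
# The field names here are an assumption; check the fine-tuning docs
# for the schema the service actually expects.
import json

examples = [
    {
        "prompt": "# Reverse a string\ndef reverse_string(s):\n",
        "completion": "    return s[::-1]\n",
    },
    {
        "prompt": "# Return the nth Fibonacci number\ndef fib(n):\n",
        "completion": "    a, b = 0, 1\n    for _ in range(n):\n        a, b = b, a + b\n    return a\n",
    },
]

with open("code_dataset.jsonl", "w") as f:
    for record in examples:
        f.write(json.dumps(record) + "\n")
```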

    On-demand Deployment

    On-demand deployments give you dedicated GPUs for DeepSeek Coder 7B Base using Fireworks' reliable, high-performance system with no rate limits.
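
    Requests to a dedicated deployment go through the same OpenAI-compatible API; only the model identifier changes to point at your deployment. The deployment-scoped name below is hypothetical, so copy the real identifier from the Fireworks console after the deployment is created.

```python
# Sketch: querying an on-demand (dedicated) deployment. Same client setup
# as for the shared endpoint; only the model name differs.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

response = client.completions.create(
    model="accounts/<your-account>/deployedModels/<deployment-id>",  # hypothetical
    prompt="# Merge two sorted lists\ndef merge(a, b):\n",
    max_tokens=96,
)
print(response.choices[0].text)
```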

    DeepSeek Coder 7B Base FAQs

    What is DeepSeek Coder 7B Base and who developed it?

    DeepSeek Coder 7B Base is a base language model developed by DeepSeek AI as part of its DeepSeek Coder family. The model is trained from scratch on 2 trillion tokens, with a composition of 87% code and 13% natural language in English and Chinese. It uses a fill-in-the-blank auxiliary task during training.
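
    The fill-in-the-blank (fill-in-the-middle) objective means the model can complete code between a given prefix and suffix. The sentinel tokens in the sketch below are the ones documented for the DeepSeek Coder family, but confirm them against the model's tokenizer before relying on them.

```python
# Sketch: building a fill-in-the-middle (FIM) prompt. The sentinel tokens
# are assumed from DeepSeek Coder's published examples; verify them against
# the tokenizer config.
prefix = "def quicksort(arr):\n    if len(arr) <= 1:\n        return arr\n    pivot = arr[0]\n"
suffix = "\n    return quicksort(left) + [pivot] + quicksort(right)\n"

fim_prompt = f"<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>"

# Send fim_prompt as the `prompt` of a regular completion request;
# the model generates the missing middle (the left/right partition code).
print(fim_prompt)
```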

    What applications and use cases does DeepSeek Coder 7B Base excel at?

    This model is optimized for:

    • Code generation and completion
    • Conversational AI
    • Agentic systems
    • Search
    • Enterprise RAG
    • Multimedia reasoning (text-based)

    What is the maximum context length for DeepSeek Coder 7B Base?

    The model supports a context length of 4,096 tokens.

    What is the usable context window for DeepSeek Coder 7B Base?

    The full 4,096-token context window is available on Fireworks' on-demand deployments, which provide dedicated GPU access without rate limits.
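
    When prompts approach the limit, it helps to count tokens client-side before sending a request. The sketch below uses the Hugging Face tokenizer for the upstream checkpoint (deepseek-ai/deepseek-coder-6.7b-base, the repo this listing references) to check that a prompt plus a reserved output budget fits in 4,096 tokens.

```python
# Sketch: client-side context-length check before sending a request.
from transformers import AutoTokenizer

MAX_CONTEXT = 4096          # model context window
RESERVED_FOR_OUTPUT = 256   # room left for the completion

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-base")

def fits_in_context(prompt: str) -> bool:
    n_prompt_tokens = len(tokenizer.encode(prompt))
    return n_prompt_tokens + RESERVED_FOR_OUTPUT <= MAX_CONTEXT

print(fits_in_context("def hello_world():\n    print('hello')\n"))  # True
```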

    What are the known limitations of DeepSeek Coder 7B Base?

    • No image input support
    • No function-calling, reranker, or embedding capabilities
    • Limited context (4K tokens) compared to newer models
    • No safety alignment or moderation layers

    How many parameters does DeepSeek Coder 7B Base have?

    The model has 6.9 billion parameters, rounded to 7B in the model name.

    Is fine-tuning supported for DeepSeek Coder 7B Base?

    Yes. Fireworks supports LoRA-based fine-tuning on dedicated GPUs for this model.

    How are tokens counted (prompt vs completion)?

    Token metering is based on combined input and output tokens.
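
    In other words, a request is metered on the sum of prompt and completion tokens, both of which are reported in the usage block of an OpenAI-compatible response. A minimal illustration with a made-up usage payload:

```python
# Sketch: combined metering = prompt tokens + completion tokens.
# The numbers below are hypothetical; real values come from response.usage.
usage = {"prompt_tokens": 42, "completion_tokens": 128}
total_metered = usage["prompt_tokens"] + usage["completion_tokens"]
print(total_metered)  # 170
```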

    What rate limits apply on the shared endpoint?

    • Serverless (shared endpoint): Not supported for this model
    • On-demand: Supported with no rate limits via dedicated GPU infrastructure

    Metadata

    State: Ready
    Created on: 3/15/2024
    Kind: Base model
    Provider: DeepSeek
    Hugging Face: deepseek-coder-6.7b-base

    Specification

    Calibrated: No
    Mixture-of-Experts: No
    Parameters: 6.9B

    Supported Functionality

    Fine-tuning: Supported
    Serverless: Not supported
    Serverless LoRA: Supported
    Context length: 4,096 tokens
    Function calling: Not supported
    Embeddings: Not supported
    Rerankers: Not supported
    Image input: Not supported