DeepSeek Coder is a series of code language models, each trained from scratch on 2T tokens comprising 87% code and 13% natural language in English and Chinese. deepseek-coder-1.3b-base is a 1.3B-parameter model with Multi-Head Attention trained on 1 trillion tokens.
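As a quick orientation, the sketch below loads the base model directly from Hugging Face with transformers and runs a local code completion. This is illustrative only and separate from Fireworks' serving stack; the model id deepseek-ai/deepseek-coder-1.3b-base and the example prompt are assumptions.

```python
# Minimal sketch: load the base model from Hugging Face and run a code completion.
# Assumes the public model id "deepseek-ai/deepseek-coder-1.3b-base"; adjust as needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# A base (non-chat) model completes raw text, so prompt it with unfinished code.
prompt = "def quicksort(arr):\n    "
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```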
DeepSeek Coder 1.3B Base can be fine-tuned on your own data to improve response quality for your use case. Fireworks uses low-rank adaptation (LoRA), which trains a small set of adapter weights that can be served efficiently at inference time.
See the Fine-tuning guide for details.
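To make the LoRA idea concrete, here is a minimal sketch using the Hugging Face peft library; it is not Fireworks' managed fine-tuning pipeline, and the rank, alpha, and target_modules values below are assumptions chosen for illustration.

```python
# Illustrative LoRA sketch using Hugging Face peft (not Fireworks' managed pipeline).
# Hyperparameters and target_modules below are assumptions for demonstration only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-1.3b-base")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt (assumed names)
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable
```

Because only the adapter matrices are trained, the resulting artifact is small and can be swapped in at inference time without duplicating the base model weights.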
On-demand deployments let you run DeepSeek Coder 1.3B Base on dedicated GPUs using Fireworks' high-performance serving stack, with high reliability and no rate limits.
See the On-demand deployments guide for details.
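Once a deployment is live, it can be queried through Fireworks' OpenAI-compatible API, as sketched below. The model path shown is an assumption; substitute the identifier listed for your own deployment.

```python
# Sketch of querying a deployment through Fireworks' OpenAI-compatible API.
# The model path below is an assumption; use the identifier shown for your deployment.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

# Base models use the completions endpoint rather than chat completions.
response = client.completions.create(
    model="accounts/fireworks/models/deepseek-coder-1-3b-base",
    prompt="# Check whether a number is prime\ndef is_prime(n):",
    max_tokens=128,
    temperature=0.2,
)
print(response.choices[0].text)
```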