DeepSeek R1 (Fast) is the speed-optimized serverless deployment of DeepSeek-R1. Compared to the DeepSeek R1 (Basic) endpoint, R1 (Fast) provides faster speeds at higher per-token prices; see https://fireworks.ai/pricing for details. Identical models are served on the two endpoints, so there are no quality or quantization differences. DeepSeek-R1 is a state-of-the-art large language model optimized with reinforcement learning and cold-start data for exceptional reasoning, math, and code performance. The model is identical to the one uploaded by DeepSeek on Hugging Face. Note that fine-tuning for this model is only available by contacting Fireworks at https://fireworks.ai/company/contact-us.
| Feature | Description |
|---|---|
| Fine-tuning (Docs) | DeepSeek R1 (Fast) can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
| On-demand Deployment (Docs) | On-demand deployments give you dedicated GPUs for DeepSeek R1 (Fast) using Fireworks' reliable, high-performance system with no rate limits. |
DeepSeek R1 is a serverless, speed-optimized deployment of DeepSeek-R1 hosted by Fireworks AI. It uses the same model as DeepSeek R1 (Basic), with faster inference and higher per-token costs. The underlying model, DeepSeek-R1, was developed by DeepSeek and is optimized for advanced reasoning, math, and code generation using a reinforcement learning-first training approach.
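For context, here is a minimal sketch of calling the serverless endpoint through Fireworks' OpenAI-compatible REST API. The model identifier shown is an assumption; check the model page for the exact id of the Fast endpoint.

```python
import os
import requests

resp = requests.post(
    "https://api.fireworks.ai/inference/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"},
    json={
        "model": "accounts/fireworks/models/deepseek-r1",  # assumed id for the Fast endpoint
        "messages": [
            {"role": "user", "content": "Prove that the square root of 2 is irrational."}
        ],
    },
    timeout=600,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```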
DeepSeek R1 excels at advanced reasoning, math, and code generation.
The maximum context length is 163,840 tokens.
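As a rough guide, a request can be pre-checked against this window before it is sent. The sketch below uses a crude characters-per-token heuristic, which is an assumption, not the model's tokenizer.

```python
MAX_CONTEXT_TOKENS = 163_840
CHARS_PER_TOKEN = 4  # crude average for English text; not the model's tokenizer

def estimate_tokens(text: str) -> int:
    """Rough token estimate; use a real tokenizer for a hard guarantee."""
    return len(text) // CHARS_PER_TOKEN + 1

prompt = "Summarize the following document: ..."
if estimate_tokens(prompt) > MAX_CONTEXT_TOKENS:
    raise ValueError("Prompt likely exceeds the 163,840-token context window")
```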
DeepSeek R1 is available in multiple quantized variants, including 4-bit and 8-bit options.
The recommended default sampling temperature for DeepSeek R1 is 0.6, as used in benchmark evaluations.
The maximum generation length is 32,768 tokens.
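Putting the sampling and generation limits together, here is a minimal sketch using the OpenAI Python SDK pointed at Fireworks' OpenAI-compatible endpoint. The model identifier is an assumption, and the `max_tokens` value is an arbitrary choice inside the 32,768-token cap.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="<FIREWORKS_API_KEY>",
)

completion = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-r1",  # assumed id for the Fast endpoint
    messages=[{"role": "user", "content": "What is 17 * 24? Show your reasoning."}],
    temperature=0.6,  # recommended default, as used in benchmark evaluations
    max_tokens=8192,  # any value up to the 32,768-token generation cap
)
print(completion.choices[0].message.content)
```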
Known issues include:
- The model may occasionally skip its thinking pattern and produce output without <think>, which can reduce performance on reasoning tasks. A common mitigation is to enforce a response that begins with <think>; see the parsing sketch below.

DeepSeek R1 uses a Mixture of Experts (MoE) architecture to reduce active compute while maintaining model capacity.
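For the <think> issue noted above, here is a minimal sketch of splitting the reasoning block from the final answer, falling back gracefully when the opening tag is missing:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if <think> was skipped."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

reasoning, answer = split_reasoning("<think>2 + 2 = 4</think>The answer is 4.")
print(answer)  # -> The answer is 4.
```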
Fireworks supports fine-tuning DeepSeek R1 using LoRA-based adapters. Contact Fireworks for access.
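As a hypothetical illustration, a deployed fine-tune would then be queried by its own model id, following Fireworks' accounts/&lt;account&gt;/models/&lt;model&gt; naming pattern; the account and model names below are placeholders.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="<FIREWORKS_API_KEY>",
)

completion = client.chat.completions.create(
    model="accounts/<your-account>/models/<your-fine-tuned-model>",  # placeholder id
    messages=[{"role": "user", "content": "Hello from my fine-tuned model."}],
)
print(completion.choices[0].message.content)
```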
DeepSeek R1 is licensed under the MIT License, which permits commercial use, modification, and redistribution.