DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference by significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation.
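A minimal sketch of the MLA KV-cache idea: instead of caching full per-head keys and values, each token's hidden state is compressed into a small latent vector, only that latent is cached, and keys/values are up-projected from it at attention time. This is a conceptual illustration only, with made-up dimensions; it omits the decoupled RoPE path and query compression described in the DeepSeek-V2 paper.

```python
import torch
import torch.nn as nn

d_model, d_latent, n_heads, d_head = 2048, 512, 16, 128

w_down_kv = nn.Linear(d_model, d_latent, bias=False)        # compress hidden state
w_up_k = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand latent to keys
w_up_v = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand latent to values

hidden = torch.randn(1, 10, d_model)  # (batch, seq_len, d_model)

# Only this latent goes into the KV cache: (batch, seq_len, d_latent),
# far smaller than storing full K and V for every head.
kv_latent = w_down_kv(hidden)

# Keys and values are reconstructed from the cached latent at attention time.
k = w_up_k(kv_latent).view(1, 10, n_heads, d_head)
v = w_up_v(kv_latent).view(1, 10, n_heads, d_head)
```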
DeepSeek V2 Lite Chat can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.
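For context on the LoRA approach mentioned above, the sketch below shows the core idea: the pretrained weight is frozen and only a small low-rank update is trained, which keeps customization and serving cheap. This is purely conceptual with toy dimensions; Fireworks performs the actual fine-tuning service-side.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank (LoRA) update."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)            # pretrained weight stays frozen
        self.lora_a = nn.Linear(d_in, rank, bias=False)   # trainable down-projection
        self.lora_b = nn.Linear(rank, d_out, bias=False)  # trainable up-projection
        nn.init.zeros_(self.lora_b.weight)                # update starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

layer = LoRALinear(2048, 2048)
out = layer(torch.randn(1, 2048))
```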
On-demand deployments give you dedicated GPUs for DeepSeek V2 Lite Chat using Fireworks' reliable, high-performance system with no rate limits.
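A minimal sketch of querying DeepSeek V2 Lite Chat on Fireworks through the OpenAI-compatible API. The model identifier below is an assumption (check the model page for the exact slug), and FIREWORKS_API_KEY must be set in your environment.

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/deepseek-v2-lite-chat",  # assumed model slug
    messages=[{"role": "user", "content": "Summarize what MLA does in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```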
Provider: Deepseek
Context length: 163,840
Availability: Available
Pricing: $0.5