DeepSeek V2 Lite Chat

DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation.

DeepSeek V2 Lite Chat API Features

Fine-tuning Docs	DeepSeek V2 Lite Chat can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model
On-demand Deployment Docs	On-demand deployments give you dedicated GPUs for DeepSeek V2 Lite Chat using Fireworks' reliable, high-performance system with no rate limits.

Metadata

State

Ready

Created on

10/22/2024

Kind

Base model

Provider

Deepseek

Hugging Face

DeepSeek-V2-Lite-Chat

Specification

Calibrated

Mixture-of-Experts

Yes

Parameters

15.7B

Supported Functionality

Fine-tuning

Supported

Serverless

Not supported

Serverless LoRA

Not supported

Context Length

163.8k tokens

Function Calling

Not supported

Embeddings

Not supported

Rerankers

Not supported

Support image input

Not supported

DeepSeek V2 Lite Chat