The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. It uses grouped-query attention (GQA) for faster inference and sliding-window attention (SWA) to handle longer sequences at lower cost.
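To make the sliding-window idea concrete, here is a minimal NumPy sketch of how an SWA mask differs from a full causal mask: each query position may only attend to the most recent `window` key positions rather than the entire prefix. This is an illustrative sketch, not Mistral's actual implementation.

```python
import numpy as np

def sliding_window_causal_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask: True where query i may attend to key j.

    Combines the causal constraint (j <= i) with the sliding-window
    constraint (i - j < window), as used conceptually by SWA.
    """
    i = np.arange(seq_len)[:, None]  # query positions (rows)
    j = np.arange(seq_len)[None, :]  # key positions (columns)
    return (j <= i) & (i - j < window)

# With a window of 3, position 5 attends only to positions 3, 4, and 5,
# so attention cost per token stays O(window) instead of O(seq_len).
mask = sliding_window_causal_mask(seq_len=6, window=3)
```

Information from tokens outside the window still propagates indirectly, since each layer's window is applied on top of the previous layer's outputs.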
| Feature | Description |
| --- | --- |
| Fine-tuning (Docs) | Mistral 7B can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
| On-demand Deployment (Docs) | On-demand deployments let you run Mistral 7B on dedicated GPUs with Fireworks' high-performance serving stack, with high reliability and no rate limits. |
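Fireworks serves models through an OpenAI-compatible chat-completions API. The sketch below builds a request body for such an endpoint; the endpoint URL and model identifier are assumptions for illustration, so check your deployment's actual values before use.

```python
import json

# Assumed endpoint and model identifier; verify against your Fireworks
# account and deployment before sending real requests.
ENDPOINT = "https://api.fireworks.ai/inference/v1/chat/completions"

payload = {
    "model": "accounts/fireworks/models/mistral-7b",  # assumed identifier
    "messages": [
        {"role": "user", "content": "Summarize grouped-query attention in one sentence."}
    ],
    "max_tokens": 128,
}

# Serialize the request body; to send it, POST to ENDPOINT with an
# "Authorization: Bearer <API_KEY>" header (e.g. via the requests library
# or an OpenAI-compatible client pointed at the Fireworks base URL).
body = json.dumps(payload)
```

Because the API is OpenAI-compatible, existing client libraries typically work by overriding only the base URL and API key.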