LLM
Mixtral 8x7B v0.1 is a sparse mixture-of-experts (SMoE) large language model developed by Mistral AI. With 46.7 billion total parameters and 12.9 billion active parameters per token, it outperforms Llama 2 70B and matches GPT-3.5 on many benchmarks while offering efficient inference. The model handles context lengths up to 32k tokens, supports multiple languages including English, French, Italian, German, and Spanish, and excels in code generation tasks. Licensed under Apache 2.0, Mixtral provides a powerful and efficient solution for diverse NLP applications.
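For reference, the sketch below shows one way to query the model through Fireworks' OpenAI-compatible HTTP API from Python. The base URL, model identifier, and response shape here are assumptions for illustration; check the Fireworks API documentation for the exact values to use with your account.

```python
import os
import requests

# Minimal sketch: send a completion request to an OpenAI-compatible endpoint.
# The endpoint URL and model id below are assumptions, not confirmed values.
API_KEY = os.environ["FIREWORKS_API_KEY"]

response = requests.post(
    "https://api.fireworks.ai/inference/v1/completions",  # assumed endpoint
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "accounts/fireworks/models/mixtral-8x7b",  # assumed model id
        "prompt": "Write a short poem about sparse mixture-of-experts models.",
        "max_tokens": 128,
        "temperature": 0.7,
    },
)
response.raise_for_status()
print(response.json()["choices"][0]["text"])
```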
Mixtral 8x7B v0.1 can be fine-tuned on your data to create a model with better response quality. Fireworks uses low-rank adaptation (LoRA) to train a model that can be served efficiently at inference time.
See the Fine-tuning guide for details.
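As a rough illustration of preparing training data for fine-tuning, the snippet below writes a small JSONL file of prompt/completion pairs. The field names and file format are assumptions; the Fine-tuning guide defines the schema Fireworks actually expects.

```python
import json

# Hypothetical sketch: serialize training examples as JSONL.
# The "prompt"/"completion" keys are assumed field names, not a confirmed schema.
examples = [
    {"prompt": "Summarize: The quick brown fox jumps over the lazy dog.",
     "completion": "A fox jumps over a dog."},
    {"prompt": "Translate to French: Good morning.",
     "completion": "Bonjour."},
]

with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```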
On-demand deployments let you run Mixtral 8x7B v0.1 on dedicated GPUs using Fireworks' high-performance serving stack, with high reliability and no rate limits.
See the On-demand deployments guide for details.