Mixtral MoE 8x7B Instruct is the instruction-tuned version of Mixtral MoE 8x7B and has the chat completions API enabled.
| Deployment option | Description |
| --- | --- |
| On-demand deployment (Docs) | On-demand deployments give you dedicated GPUs for Mixtral MoE 8x7B Instruct using Fireworks' reliable, high-performance system with no rate limits. |
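Because the chat completions API is enabled for this model, a call looks like any other OpenAI-compatible chat request. The sketch below assumes the Fireworks endpoint `https://api.fireworks.ai/inference/v1` and the model identifier `accounts/fireworks/models/mixtral-8x7b-instruct`; check your account and deployment for the exact values.

```python
# Minimal sketch: chat completions request against an OpenAI-compatible endpoint.
# The base_url and model identifier are assumptions; verify them for your deployment.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # assumed Fireworks endpoint
    api_key="YOUR_FIREWORKS_API_KEY",
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/mixtral-8x7b-instruct",  # assumed model id
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what a Mixture-of-Experts model is."},
    ],
    max_tokens=256,
    temperature=0.7,
)

print(response.choices[0].message.content)
```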
Mixtral MoE 8x7B Instruct is an instruction-tuned sparse Mixture-of-Experts (MoE) model developed by Mistral AI. It fine-tunes the base Mixtral-8x7B model for conversational and instruction-following tasks.
The model is designed for:
- Conversational, multi-turn chat (served through the chat completions API)
- Instruction-following tasks
It outperforms Llama 2 70B across several benchmarks, per Mistral’s internal evaluations.
The model supports a context window of 32,768 tokens, and Fireworks serves the full window on on-demand GPU deployments with no rate limits.
The model has 46.7 billion total parameters spread across 8 experts of roughly 7B each, but only about 12.9 billion parameters are active per token, because the router selects 2 experts per forward pass.
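To make the sparse activation concrete, here is a toy top-2 routing sketch in plain NumPy. It is not the actual Mixtral implementation; it only illustrates the idea that a router scores all 8 experts per token, runs just the 2 highest-scoring ones, and mixes their outputs with softmax-normalized weights.

```python
# Toy top-2 Mixture-of-Experts routing sketch (NumPy only).
# Illustrates "2 of 8 experts per forward pass"; not the real Mixtral architecture.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

# Router and expert weights (random stand-ins for illustration).
router_w = rng.normal(size=(d_model, n_experts))
expert_w = rng.normal(size=(n_experts, d_model, d_model))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route a single token vector x through its top-2 experts."""
    logits = x @ router_w                      # one routing score per expert
    top = np.argsort(logits)[-top_k:]          # indices of the 2 best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                   # softmax over the selected experts
    # Only the selected experts are evaluated; the other 6 are skipped entirely.
    return sum(w * (x @ expert_w[i]) for w, i in zip(weights, top))

token = rng.normal(size=d_model)
print(moe_forward(token).shape)  # (16,)
```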
Input and output tokens together count toward the 32,768-token context limit.
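As a sketch of how that shared budget works: if the prompt consumes N tokens, at most 32,768 − N tokens remain for the completion. The estimate below uses a crude roughly-4-characters-per-token heuristic rather than Mixtral's actual tokenizer, so treat it as an approximation only.

```python
# Rough sketch of budgeting output tokens within the 32,768-token context window.
# The character-based estimate is a crude approximation, not Mixtral's tokenizer.
CONTEXT_WINDOW = 32_768

def estimate_prompt_tokens(prompt: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return max(1, len(prompt) // 4)

def max_completion_tokens(prompt: str, reserve: int = 64) -> int:
    """Tokens left for the completion after the prompt, minus a safety margin."""
    remaining = CONTEXT_WINDOW - estimate_prompt_tokens(prompt) - reserve
    return max(0, remaining)

prompt = "Explain the difference between dense and sparse Mixture-of-Experts models."
print(max_completion_tokens(prompt))  # remaining output-token budget for this prompt
```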
The model is licensed under the Apache 2.0 license, which permits commercial use and modification.