Qwen3 Next 80B A3B Thinking is a state-of-the-art mixture-of-experts (MoE) language model with 80 billion total parameters, of which roughly 3 billion are activated per token. It features a hybrid attention architecture, supports context lengths of up to 262K tokens, and offers enhanced reasoning capabilities. To ensure sufficient GPU memory capacity, we recommend deploying this model on 2 NVIDIA H200 GPUs or 4 NVIDIA H100 GPUs.
On-demand deployments give you dedicated GPUs for Qwen3 Next 80B A3B Thinking using Fireworks' reliable, high-performance system with no rate limits.
Provider: Qwen
Price: $0.9
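Once a deployment is running, it can be queried through Fireworks' OpenAI-compatible chat completions endpoint. The snippet below is a minimal sketch of such a request; the model identifier (`accounts/fireworks/models/qwen3-next-80b-a3b-thinking`) and the `FIREWORKS_API_KEY` environment variable are assumptions, so substitute the identifier of your own deployment and your actual API key.

```python
# Minimal sketch: query a Qwen3 Next 80B A3B Thinking deployment via
# Fireworks' OpenAI-compatible chat completions API.
import os
import requests

API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
API_KEY = os.environ["FIREWORKS_API_KEY"]  # assumed env var holding your key

payload = {
    # Assumed model id; an on-demand deployment may expose its own identifier.
    "model": "accounts/fireworks/models/qwen3-next-80b-a3b-thinking",
    "messages": [
        {"role": "user", "content": "Explain mixture-of-experts routing in two sentences."}
    ],
    "max_tokens": 512,
    "temperature": 0.6,
}

response = requests.post(
    API_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=120,
)
response.raise_for_status()

# Print the assistant's reply from the first choice.
print(response.json()["choices"][0]["message"]["content"])
```

The same request works with any OpenAI-compatible client by pointing its base URL at `https://api.fireworks.ai/inference/v1`.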