Qwen3 Next 80B A3B Thinking

Qwen3 Next 80B A3B Thinking is a state-of-the-art mixture-of-experts (MoE) language model with 3 billion activated parameters and 80 billion total parameters. It features a hybrid attention architecture, supports contexts up to 262K tokens, and includes enhanced reasoning capabilities. To ensure sufficient GPU memory capacity, we recommend deploying this model on 2 NVIDIA H200 or 4 NVIDIA H100 GPUs.
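Once deployed, the model is served through Fireworks' OpenAI-compatible chat-completions API. The sketch below builds a request payload for it; the model identifier shown is an assumption based on Fireworks' usual `accounts/fireworks/models/...` naming and should be checked against the actual model page.

```python
import json

# Hypothetical model slug -- confirm the exact identifier on the model page.
MODEL_ID = "accounts/fireworks/models/qwen3-next-80b-a3b-thinking"

def build_chat_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Build an OpenAI-compatible chat-completions payload for the Fireworks API."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Explain mixture-of-experts routing in two sentences.")
print(json.dumps(payload, indent=2))
```

Sending this payload as JSON (with an API key in the `Authorization` header) to the chat-completions endpoint returns the model's response, including its reasoning output.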

Fireworks Features

On-demand Deployment

On-demand deployments give you dedicated GPUs for Qwen3 Next 80B A3B Thinking using Fireworks' reliable, high-performance system with no rate limits.

Info & Pricing

Provider

Qwen

Model Type

LLM

Pricing Per 1M Tokens

$0.90
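As a rough guide to what the listed rate means in practice, the sketch below estimates per-request cost. It assumes a flat $0.90 per 1M tokens applied to prompt and completion tokens alike, which matches the single rate shown above; if input and output are billed at different rates, the formula would need separate terms.

```python
PRICE_PER_M_TOKENS = 0.90  # USD per 1M tokens, the serverless rate listed above

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming a flat per-token rate."""
    total = prompt_tokens + completion_tokens
    return total / 1_000_000 * PRICE_PER_M_TOKENS

# e.g. a 200K-token prompt (well within the 262K context) with a 2K-token answer:
print(f"${estimate_cost(200_000, 2_000):.2f}")  # → $0.18
```

Note that on-demand deployments are billed by GPU time rather than per token, so this estimate applies only to serverless usage.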