Qwen3 Next 80B A3B Instruct

Qwen3 Next 80B A3B Instruct is a state-of-the-art mixture-of-experts (MoE) language model with 3 billion activated parameters and 80 billion total parameters. It features a hybrid attention architecture for efficient processing and supports contexts up to 262K tokens. To ensure sufficient GPU memory capacity, we recommend deploying this model on 2 NVIDIA H200 or 4 NVIDIA H100 GPUs.

Qwen3 Next 80B A3B Instruct API Features

On-demand Deployment

Docs

On-demand deployments give you dedicated GPUs for Qwen3 Next 80B A3B Instruct using Fireworks' reliable, high-performance system with no rate limits.

Metadata

State

Ready

Created on

9/16/2025

Kind

Base model

Provider

Qwen

Hugging Face

Qwen3-Next-80B-A3B-Instruct

Specification

Calibrated

Mixture-of-Experts

Parameters

80B

Supported Functionality

Fine-tuning

Not supported

Serverless

Not supported

Context Length

N/A

Function Calling

Not supported

Embeddings

Not supported

Rerankers

Not supported

Support image input

Not supported