GLM 5.2 is live! Opus-level intelligence at open-source rates. Pay per token on serverless. Try it today.

Model Library
/Qwen/Qwen3 Coder 30B A3B Instruct
Quen Logo Mark

Qwen3 Coder 30B A3B Instruct

Ready
model path:accounts/fireworks/models/qwen3-coder-30b-a3b-instruct

Latest Qwen3 coder model, 30B with 3B active parameter model

Qwen3 Coder 30B A3B Instruct API Features

Fine-tuning

Docs

Qwen3 Coder 30B A3B Instruct can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model

On-demand Deployment

Docs

On-demand deployments allow you to use Qwen3 Coder 30B A3B Instruct on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.

Qwen3 Coder 30B A3B Instruct FAQs

What is Qwen3-Coder 30B A3B Instruct and who developed it?

Qwen3-Coder 30B A3B Instruct is a Mixture-of-Experts (MoE) instruction-tuned coding model developed by Qwen (Alibaba Group). It features 30.5 billion total parameters with 3.3 billion active per forward pass, and is trained for advanced code reasoning, agentic systems, and browser-integrated coding tasks.

What applications and use cases does Qwen3-Coder 30B A3B Instruct excel at?

The model is designed for:

  • Code assistance
  • Conversational AI
  • Agentic systems (e.g., Qwen Code, CLINE)
  • Search
  • Enterprise RAG
  • Multimodal reasoning (text-only)
What is the maximum context length for Qwen3-Coder 30B A3B Instruct?

The model natively supports a context window of 262,144 tokens (262.1K).

What is the usable context window for Qwen3-Coder 30B A3B Instruct?

The full 262.1K token window is usable in on-demand deployments on Fireworks, which provide dedicated GPU access.

What is the maximum output length Fireworks allows for Qwen3-Coder 30B A3B Instruct?

The recommended output length is up to 65,536 tokens, constrained by the 262.1K total context limit.

What are known failure modes of Qwen3-Coder 30B A3B Instruct?
  • Function calling is not supported on Fireworks
  • Streaming responses not supported
  • Image input, embeddings, and rerankers are not supported
  • The model does not support “thinking mode” and ignores <think> tokens
How many parameters does Qwen3-Coder 30B A3B Instruct have?
  • Total parameters: 30.5 billion
  • Active per forward pass: 3.3 billion
  • Experts: 128 total, with 8 activated per token generation
Is fine-tuning supported for Qwen3-Coder 30B A3B Instruct?

Yes. Fireworks supports LoRA-based fine-tuning for this model using its RFT (Reserved Fine-Tuning) infrastructure.

How are tokens counted (prompt vs completion)?

Fireworks charges based on total input + output tokens, respecting the 262.1K context limit.

What rate limits apply on the shared endpoint?
  • Serverless: Not supported
  • On-demand: Available with no rate limits via dedicated GPU deployments
What license governs commercial use of Qwen3-Coder 30B A3B Instruct?

The model is released under the Apache 2.0 license, which permits unrestricted commercial use.

Metadata

State
Ready
Created on
8/1/2025
Kind
Base model
Provider
Qwen

Specification

Calibrated
Yes
Mixture-of-Experts
Yes
Parameters
30.5B

Supported Functionality

Fine-tuning
Supported
Serverless
Not supported
Context Length
262k tokens
Function Calling
Not supported
Embeddings
Not supported
Rerankers
Not supported
Support image input
Not supported