

Qwen3 Coder 30B A3B Instruct

fireworks/qwen3-coder-30b-a3b-instruct

    Latest Qwen3 coder model, with 30.5B total parameters and 3.3B active per forward pass
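
    For a quick start, Fireworks exposes an OpenAI-compatible chat completions API. The sketch below is illustrative: the endpoint URL and the accounts/fireworks/models/... model path follow Fireworks' usual conventions but are assumptions here, so verify them against the docs for your account.

    # Minimal sketch: calling Qwen3 Coder 30B A3B Instruct through Fireworks'
    # OpenAI-compatible API. Base URL and model path are assumptions based on
    # Fireworks' usual conventions; verify against the Fireworks docs.
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.fireworks.ai/inference/v1",
        api_key=os.environ["FIREWORKS_API_KEY"],
    )

    response = client.chat.completions.create(
        model="accounts/fireworks/models/qwen3-coder-30b-a3b-instruct",  # assumed path
        messages=[
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": "Write a Python function that reverses a linked list."},
        ],
        max_tokens=4096,  # prompt + completion must fit in the 262,144-token window
        temperature=0.2,
    )
    print(response.choices[0].message.content)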

    Qwen3 Coder 30B A3B Instruct API Features

    Fine-tuning


    Qwen3 Coder 30B A3B Instruct can be customized with your data to improve responses. Fireworks uses LoRA to train and deploy your personalized model efficiently.
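
    As a rough illustration of what a training set looks like, LoRA fine-tuning jobs typically consume chat-formatted JSONL records. The exact schema Fireworks expects is an assumption here; confirm it against the fine-tuning docs.

    # Illustrative sketch: writing a chat-format JSONL dataset for LoRA
    # fine-tuning. The "messages" layout mirrors the common OpenAI-style
    # schema; whether Fireworks expects exactly this shape is an assumption.
    import json

    examples = [
        {"messages": [
            {"role": "user", "content": "Refactor this loop into a list comprehension."},
            {"role": "assistant", "content": "Here is the refactored version: ..."},
        ]},
        {"messages": [
            {"role": "user", "content": "Explain what this regex does: ^\\d{3}-\\d{4}$"},
            {"role": "assistant", "content": "It matches strings like 555-1234."},
        ]},
    ]

    with open("train.jsonl", "w") as f:
        for record in examples:
            f.write(json.dumps(record) + "\n")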

    On-demand Deployment


    On-demand deployments give you dedicated GPUs for Qwen3 Coder 30B A3B Instruct using Fireworks' reliable, high-performance system with no rate limits.

    Qwen3 Coder 30B A3B Instruct FAQs

    What is Qwen3-Coder 30B A3B Instruct and who developed it?

    Qwen3-Coder 30B A3B Instruct is a Mixture-of-Experts (MoE) instruction-tuned coding model developed by Qwen (Alibaba Group). It features 30.5 billion total parameters with 3.3 billion active per forward pass, and is trained for advanced code reasoning, agentic systems, and browser-integrated coding tasks.

    What applications and use cases does Qwen3-Coder 30B A3B Instruct excel at?

    The model is designed for:

    • Code assistance
    • Conversational AI
    • Agentic systems (e.g., Qwen Code, CLINE)
    • Search
    • Enterprise RAG
    • Text-based reasoning (the model is text-only; no multimodal input)

    What is the maximum context length for Qwen3-Coder 30B A3B Instruct?

    The model natively supports a context window of 262,144 tokens (262.1K).

    What is the usable context window for Qwen3-Coder 30B A3B Instruct?

    The full 262.1K token window is usable in on-demand deployments on Fireworks, which provide dedicated GPU access.

    What is the maximum output length Fireworks allows for Qwen3-Coder 30B A3B Instruct?

    The recommended output length is up to 65,536 tokens, constrained by the 262.1K total context limit.

    What are known failure modes of Qwen3-Coder 30B A3B Instruct?

    • Function calling is not supported on Fireworks
    • Streaming responses are not supported
    • Image input, embeddings, and rerankers are not supported
    • The model does not support “thinking mode” and ignores <think> tokens

    How many parameters does Qwen3-Coder 30B A3B Instruct have?

    • Total parameters: 30.5 billion
    • Active per forward pass: 3.3 billion
    • Experts: 128 total, with 8 activated per generated token (see the sketch below)
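
    A back-of-the-envelope sketch relating those published counts (illustrative arithmetic only, not the model's exact layer breakdown):

    # Illustrative arithmetic: relate the published parameter counts to the
    # expert configuration. Ignores the exact split between shared weights
    # (attention, embeddings, router) and expert weights.
    total_params = 30.5e9
    active_params = 3.3e9
    total_experts = 128
    active_experts = 8

    print(f"Active share of all parameters: {active_params / total_params:.1%}")  # ~10.8%
    print(f"Experts active per token: {active_experts / total_experts:.1%}")      # 6.2%
    # The active-parameter share exceeds the expert share because attention,
    # embedding, and router weights are shared and always active.
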
    Is fine-tuning supported for Qwen3-Coder 30B A3B Instruct?

    Yes. Fireworks supports LoRA-based fine-tuning for this model using its RFT (Reinforcement Fine-Tuning) infrastructure.

    How are tokens counted (prompt vs completion)?

    Fireworks charges for prompt and completion tokens combined; the total must stay within the 262.1K context limit.
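
    For example, the usage block on each completion response exposes both sides of the count (sketch assumes the OpenAI-compatible response shape, with response as returned by the quick-start call above):

    # Sketch: inspect token usage on a completion response.
    CONTEXT_LIMIT = 262_144  # model's total context window

    usage = response.usage
    total = usage.prompt_tokens + usage.completion_tokens
    print(f"prompt={usage.prompt_tokens} completion={usage.completion_tokens} total={total}")
    assert total <= CONTEXT_LIMIT, "request exceeded the model's context window"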

    What rate limits apply on the shared endpoint?

    • Serverless: Not supported
    • On-demand: Available with no rate limits via dedicated GPU deployments

    What license governs commercial use of Qwen3-Coder 30B A3B Instruct?

    The model is released under the Apache 2.0 license, which permits commercial use.

    Metadata

    State: Ready
    Created on: 8/1/2025
    Kind: Base model
    Provider: Qwen
    Hugging Face: Qwen3-Coder-30B-A3B-Instruct

    Specification

    Calibrated: Yes
    Mixture-of-Experts: Yes
    Parameters: 30.5B

    Supported Functionality

    Fine-tuning: Supported
    Serverless: Not supported
    Serverless LoRA: Not supported
    Context Length: 262.1K tokens
    Function Calling: Not supported
    Embeddings: Not supported
    Rerankers: Not supported
    Image input: Not supported