
Qwen3 235B A22B Instruct 2507

fireworks/qwen3-235b-a22b-instruct-2507

    An updated FP8 version of Qwen3-235B-A22B (non-thinking mode), with improved tool use, coding, instruction following, logical reasoning, and text comprehension capabilities.
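
    As a quick illustration, the sketch below queries the model through Fireworks' OpenAI-compatible chat completions endpoint. The request shape is the standard chat format; the exact model identifier for your account is an assumption here (this page lists the short form fireworks/qwen3-235b-a22b-instruct-2507), so copy the precise string from your dashboard if it differs.

```python
# Minimal sketch: query the model via Fireworks' OpenAI-compatible
# chat completions endpoint. The model string below is an assumption --
# copy the exact identifier from your Fireworks account if it differs.
import os
import requests

resp = requests.post(
    "https://api.fireworks.ai/inference/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"},
    json={
        "model": "accounts/fireworks/models/qwen3-235b-a22b-instruct-2507",
        "messages": [{"role": "user", "content": "Explain FP8 quantization in one paragraph."}],
        "max_tokens": 16384,  # recommended output ceiling (see FAQs below)
        "temperature": 0.7,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```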

    Qwen3 235B A22B Instruct 2507 API Features

    Fine-tuning

    Docs

    Qwen3 235B A22B Instruct 2507 can be customized with your data to improve responses. Fireworks uses LoRA to train and deploy your personalized model efficiently.
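
    For orientation, here is a hedged sketch of a training file in the JSONL chat format that LoRA fine-tuning pipelines typically expect; the field names are the common chat schema, not a confirmed Fireworks contract, so verify against the fine-tuning docs linked above.

```python
# Sketch of a supervised fine-tuning dataset in JSONL chat format.
# The "messages"/"role"/"content" fields follow the common chat schema;
# verify the exact format against the Fireworks fine-tuning docs.
import json

examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a concise support assistant."},
            {"role": "user", "content": "How do I rotate my API key?"},
            {"role": "assistant", "content": "Open Settings > API Keys and click Rotate."},
        ]
    },
    # ...more examples, written one JSON object per line
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")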

    On-demand Deployment

    Docs

    On-demand deployments give you dedicated GPUs for Qwen3 235B A22B Instruct 2507 using Fireworks' reliable, high-performance system with no rate limits.
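
    Because the Fireworks inference API is OpenAI-compatible, a dedicated deployment can be reached with the standard OpenAI Python client, as in this sketch; the API key and deployment-scoped model identifier are placeholders, so copy the real values from your deployment page.

```python
# Sketch: point the OpenAI client at Fireworks and call a dedicated
# deployment. Replace api_key and the model identifier with the values
# shown for your own deployment.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",  # placeholder
)

resp = client.chat.completions.create(
    model="accounts/fireworks/models/qwen3-235b-a22b-instruct-2507",  # or your deployment-scoped id
    messages=[{"role": "user", "content": "Ping"}],
)
print(resp.choices[0].message.content)
```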

    Qwen3 235B A22B Instruct 2507 FAQs

    What is Qwen3-235B-A22B-Instruct-2507 and who developed it?

    Qwen3-235B-A22B-Instruct-2507 is an instruction-tuned, non-thinking mode large language model developed by Alibaba’s Qwen team. It is a mixture-of-experts (MoE) model with 235 billion total parameters (22B active) and is optimized for reasoning, tool use, coding, and long-context tasks.

    What applications and use cases does Qwen3-235B-A22B-Instruct-2507 excel at?

    The model is designed for:

    • Complex reasoning (e.g., AIME25, HMMT25)
    • Instruction following and logic tasks
    • Coding (e.g., MultiPL-E, LiveCodeBench)
    • Long-context comprehension (supports up to 1M tokens)
    • Multilingual knowledge and creative writing

    What is the maximum context length for Qwen3-235B-A22B-Instruct-2507?

    The model supports a native context length of 262,144 tokens, and can be extended up to 1,010,000 tokens using Dual Chunk Attention and sparse attention mechanisms.

    What is the usable context window for Qwen3-235B-A22B-Instruct-2507?

    While the model supports up to 1M tokens, the recommended usable context for most tasks is up to 16,384 tokens, due to memory and latency considerations.

    Does Qwen3-235B-A22B-Instruct-2507 support quantized formats (4-bit/8-bit)?

    Yes. The model is available in FP8 quantized format, which improves inference speed and reduces memory usage.
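
    The memory saving is easy to estimate from first principles: FP8 stores one byte per weight versus two bytes for FP16/BF16. A back-of-the-envelope calculation:

```python
# Weights-only memory estimate for 235.1B parameters (KV cache and
# activations add more on top of this).
PARAMS = 235.1e9

for fmt, bytes_per_param in [("FP16/BF16", 2), ("FP8", 1)]:
    gib = PARAMS * bytes_per_param / 2**30
    print(f"{fmt}: ~{gib:,.0f} GiB")
# FP16/BF16: ~438 GiB
# FP8: ~219 GiB
```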

    What is the maximum output length Fireworks allows for Qwen3-235B-A22B-Instruct-2507?

    The recommended maximum output length is 16,384 tokens, aligned with guidance from the Qwen team for generation quality and stability.

    What are known failure modes of Qwen3-235B-A22B-Instruct-2507?

    Known challenges include:

    • VRAM-related issues when attempting 1M context inference without proper configuration
    • Slight performance tradeoffs in long contexts with sparse attention
    • Some reports of alignment inconsistencies in subjective tasks

    Does Qwen3-235B-A22B-Instruct-2507 support streaming responses and function-calling schemas?

    Yes. The model supports streaming generation and agentic tool use via Qwen-Agent, which provides built-in support for function-calling and tool integration through configurable MCP files.
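
    Against the Fireworks endpoint specifically, streaming and an OpenAI-style tool schema can be exercised as in the sketch below; the get_weather tool is a hypothetical example for illustration, and Qwen-Agent with MCP is a separate, richer integration path.

```python
# Sketch: streaming generation plus an OpenAI-style function-calling
# schema. The get_weather tool is hypothetical.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="YOUR_FIREWORKS_API_KEY",  # placeholder
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

stream = client.chat.completions.create(
    model="accounts/fireworks/models/qwen3-235b-a22b-instruct-2507",
    messages=[{"role": "user", "content": "What's the weather in Lisbon?"}],
    tools=tools,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:  # tool-call chunks carry no text content
        print(delta.content, end="", flush=True)
```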

    How many parameters does Qwen3-235B-A22B-Instruct-2507 have?

    The model has 235 billion total parameters, with 22 billion active per token using a Mixture-of-Experts architecture.
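
    To make the total-vs-active distinction concrete, here is a generic toy top-k router (not Qwen3's actual routing code): every token is scored against all experts, but only the top-k experts, a small fraction of the total parameters, actually run.

```python
# Toy softmax-gated top-k MoE layer, for illustration only.
import numpy as np

def moe_layer(x, expert_weights, router_weights, k=2):
    """x: (d,) token activation; expert_weights: (n_experts, d, d)."""
    logits = router_weights @ x            # (n_experts,) router scores
    top = np.argsort(logits)[-k:]          # indices of the k best experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                   # softmax over the selected experts
    # Only the k selected experts run; the rest of the parameters stay idle.
    return sum(g * (expert_weights[e] @ x) for g, e in zip(gates, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
y = moe_layer(rng.standard_normal(d),
              rng.standard_normal((n_experts, d, d)),
              rng.standard_normal((n_experts, d)))
print(y.shape)  # (8,)
```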

    How are tokens counted (prompt vs completion)?

    Token pricing is split between input and output (a worked cost estimate follows the list):

    • $0.22 per 1M input tokens
    • $0.88 per 1M output tokens
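
    Plugging these rates into a quick estimate:

```python
# Worked cost estimate at the listed rates.
INPUT_RATE, OUTPUT_RATE = 0.22, 0.88  # USD per 1M tokens

def cost_usd(input_tokens, output_tokens):
    return input_tokens / 1e6 * INPUT_RATE + output_tokens / 1e6 * OUTPUT_RATE

# e.g. a 50k-token prompt with a 2k-token completion:
print(f"${cost_usd(50_000, 2_000):.4f}")  # $0.0128
```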

    What license governs commercial use of Qwen3-235B-A22B-Instruct-2507?

    The model is released under the Apache 2.0 license, permitting commercial use with attribution.

    Metadata

    State: Ready
    Created on: 7/21/2025
    Kind: Base model
    Provider: Qwen
    Hugging Face: Qwen3-235B-A22B-Instruct-2507-FP8

    Specification

    Calibrated: Yes
    Mixture-of-Experts: Yes
    Parameters: 235.1B

    Supported Functionality

    Fine-tuning: Supported
    Serverless: Not supported
    Context length: 262.1k tokens
    Function calling: Supported
    Embeddings: Not supported
    Rerankers: Not supported
    Image input: Not supported