    FireAttention — Serving Open Source Models 4x faster than vLLM by quantizing with ~no tradeoffs