The GLM-4.5 series models are foundation models designed for intelligent agents. GLM-4.5 has 355 billion total parameters with 32 billion active parameters, while GLM-4.5-Air adopts a more compact design with 106 billion total parameters and 12 billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.
Fine-tuning: GLM-4.5-Air can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.
On-demand deployment: On-demand deployments give you dedicated GPUs for GLM-4.5-Air using Fireworks' reliable, high-performance system with no rate limits.
GLM-4.5-Air is a compact, open-source large language model developed by Zhipu AI. Part of the GLM-4.5 family and optimized for intelligent agent applications, it has 106 billion total parameters (12 billion active) and supports hybrid reasoning with two execution modes: "thinking" (for complex tasks) and "non-thinking" (for fast responses).
GLM-4.5-Air is designed for intelligent agent applications: its hybrid reasoning capabilities make it well suited to agent environments and real-world task planning.
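The two execution modes can be selected per request. A minimal sketch of assembling an OpenAI-compatible chat-completions payload follows; the model slug and the `thinking` toggle field are assumptions for illustration, so check the provider's API reference for the exact names:

```python
# Sketch: building an OpenAI-compatible chat-completions request body
# for GLM-4.5-Air. The model slug and the "thinking" toggle are
# ASSUMPTIONS for illustration, not confirmed field names.
import json

def build_request(prompt: str, thinking: bool) -> dict:
    """Assemble a chat-completions request body.

    thinking=True targets the slower, deliberate mode;
    thinking=False requests a fast, direct answer.
    """
    return {
        "model": "accounts/fireworks/models/glm-4p5-air",  # assumed slug
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical toggle for the hybrid reasoning modes.
        "extra_body": {"thinking": thinking},
        "max_tokens": 1024,
    }

body = build_request("Plan a three-step data backup strategy.", thinking=True)
print(json.dumps(body, indent=2))
```

The payload itself is provider-agnostic JSON, so the same structure can be sent to any OpenAI-compatible endpoint once the correct field names are substituted.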
The maximum context length for GLM-4.5-Air is 131,072 tokens (approximately 131k).
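A quick sanity check against that window can be sketched as follows; the token counts below are placeholders, and in practice they should come from the model's tokenizer:

```python
# Minimal sketch: checking that prompt + completion fits within
# GLM-4.5-Air's 131,072-token context window. Token counts here are
# illustrative; real counts come from the tokenizer.
MAX_CONTEXT = 131_072

def fits_in_context(prompt_tokens: int, max_new_tokens: int) -> bool:
    """Return True if the prompt plus the requested completion fits."""
    return prompt_tokens + max_new_tokens <= MAX_CONTEXT

print(fits_in_context(100_000, 30_000))  # True: 130,000 <= 131,072
print(fits_in_context(120_000, 20_000))  # False: 140,000 > 131,072
```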
Yes. The model lists 52 quantized variants, including 4-bit and 8-bit versions for efficient inference.
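To see why lower-precision variants matter, a back-of-envelope weight-memory estimate for the 106B-parameter model at different precisions can be sketched as below (this ignores the KV cache, activations, and quantization metadata, so real footprints are higher):

```python
# Back-of-envelope weight-storage estimate for GLM-4.5-Air (106B
# parameters) at several precisions. Weights only: KV cache,
# activations, and quantization overhead are NOT included.
PARAMS = 106e9

def weight_gb(bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_gb(bits):.0f} GB")  # 212, 106, 53 GB
```

The 4x reduction from 16-bit to 4-bit is what brings a model of this size within reach of smaller GPU configurations.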
GLM-4.5-Air has 106 billion total parameters and 12 billion active parameters. It uses a Mixture-of-Experts (MoE) architecture, so only the experts routed for each token are active during inference rather than the full parameter count.
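The total-versus-active split can be made concrete with a small calculation using the figures above:

```python
# Fraction of parameters active per token, from the figures above:
# 106B total parameters, 12B active.
TOTAL_PARAMS = 106e9
ACTIVE_PARAMS = 12e9

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"~{active_fraction:.1%} of parameters are active per token")  # ~11.3%
```

This is the core trade-off of the design: per-token compute scales with the 12B active parameters, while total capacity scales with the full 106B.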
Yes. Fine-tuning is supported on Fireworks: you can customize GLM-4.5-Air with your own data, and Fireworks uses LoRA to efficiently train and deploy the personalized model.
GLM-4.5-Air is released under the MIT license, which allows commercial use and secondary development.