Qwen 3.7 Plus is now available on Serverless, exclusively on Fireworks. Try it today.

Model Library
/Z.ai/GLM-4.6
model path:accounts/fireworks/models/glm-4p6

As the latest iteration in the GLM series, GLM-4.6 achieves comprehensive enhancements across multiple domains, including real-world coding, long-context processing, reasoning, searching, writing, and agentic applications.

GLM-4.6 API Features

Fine-tuning

Docs

GLM-4.6 can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model

On-demand Deployment

Docs

On-demand deployments allow you to use GLM-4.6 on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.

GLM-4.6 FAQs

What is GLM-4.6 and who developed it?

GLM-4.6 is the latest version in the GLM (General Language Model) series developed by Zhipu AI (Z.ai). It introduces enhancements in long-context reasoning, agentic behavior, code generation, and search capabilities. The model builds upon GLM-4.5, delivering improvements across multiple domains.

What applications and use cases does GLM-4.6 excel at?

GLM-4.6 is optimized for:

  • Code assistance
  • Conversational AI
  • Agentic systems
  • Search
  • Multimedia
  • Enterprise RAG (retrieval-augmented generation)
What is the maximum context length for GLM-4.6?

GLM-4.6 supports a context length of 202,752 tokens on Fireworks AI.

What is the usable context window for GLM-4.6?

Fireworks supports the full 202,752 tokens, but the model was benchmarked using up to 128K in evaluations.

Does GLM-4.6 support quantized formats (4-bit/8-bit)?

GLM-4.6 fully supports quantization, including 4-bit and 8-bit formats.

How many parameters does GLM-4.6 have?

GLM-4.6 has 357 billion parameters.

What rate limits apply on the shared endpoint?

GLM-4.6 runs on dedicated GPU infrastructure with no rate limits when deployed on-demand via Fireworks.

What license governs commercial use of GLM-4.6?

GLM-4.6 is released under the MIT License, allowing commercial use.

Metadata

State
Ready
Created on
10/1/2025
Kind
Base model
Provider
Z.ai
Hugging Face
zai-org/GLM-4.6

Specification

Calibrated
Yes
Mixture-of-Experts
Yes
Parameters
352B

Supported Functionality

Fine-tuning
Supported
Serverless
Not supported
Context Length
202k tokens
Function Calling
Supported
Embeddings
Not supported
Rerankers
Not supported
Support image input
Not supported