GLM-4.7 Flash API & Playground

GLM-4.7-Flash is a 30B-A3B MoE model. As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.

GLM-4.7 Flash API Features

On-demand Deployment

Docs

On-demand deployments allow you to use GLM-4.7 Flash on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.

Metadata

State

Ready

Created on

1/19/2026

Kind

Base model

Provider

Z.ai

Hugging Face

zai-org/GLM-4.7-Flash

Specification

Calibrated

Mixture-of-Experts

Yes

Parameters

31B

Supported Functionality

Fine-tuning

Not supported

Serverless

Not supported

Context Length

202k tokens

Function Calling

Not supported

Embeddings

Not supported

Rerankers

Not supported

Support image input

Not supported