GLM-4.7 is a next-generation general-purpose model optimized for coding, reasoning, and agentic workflows, delivering strong gains in multilingual software engineering, tool use, and complex problem solving. It introduces advanced thinking controls, including interleaved, preserved, and turn-level thinking, to improve stability on long-horizon, multi-turn tasks. You can explore these thinking modes on our API using the `reasoning_history` field. Learn more at https://docs.fireworks.ai/guides/reasoning
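Below is a minimal sketch of calling GLM-4.7 through Fireworks' OpenAI-compatible chat completions endpoint and setting the `reasoning_history` field described above. The model slug and the exact placement and accepted values of `reasoning_history` are assumptions; see the reasoning guide linked above for the authoritative request shape.

```python
# Sketch: query GLM-4.7 on Fireworks and request a thinking/reasoning mode.
# Assumptions: the model slug below and the value passed to `reasoning_history`
# are illustrative placeholders; check the model page and reasoning docs.
import os
import requests

API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

payload = {
    # Hypothetical model identifier; copy the exact slug from the model page.
    "model": "accounts/fireworks/models/glm-4p7",
    "messages": [
        {"role": "user", "content": "Plan the steps to migrate a Flask app to FastAPI."}
    ],
    # Assumed request-body field controlling how prior-turn reasoning is handled
    # (e.g. preserved vs. interleaved thinking) on multi-turn tasks.
    "reasoning_history": "preserved",
    "max_tokens": 1024,
}

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```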
| Option | Description |
| --- | --- |
| Fine-tuning (Docs) | GLM-4.7 can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
| Serverless (Docs) | Immediately run the model on pre-configured GPUs and pay per token. |
| On-demand Deployment (Docs) | On-demand deployments give you dedicated GPUs for GLM-4.7 using Fireworks' reliable, high-performance system with no rate limits. |