
Fireworks Blog


GLM 5.2 is live on Fireworks inference, day zero.

Qwen 3 Decoded
Model Releases
8/1/2025

Qwen3 Decoded: Choosing the Right Model For Your Task

Kimi K2 Deep Dive
Developer Experience
8/1/2025

Kimi K2: Deep Dive into model performance and use-cases

Fireworks AI Batch API
Model Releases
7/31/2025

Run bulk async workloads with Fireworks Batch API

Real-world leaderboard
Benchmarks
7/30/2025

Fireworks Real-World Benchmarks: Find the Best OSS Model for the Job

Introducing Vision-Language Model Fine-tuning
Model Releases
7/29/2025

Introducing Vision-Language Model Fine-tuning: Tailor VLMs to Your Domain

Notion
Case Studies
7/25/2025

How Notion Cuts Latency 4x and Scales Enterprise AI Workflows with Fireworks AI

QK-Clip
Developer Experience
7/22/2025

A Deep Dive into MLA training/inference difference and why QK-Clip from Kimi is such an elegant idea

Model Releases
7/22/2025

VibeRL: When AI Trains AI

Case Studies
7/17/2025

Sentient & Fireworks Powers Decentralized AI At Viral Scale

Fireworks Sagemaker
Model Releases
7/15/2025

Fireworks AI Now Supports Amazon SageMaker

Developer Experience
7/15/2025

Deep-dive into MuonClip: Fixing Attention Score Explosions in Transformer Training

Developer Experience
7/11/2025

Understanding Function Calling: The Bridge to Agentic AI

Using Model as Judge for Reward in Reinforcement Fine Tuning
Developer Experience
7/10/2025

Using Model-as-a-Judge for Reward in Reinforcement Fine Tuning

Flux Kontext on Fireworks
Model Releases
7/9/2025

Introducing FLUX.1 Kontext on Fireworks

Announcing Response API with MCP
Model Releases
6/22/2025

Unlock Your Tools: Fireworks Adds OpenAI-Response API with MCP Support (Beta)

Announcing Virtual Cloud on Fireworks AI
Model Releases
6/16/2025

Build for Scale with Fireworks Virtual Cloud (GA)

Announcing Updated 3D FireOptimizer
Model Releases
6/14/2025

3D FireOptimizer: Automating the Multi-Dimensional Tradeoffs in LLM Serving

Updated Supervised Fine Tuning
Model Releases
6/13/2025

Introducing Supervised Fine Tuning V2

Reinforcement fine tuning announcement
Model Releases
6/9/2025

Reinforcement Fine Tuning (Beta): Train expert open models to surpass closed frontier models

Fireworks AI Dev Day 2025 Wrapped
Company News
5/29/2025

Fireworks DevDay 2025 Wrapped

Independent benchmarking of Fireworks shows >250 tokens/second on DeepSeek V3
Model Releases
5/28/2025

FireAttention V4: Industry-Leading Latency and Cost Efficiency with FP4