OpenAI gpt-oss-120b API & Playground

What is gpt-oss-120b and who developed it?

gpt-oss-120b is an open-weight large language model developed by OpenAI and released on August 5, 2025. It is designed for high-performance reasoning, agentic tasks, and general-purpose applications.

What applications and use cases does gpt-oss-120b excel at?

gpt-oss-120b is optimized for:

Complex reasoning and structured problem-solving (especially with chain-of-thought)
Agentic workflows (tool use, web browsing, function calling)
Production-grade general-purpose tasks (e.g., coding, math, science)
Use cases that benefit from adjustable reasoning levels

What is the maximum context length for gpt-oss-120b?

128K tokens.

What is the usable context window for gpt-oss-120b?

The full 128K context is supported on Fireworks AI, though usable context depends on prompt length and model memory limits.

Does gpt-oss-120b support quantized formats?

Yes. The model supports quantized versions, including 8-bit and MXFP4 precision for the MoE layer.

What is the default temperature of gpt-oss-120b on Fireworks AI?

The default temperature of gpt-oss-120b is 0.7.

What is the maximum output length Fireworks allows for gpt-oss-120b?

100 tokens (default in code example), but can be adjusted via the max_tokens parameter.

Does gpt-oss-120b support streaming responses and function-calling schemas?

Yes. gpt-oss-120b supports agentic workflows including function calling (via schemas), tool use, and Harmony-format structured outputs.

How many parameters does gpt-oss-120b have?

117 billion total parameters, with 5.1 billion active parameters per forward pass (Mixture-of-Experts).

Is fine-tuning supported for gpt-oss-120b?

Yes. Fine-tuning is supported and available for gpt-oss-120b on Fireworks AI.

How are tokens counted (prompt vs completion)?

Fireworks AI charges per 1M tokens: $0.15 for input and $0.60 for output.

What license governs commercial use of gpt-oss-120b?

Apache 2.0 license — permissive for commercial use without restriction.

Fine-tuning Docs	OpenAI gpt-oss-120b can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model
Serverless Docs	Immediately run model on pre-configured GPUs and pay-per-token
On-demand Deployment Docs	On-demand deployments give you dedicated GPUs for OpenAI gpt-oss-120b using Fireworks' reliable, high-performance system with no rate limits.

OpenAI gpt-oss-120b

OpenAI gpt-oss-120b API Features

Fine-tuning

Serverless

On-demand Deployment

Available Serverless

Metadata

Specification

Supported Functionality