OpenAI gpt-oss-120b & 20b, open weight models designed for reasoning, agentic tasks, and versatile developer use cases is now available! Try Now

OpenAi Logo MArk

OpenAI gpt-oss-20b

gpt-oss-20b is a compact, open-weight language model optimized for low-latency and resource-constrained environments, including local and edge deployments. It shares the same Harmony training foundation and capabilities as 120B, with faster inference and easier deployment that is ideal for specialized or offline use cases, fast responsive performance, chain-of-thought output and adjustable reasoning levels, and agentic workflows.

Try Model

Fireworks Features

Serverless

gpt-oss-20b is available via Fireworks' serverless API, where you pay per token. There are several ways to call the Fireworks API, including Fireworks' Python client, the REST API, or OpenAI's Python client.

Learn More

On-demand Deployment

On-demand deployments allow you to use gpt-oss-20b on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.

Learn More

Info

Provider

OpenAI

Model Type

LLM

Context Length

128K

Serverless

Available

Fine-Tuning

Available

Pricing Per 1M Tokens Input/Output

$0.07 / $0.3