

OpenAI gpt-oss-20b

Welcome to the gpt-oss series, OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. gpt-oss-20b is the smaller model in the series, optimized for lower latency and for local or specialized use cases.

Try Model

Fireworks Features

Fine-tuning

OpenAI gpt-oss-20b can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model.

Learn More

Serverless

Immediately run the model on pre-configured GPUs and pay per token.

Learn More

On-demand Deployment

On-demand deployments give you dedicated GPUs for OpenAI gpt-oss-20b using Fireworks' reliable, high-performance system with no rate limits.

Learn More

FAQs

What is OpenAI gpt-oss-20b and who developed it?

OpenAI gpt-oss-20b is an open-weight 21.5B parameter model developed by OpenAI. It is part of the "gpt-oss" series, optimized for lower latency and local or specialized tasks. The model was trained using OpenAI's Harmony response format and supports configurable reasoning depth for agentic applications.

What applications and use cases does OpenAI gpt-oss-20b excel at?

gpt-oss-20b is designed for:

Function calling with schemas

Web browsing and browser automation

Agentic tasks

Chain-of-thought reasoning

Local and low-latency deployments

It is particularly suited for scenarios where developers need customization and transparency in reasoning processes.

What is the maximum context length for OpenAI gpt-oss-20b?

The maximum context length is 131,072 tokens on Fireworks AI.

Does OpenAI gpt-oss-20b support quantized formats (4-bit/8-bit)?

Yes. gpt-oss-20b supports 8-bit precision and was post-trained using MXFP4 quantization of the MoE weights, making it compatible with 16GB memory deployments.
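As a back-of-envelope check on the 16GB claim, the sketch below estimates the weight footprint under the assumption that MXFP4 stores 4-bit values plus per-block scales, for an effective ~4.25 bits per parameter; these are illustrative figures, not official numbers.

```python
# Rough weight-memory estimate for gpt-oss-20b under MXFP4 quantization.
# Assumption: ~4.25 effective bits/param (4-bit values + shared block scales).
TOTAL_PARAMS = 21.5e9     # total parameters (from the model card)
BITS_PER_PARAM = 4.25     # assumed effective MXFP4 footprint

weight_bytes = TOTAL_PARAMS * BITS_PER_PARAM / 8
weight_gib = weight_bytes / 2**30
print(f"~{weight_gib:.1f} GiB for weights")  # ~10.6 GiB, leaving headroom in 16 GB
```

The estimate lands around 10-11 GiB, which is consistent with the model fitting on 16GB devices with room left for activations and KV cache.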

Does OpenAI gpt-oss-20b support streaming responses and function-calling schemas?

Yes. The model natively supports function calling with defined schemas and is suitable for streaming scenarios, particularly when served through OpenAI-compatible APIs such as those provided by vLLM.
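To make the schema shape concrete, here is a minimal sketch of an OpenAI-style chat request with streaming enabled and one tool attached. The model id and the `get_weather` tool are assumptions for illustration; check your provider's model catalog for the exact id.

```python
import json

# Hypothetical request payload for an OpenAI-compatible chat endpoint.
request = {
    "model": "accounts/fireworks/models/gpt-oss-20b",  # assumed id, verify in catalog
    "stream": True,  # request token-by-token streaming
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

print(json.dumps(request, indent=2))
```

When the model decides to call the tool, the streamed response carries the function name and JSON arguments rather than plain text, and your code executes the tool and returns its result in a follow-up message.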

How many parameters does OpenAI gpt-oss-20b have?

The model has 21.5 billion parameters, of which 3.6 billion are active during inference (MoE architecture).
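The practical consequence of the MoE design is that per-token compute scales with the active parameters, not the total. A quick calculation from the figures above:

```python
TOTAL_B = 21.5   # total parameters, billions (from the model card)
ACTIVE_B = 3.6   # parameters active per token, billions

active_fraction = ACTIVE_B / TOTAL_B
print(f"{active_fraction:.1%} of parameters active per token")  # ~16.7%
```

Note this reduces compute per token, not memory: all 21.5B parameters still need to be resident (hence the quantization discussed above).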

Is fine-tuning supported for OpenAI gpt-oss-20b?

Yes. Fine-tuning is supported on Fireworks AI using LoRA.
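For intuition on what LoRA trains, here is a minimal NumPy sketch of the core idea: the base weight stays frozen and a low-rank update BA is learned on top. This is an illustration of the general technique, not Fireworks' implementation; the dimensions, rank, and scaling values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 8  # hidden size and LoRA rank (illustrative values)

W = rng.standard_normal((d, d))         # frozen base weight
A = rng.standard_normal((r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                    # trainable up-projection, zero-initialized
alpha = 16.0                            # LoRA scaling factor

def lora_forward(x):
    # Base path plus low-rank update, scaled by alpha / r.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d)
# With B zero-initialized, the adapted model starts identical to the base model.
assert np.allclose(lora_forward(x), W @ x)
```

Because only A and B are trained (2*d*r values here instead of d*d), the adapter is small enough to swap in and out cheaply at deploy time, which is what makes serving many personalized variants practical.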

What license governs commercial use of OpenAI gpt-oss-20b?

The model is released under the Apache 2.0 license, which permits free use, modification, and commercial deployment without patent restrictions.

Info

Provider

OpenAI

Model Type

LLM

Context Length

131,072

Serverless

Available

Fine-Tuning

Available

Pricing Per 1M Tokens Input/Output

$0.07 / $0.30
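Using the rates listed above, a small helper makes workload cost estimates straightforward; the example token counts are arbitrary.

```python
INPUT_PER_M = 0.07   # USD per 1M input tokens (rate from this page)
OUTPUT_PER_M = 0.30  # USD per 1M output tokens (rate from this page)

def cost_usd(input_tokens, output_tokens):
    # Linear pay-per-token pricing: tokens / 1M * rate, summed over directions.
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# Example workload: 2M input tokens and 0.5M output tokens.
print(f"${cost_usd(2_000_000, 500_000):.2f}")  # $0.29
```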