gpt-oss-20b is a compact, open-weight language model optimized for low-latency and resource-constrained environments, including local and edge deployments. It shares the same Harmony training foundation and capabilities as 120B, with faster inference and easier deployment that is ideal for specialized or offline use cases, fast responsive performance, chain-of-thought output and adjustable reasoning levels, and agentic workflows.
gpt-oss-20b is available via Fireworks' serverless API, where you pay per token. There are several ways to call the Fireworks API, including Fireworks' Python client, the REST API, or OpenAI's Python client.
Learn MoreOn-demand deployments allow you to use gpt-oss-20b on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.
Learn MoreOpenAI
128K
Available
Available
$0.07 / $0.3