gpt oss 120b is a high-performance, open-weight language model designed for production-grade, general-purpose use cases. It fits on a single H100 GPU, making it accessible without requiring multi-GPU infrastructure. Trained on the Harmony response format, it excels at complex reasoning and supports configurable reasoning effort, full chain-of-thought transparency for easier debugging and trust, and native agentic capabilities for function calling, tool use, and structured outputs.
gpt oss 120b is available via Fireworks' serverless API, where you pay per token. There are several ways to call the Fireworks API, including Fireworks' Python client, the REST API, or OpenAI's Python client.
Learn MoreOn-demand deployments allow you to use gpt oss 120b on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.
Learn MoreOpenAI
128K
Available
$0.15 / $0.6