
NVIDIA Nemotron 3 Super 120B A12B BF16 is a large language model trained by NVIDIA for agentic workflows, long-context reasoning, and tool use. It uses a hybrid Latent Mixture-of-Experts (LatentMoE) architecture with 120B total parameters and 12B active parameters, offers a configurable reasoning mode, and supports English, French, German, Italian, Japanese, Spanish, and Chinese.
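As a rough illustration of what the 120B total / 12B active figures imply, the sketch below computes the active-parameter fraction and the storage needed for the weights alone in BF16 (2 bytes per parameter). The constant and function names are hypothetical, chosen for this example. The estimate covers weights only and ignores KV cache, activations, and serving overhead, so treat it as a back-of-the-envelope figure rather than a hardware requirement.

```python
# Back-of-the-envelope sizing from the figures in the model description.
# Assumption: weights only; KV cache, activations, and runtime overhead are ignored.

TOTAL_PARAMS = 120_000_000_000   # 120B total parameters (from the description)
ACTIVE_PARAMS = 12_000_000_000   # 12B active parameters (from the description)
BYTES_PER_PARAM_BF16 = 2         # bfloat16 stores each value in 16 bits


def active_fraction(total_params: int, active_params: int) -> float:
    """Share of the total parameters that count as active."""
    return active_params / total_params


def weight_bytes(num_params: int, bytes_per_param: int) -> int:
    """Storage needed for the raw weights at the given precision."""
    return num_params * bytes_per_param


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    total_gb = weight_bytes(TOTAL_PARAMS, BYTES_PER_PARAM_BF16) / 1e9
    print(f"Active fraction: {frac:.0%}")
    print(f"BF16 weight storage: {total_gb:.0f} GB")
```

All parameters still have to be stored even though only a fraction is active, which is why the storage estimate uses the total count rather than the active count.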
On-demand Deployment: On-demand deployments let you run NVIDIA Nemotron 3 Super 120B A12B BF16 on dedicated GPUs using Fireworks' high-performance serving stack, with high reliability and no rate limits.