
NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. To disable the reasoning trace, include /no_think in your system prompt.
On-demand DeploymentDocs | On-demand deployments allow you to use NVIDIA Nemotron Nano 9B v2 on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits. |