
NVIDIA Nemotron 3 Nano Omni is an open multimodal model from NVIDIA for reasoning across text, images, video, and audio. Built on a hybrid Mixture-of-Experts architecture with 30B total / 3B active parameters. Reasoning is currently supported for text and image inputs only; pass enable_thinking: false in chat_template_kwargs for video and audio requests.
On-demand DeploymentDocs | On-demand deployments allow you to use NVIDIA Nemotron 3 Nano Omni 30B A3B on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits. |