NVIDIA Nemotron 3 Nano Omni 30B A3B API & Playground

NVIDIA Nemotron 3 Nano Omni is an open multimodal model from NVIDIA for reasoning across text, images, video, and audio. Built on a hybrid Mixture-of-Experts architecture with 30B total / 3B active parameters. Reasoning is currently supported for text and image inputs only; pass enable_thinking: false in chat_template_kwargs for video and audio requests.

NVIDIA Nemotron 3 Nano Omni 30B A3B API Features

On-demand Deployment

Docs

On-demand deployments allow you to use NVIDIA Nemotron 3 Nano Omni 30B A3B on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.

Metadata

State

Ready

Created on

4/27/2026

Kind

Base model

Provider

NVIDIA

Hugging Face

nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16

Specification

Calibrated

Mixture-of-Experts

Parameters

30B

Supported Functionality

Fine-tuning

Not supported

Serverless

Not supported

Context Length

262k tokens

Function Calling

Not supported

Embeddings

Not supported

Rerankers

Not supported

Support image input

Not supported