Gemma 4 31B IT NVFP4 API & Playground

Gemma 4 31B IT NVFP4 - NVIDIA 4-bit quantized variant of Google Gemma 4 31B Instruct for efficient inference

Gemma 4 31B IT NVFP4 API Features

On-demand Deployment

On-demand deployments allow you to use Gemma 4 31B IT NVFP4 on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.

Metadata

State

Ready

Created on

4/15/2026

Kind

Base model

Provider

NVIDIA

Hugging Face

nvidia/Gemma-4-31B-IT-NVFP4

Specification

Calibrated

Mixture-of-Experts

Parameters

31B

Supported Functionality

Fine-tuning

Not supported

Serverless

Not supported

Context Length

262k tokens

Function Calling

Supported

Embeddings

Not supported

Rerankers

Not supported

Support image input

Supported