Excited to launch a multi-year partnership bringing Fireworks to Microsoft Azure Foundry! Learn more

Fireworks AI on NVIDIA Accelerated Infrastructure

Fireworks AI on NVIDIA GPUs empowers you to build groundbreaking AI experiences leveraging the industry’s fastest inference engine and the most advanced, reliable GPUs


Fireworks NVIDIA Partnership

NVIDIA Nemotron: Optimized Open-Source Intelligence on Fireworks

NVIDIA Nemotron is a family of open AI models, engineered for high intelligence, compute efficiency, and deployment flexibility across a wide range of enterprise and developer workloads.

By leveraging cutting-edge MoE technologies and a massive 1M context length, Nemotron models deliver strong reasoning while maintaining high throughput and cost efficiency.

Deploy the latest NVIDIA Nemotron text and vision models on Fireworks with fully managed, high-performance inference.



NVIDIA Nemoclaw Integration

Autonomous AI agents are here, but deploying them securely at scale is challenging. Fireworks is collaborating with NVIDIA on NVIDIA NemoClaw, an open-source stack that simplifies running always-on agents safely. NemoClaw installs the NVIDIA OpenShell runtime to enforce policy-based security and privacy. As a day-0 inference provider, Fireworks delivers the speed and efficiency you need to deploy your agents immediately.

NVIDIA NIM Integration

Deploy and run models seamlessly using NVIDIA NIM inference microservices with Fireworks AI. NIM provides optimized inference containers that simplify deployment while maximizing performance on NVIDIA GPUs.

The Ultimate AI Platform

Fireworks AI and NVIDIA together deliver the ultimate platform for generative AI, empowering developers with.state-of-the-art GPU hardware and the industry's fastest inference engine. This combination delivers unmatched performance, reliability, and scalability. Whether you're building conversational AI, vision applications, or complex multimodal systems, Fireworks AI on NVIDIA GPUs provides the essential foundation for innovating at scale, and delivering exceptional AI experiences.