
Introducing Fireworks on Microsoft Foundry: Bringing Best-in-Class Open Model Inference to Azure

Microsoft and Fireworks Partnership

We are excited to announce the Public Preview of Fireworks AI on Microsoft Foundry, bringing our best-in-class, fast open-model serving directly into Azure. This partnership integrates Fireworks’ high-performance inference and state-of-the-art (SOTA) open models into the unified Microsoft Foundry platform and its already wide selection of models. By giving developers the fastest path to production-grade open models, this milestone ensures teams have one place to use any model and any framework, with enterprise‑grade controls to build and run AI applications and agents at scale.

Across industries, organizations are increasingly standardizing on open models to get greater control over performance, cost, customization, and the security and compliance needed for enterprise deployment. With open models, teams can choose the right architecture per workload, bring their own weights, and fine-tune for quality, latency, and cost without provider lock‑in.

Yet many organizations struggle to evaluate new models quickly, run trusted models reliably, and optimize inference at scale. Too often, that means building bespoke serving stacks, which slows innovation and limits a team's ability to compound improvements over time.

Microsoft Foundry and Fireworks are providing an enterprise-grade path to evaluate models and bring best-in-class open model inference directly into Azure.

Start Building with Fireworks on Microsoft Foundry Today

Fireworks AI Models in Microsoft Foundry: A Single Place for Open Models

Open models are evolving rapidly, and for many tasks, now approach frontier-class performance. Teams want the freedom to choose the best model per task, bring their own weights, and optimize live systems without retraining. Fireworks on Microsoft Foundry is designed for this reality.

Get best-in-class performance for SOTA open models. Fireworks leads on high-performance inference for open models. Its engine already runs at internet scale, processing 13T+ tokens daily, sustaining ~180K requests/sec, and generating 1,000+ tok/sec on large models, backed by leading benchmark performance on Artificial Analysis. This performance is now natively available in Microsoft Foundry.

Developers can log into Microsoft Foundry and access these open models from Fireworks AI today:

  • DeepSeek V3.2
  • Kimi K2.5
  • MiniMax M2.5
  • GLM-5
  • GPT-OSS 120B

Evaluate models faster with day‑zero access and support. Start building immediately with access to SOTA open-weight models from Fireworks AI through a single Azure endpoint, available as first‑party models in Microsoft Foundry.
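As a rough sketch of what calling such a serverless endpoint could look like, the snippet below assembles a standard chat-completions request using only the Python standard library. The endpoint URL, header names, API shape, and model identifier are illustrative placeholders, not confirmed values; consult the Microsoft Foundry documentation for your actual deployment details.

```python
import json
import urllib.request


def build_chat_request(endpoint: str, api_key: str, model: str,
                       prompt: str) -> urllib.request.Request:
    """Build a chat-completions HTTP request for a serverless endpoint.

    Everything here (URL path, auth header, payload fields) follows the
    common chat-completions convention and is an assumption, not the
    documented Foundry contract.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        url=f"{endpoint}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request(
        "https://<your-resource>.services.ai.azure.com/models",  # placeholder
        "<api-key>",                 # placeholder credential
        "gpt-oss-120b",              # one of the models listed above
        "Summarize the benefits of open-model inference.",
    )
    # Sending the request requires a real endpoint and key:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Because the request is just JSON over HTTPS, the same sketch adapts to any SDK or language your team already uses.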

Run the models you already trust. With bring-your-own-weights (BYOW), you can upload and register quantized or fine‑tuned weights trained elsewhere without changing your serving stack.

Optimize for quality × latency × cost at inference time. Requests to Fireworks‑backed models are served by Fireworks’ high‑throughput inference stack, combining fast performance with cost efficiency and Azure‑grade governance.

Choose the right pricing model for your workload. Use serverless, pay-per‑token inference to experiment quickly with off-the-shelf models, or choose Provisioned Throughput Units (PTUs) for predictable, steady-state performance with base or custom models. Whether you’re optimizing for agility or efficiency, you get flexibility without managing infrastructure.
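To make the serverless-versus-PTU decision concrete, here is a back-of-the-envelope break-even sketch. All numbers are made-up placeholders, not actual Azure or Fireworks pricing; substitute your own negotiated rates.

```python
def ptu_break_even_tokens(ptu_monthly_cost: float,
                          price_per_million_tokens: float) -> float:
    """Return the monthly token volume above which a fixed-cost PTU
    reservation becomes cheaper than serverless pay-per-token pricing.

    Both inputs are hypothetical illustration values.
    """
    return ptu_monthly_cost / price_per_million_tokens * 1_000_000


# Example with made-up numbers: a $5,000/month reservation vs
# $0.50 per million serverless tokens.
break_even = ptu_break_even_tokens(5_000, 0.50)
print(f"PTUs pay off above {break_even:,.0f} tokens/month")
# → PTUs pay off above 10,000,000,000 tokens/month
```

In this hypothetical, steady workloads above roughly 10B tokens per month would favor provisioned capacity, while bursty or exploratory workloads stay on serverless.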

Operate with enterprise trust and scale. We are committed to enabling customers to build production-ready AI applications quickly while maintaining the highest levels of safety and security. Microsoft Foundry provides an end-to-end workspace for agent development, evaluation, and deployment, including unified governance, observability, and agent-ready tooling.

Real-world Impact

Whether they are building developer tools, enterprise assistants, agent‑driven workflows, or consumer‑scale AI experiences, digital natives and enterprises are looking for ways to move faster without losing control, to manage costs as usage grows, and to run AI with the same rigor they expect from any mission‑critical platform. Fireworks AI on Microsoft Foundry is helping teams meet that moment, enabling them to translate ambition into impact and build AI systems designed to grow, adapt, and endure. Here’s how one customer is translating that scale‑with‑control approach into impact.

Looking Ahead

This partnership isn’t just about adding new models to Microsoft Foundry; it’s about giving teams better tools to build, run, and scale impactful AI use cases. Coming soon, our roadmap brings fine‑tuning capabilities for Fireworks AI models into Microsoft Foundry, delivering the best end‑to‑end destination for open‑source model customization and deployment. We’re looking forward to seeing how developers and enterprises use Fireworks on Foundry to power the next generation of intelligent applications.

Getting Started

Select Fireworks AI open models are now available via a serverless endpoint through Microsoft Foundry Models. Ready to explore Fireworks open models in Foundry? Start building with Fireworks on Microsoft Foundry today. Check out some additional tutorials here.

Get started with Bring-Your-Own-Weights (BYOW) with Microsoft Learn Docs.
