
Fireworks AI Now Supports Amazon SageMaker


We’re thrilled to announce that Fireworks AI now offers Amazon SageMaker as a Bring Your Own Compute (BYOC) deployment option.

This integration allows developers and enterprise ML teams to train models using SageMaker, and leverage Fireworks’ high-performance, low-latency inference platform for model serving — all within their existing AWS environment.

Why This Matters

As organizations embrace Generative AI at scale, they’re hitting the same roadblocks: training and experimentation in SageMaker are seamless, but production-grade inference demands a custom serving platform, performance tuning, and ongoing cost management.

That’s where Fireworks comes in.

Fireworks is the fastest inference and AI platform, enabling customers to build magical AI applications. Fireworks offers:

  • Blazing-fast response times (low P99 latencies)
  • Significant cost reductions
  • Support for a wide array of LLMs, VLMs, embedding, audio, video, and image models

Now, with Amazon SageMaker as a deployment option, customers get all of these benefits within their own AWS environment.

What You Can Do With It

With Fireworks' deployment on Amazon SageMaker, you can:

  • Train models in SageMaker
  • Register your model in SageMaker Model Registry
  • Deploy your trained model using Fireworks’ optimized endpoints
  • Monitor your deployment with Amazon CloudWatch and AWS CloudTrail

All of this while retaining full control over your data, compliance boundaries, and AWS resource governance.


How It Works

Train models separately on SageMaker

  • You can use SageMaker’s built-in training capabilities to train and tune your models before serving them through the Fireworks container.
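A training run like this can be sketched as a `create_training_job` request to the SageMaker API. This is a minimal, hypothetical example: every name, ARN, image URI, and S3 path below is a placeholder you would replace with your own values.

```python
# Hypothetical sketch of a SageMaker training job request (boto3-style).
# All names, ARNs, image URIs, and S3 paths are placeholders.
training_job = {
    "TrainingJobName": "llm-finetune-demo",
    "RoleArn": "arn:aws:iam::<account-id>:role/SageMakerExecutionRole",
    "AlgorithmSpecification": {
        "TrainingImage": "<your-training-image-uri>",  # your training container
        "TrainingInputMode": "File",
    },
    "InputDataConfig": [{
        "ChannelName": "train",
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/data/train/",
        }},
    }],
    # Trained model artifacts land here, ready for the Fireworks container.
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/models/"},
    "ResourceConfig": {
        "InstanceType": "ml.p4d.24xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 200,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 86400},
}

# With AWS credentials configured, you would submit it like so:
# import boto3
# boto3.client("sagemaker").create_training_job(**training_job)
```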

Save the trained model in S3

  • You can deploy any custom fine-tuned or base open-source model.
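SageMaker conventionally expects model artifacts in S3 as a `model.tar.gz` archive. Here is a minimal sketch of packaging a local weights directory that way; the helper name and all paths are illustrative, not part of the Fireworks integration.

```python
import pathlib
import tarfile

# Hypothetical helper: bundle a directory of model files into model.tar.gz,
# the archive layout SageMaker conventionally expects in S3.
def package_model(model_dir: str, out_path: str) -> str:
    with tarfile.open(out_path, "w:gz") as tar:
        for f in pathlib.Path(model_dir).iterdir():
            tar.add(f, arcname=f.name)  # flatten files into the archive root
    return out_path

# Upload the archive (with AWS credentials configured):
# import boto3
# boto3.client("s3").upload_file(
#     "model.tar.gz", "my-bucket", "models/model.tar.gz")
```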

Deploy Fireworks as a container on SageMaker

  • The container will load and serve models that you’ve uploaded to an S3 bucket.
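This step can be sketched as the standard SageMaker model-and-endpoint setup. The container image URI below is a placeholder (the actual image would come from Fireworks), and the S3 path, role ARN, and instance type are assumptions for illustration.

```python
# Hypothetical sketch: registering a serving container as a SageMaker model
# and defining an endpoint configuration. The Fireworks image URI, role ARN,
# and S3 path are placeholders.
model_spec = {
    "ModelName": "fireworks-llm",
    "ExecutionRoleArn": "arn:aws:iam::<account-id>:role/SageMakerExecutionRole",
    "PrimaryContainer": {
        "Image": "<fireworks-container-image-uri>",        # provided by Fireworks
        "ModelDataUrl": "s3://my-bucket/models/model.tar.gz",  # your artifacts
    },
}
endpoint_config = {
    "EndpointConfigName": "fireworks-llm-config",
    "ProductionVariants": [{
        "VariantName": "AllTraffic",
        "ModelName": model_spec["ModelName"],
        "InstanceType": "ml.g5.12xlarge",  # example GPU instance
        "InitialInstanceCount": 1,
    }],
}

# With AWS credentials configured:
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_model(**model_spec)
# sm.create_endpoint_config(**endpoint_config)
# sm.create_endpoint(
#     EndpointName="fireworks-llm-endpoint",
#     EndpointConfigName=endpoint_config["EndpointConfigName"])
```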

Expose the model via a SageMaker endpoint

  • Once deployed, your application or end users can interact with the model through a real-time SageMaker Endpoint.
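An invocation might look like the sketch below. The endpoint name is a placeholder, and the chat-style request schema is an assumption for illustration; the actual payload format is defined by the Fireworks container.

```python
import json

# Hypothetical request to a real-time SageMaker endpoint serving an LLM.
# The payload schema and endpoint name are assumptions, not a documented API.
payload = {
    "model": "my-finetuned-llm",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 128,
}
body = json.dumps(payload)

# With AWS credentials configured:
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(
#     EndpointName="fireworks-llm-endpoint",  # placeholder endpoint name
#     ContentType="application/json",
#     Body=body,
# )
# print(json.loads(response["Body"].read()))
```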

Built for Enterprise AI Teams

We designed this feature with enterprise needs in mind. That means:

  • No data leaves your cloud — meet internal compliance & security requirements
  • Full AWS billing integration — use AWS Marketplace or consolidated billing
  • Granular IAM & VPC support — align with your existing infrastructure

Whether you're deploying an open-source LLM or scaling a fine-tuned model to thousands of users, Fireworks BYOC ensures your workloads are fast, reliable, and cost-efficient.

Get Started Today

Fireworks deployment on Amazon SageMaker is available now in private preview.
Want early access? Request access here.

Final Thoughts

This launch is a big step toward our mission: making world-class AI infrastructure accessible and scalable for everyone. By integrating deeply with Amazon SageMaker, we're bridging the gap between model development and high-performance inference.

We can't wait to see what you build!