The production AI platform

Fireworks partners with the world's leading generative AI researchers to serve the best models, at the fastest speeds.

Meta Llama 3

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes.

Trusted for empowering AI-driven production workflows

Models curated and optimized by Fireworks

The fastest and most uncompromising AI platform!

[Chart: inference speed in tokens / second for Fireworks AI, the next provider, and the average provider]

Industry Leading Performance

Independently benchmarked as the fastest of all inference providers

Enterprise Scale Throughput

Our proprietary stack blows open-source options out of the water (see blog)

FireLLaVA: the first commercially permissive OSS LLaVA model

State-of-the-art Models

Use powerful models curated by Fireworks or our in-house trained multi-modal and function-calling models

0 Billion+
tokens served in a day

Battle Tested for Reliability

Fireworks is the second-most-used open-source model provider and also generates over 1M images/day

fetch("", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer <API KEY>",
  },
  body: JSON.stringify({
    model: "accounts/fireworks/mixtral-8x7b",
    prompt: "Say this is a test",
    max_tokens: 700,
  }),
})

Built for Developers

Our OpenAI-compatible API makes it easy to start building with Fireworks!
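
As a rough sketch of what "OpenAI-compatible" could look like in practice, the fetch call shown earlier can be wrapped in a small helper that only builds the request and leaves sending it to the caller. The helper name `buildCompletionRequest`, the `baseUrl` parameter, the placeholder host, and the `/completions` path are all assumptions made for illustration (the path follows the OpenAI text-completions convention); the page itself does not state the endpoint URL.

```javascript
// Hypothetical helper (not part of any Fireworks SDK): builds the arguments
// for fetch() without performing a network call.
// `baseUrl` is supplied by the caller; the endpoint is not given in the source.
function buildCompletionRequest(baseUrl, apiKey, { model, prompt, maxTokens = 700 }) {
  if (!apiKey) {
    throw new Error("An API key is required");
  }
  return {
    // Assumed OpenAI-style path for text completions.
    url: `${baseUrl}/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      // Same body fields as the snippet above: model, prompt, max_tokens.
      body: JSON.stringify({ model, prompt, max_tokens: maxTokens }),
    },
  };
}

// Usage with a placeholder host; substitute the real base URL and key.
const { url, init } = buildCompletionRequest("https://example.invalid/v1", "<API KEY>", {
  model: "accounts/fireworks/mixtral-8x7b",
  prompt: "Say this is a test",
});
// To send the request: const response = await fetch(url, init);
```

Keeping request construction separate from the network call makes the headers and body easy to inspect or unit test before any tokens are spent.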

Level up with Fireworks AI Enterprise

Get dedicated deployments for your models to ensure uptime and speed


Fireworks is proudly HIPAA and SOC 2 compliant and offers secure VPC and VPN connectivity

Meet your data privacy needs: you own your data and your models

© 2024 Fireworks AI. All rights reserved.