Choose the plan that's right for you


Powerful speed and reliability to start your project

100 requests/min rate limit
Up to 100 deployed models
Custom PEFT add-ons
Pay per usage


A plan that scales with your production usage

Everything from the Developer plan
Custom rate limits
Team collaboration features
API telemetry and metrics
Dedicated email support


Personalized configurations for serving at scale

Everything from the Business plan
Custom pricing
Unlimited rate limits
Unlimited deployed models
Custom base models
Dedicated and self-hosted deployments
Specialized enterprise support

Text Models

Base model parameter count    $/1M tokens
up to 16B                     $0.20
16.1B - 80B                   $0.90
Mixtral 8x7B                  $0.50

Per-token pricing is applied only for non-enterprise deployments. Contact us for dedicated deployment pricing options.
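
As a rough illustration of the per-token pricing above, the sketch below estimates a bill from a token count. The function names are hypothetical, and it assumes prompt and completion tokens are billed at the same listed rate and that any model above 16B parameters falls in the 16.1B - 80B tier; neither point is stated on this page.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the Text Models table.
PRICE_UP_TO_16B = Decimal("0.20")
PRICE_16B_TO_80B = Decimal("0.90")
PRICE_MIXTRAL_8X7B = Decimal("0.50")


def rate_for_params(params_billions: float) -> Decimal:
    """Pick the per-1M-token rate for a base model by parameter count.

    Assumption: anything above 16B and up to 80B uses the second tier.
    Mixtral 8x7B has its own rate and should use PRICE_MIXTRAL_8X7B directly.
    """
    if params_billions <= 16:
        return PRICE_UP_TO_16B
    if params_billions <= 80:
        return PRICE_16B_TO_80B
    raise ValueError("no per-token price is listed above 80B parameters")


def text_cost(total_tokens: int, rate_per_million: Decimal) -> Decimal:
    """Estimated cost in USD for total_tokens at the given rate."""
    return Decimal(total_tokens) * rate_per_million / Decimal(1_000_000)


# 2.5M tokens on a 70B model: 2.5 * $0.90 = $2.25
estimate = text_cost(2_500_000, rate_for_params(70))
```

Decimal is used instead of float so that small per-token amounts do not pick up rounding noise when summed.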

Image Models
SDXL, $/step    SDXL w/ ControlNet, $/step

For image generation models like SDXL we charge based on the number of inference steps (denoising iterations).
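
Step-based billing can be sketched as a simple product. The per-step prices are not shown in this section, so the example takes the price as a parameter; the value used in the usage line is a made-up placeholder, not a real price. It also assumes each generated image is billed for its own steps.

```python
from decimal import Decimal


def image_cost(num_images: int, steps_per_image: int, price_per_step: Decimal) -> Decimal:
    """Estimated cost in USD: total denoising steps times the per-step price."""
    total_steps = num_images * steps_per_image
    return Decimal(total_steps) * price_per_step


# Placeholder price for illustration only; look up the actual $/step value.
HYPOTHETICAL_PRICE_PER_STEP = Decimal("0.001")
example = image_cost(num_images=4, steps_per_image=30,
                     price_per_step=HYPOTHETICAL_PRICE_PER_STEP)
```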


For multi-modal models like LLaVA, each image is billed as 576 prompt tokens.
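
A minimal sketch of how the 576-token image charge affects a request's billed token count. The function name is hypothetical, and adding completion tokens to the same total is an assumption; this page only states how images are counted.

```python
# Each image counts as this many prompt tokens, per the note above.
IMAGE_PROMPT_TOKENS = 576


def billed_tokens(text_prompt_tokens: int, num_images: int,
                  completion_tokens: int = 0) -> int:
    """Total billed tokens for one multi-modal request.

    Assumption: completion tokens are billed alongside prompt tokens.
    """
    image_tokens = num_images * IMAGE_PROMPT_TOKENS
    return text_prompt_tokens + image_tokens + completion_tokens


# 100 text tokens + 2 images (2 * 576 = 1152) + 50 completion tokens = 1302
total = billed_tokens(text_prompt_tokens=100, num_images=2, completion_tokens=50)
```

The resulting total can then be multiplied by the applicable $/1M tokens rate for the model.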


© 2024 Fireworks AI. All rights reserved.