Pricing to seamlessly scale from idea to enterprise
Developer
Powerful speed and reliability to start your project
Enterprise
Personalized configurations for serving at scale
Fireworks is fully pay-as-you-go, apart from enterprise deals. We offer several pay-as-you-go products, including serverless text model inference, image generation, fine-tuning, and on-demand private GPU inference. Spending on all offerings, including credit-based spending, counts toward your spending limit, which is based on your historical usage.
Per-token pricing is applied only for serverless inference. See below for on-demand deployment pricing.
LoRA models deployed to our serverless inference service are charged at the same rate as the underlying base model. There is no additional cost for serving LoRA models.
For image generation models, like SDXL, we charge based on the number of inference steps (denoising iterations). All image generation models besides SD3 are priced identically. SD3 uses pricing from Stability AI.
For multi-modal models like LLaVA, each image is billed as 576 prompt tokens.
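As a rough illustration of the image billing rule above, the sketch below adds 576 prompt tokens per image to a request's text prompt tokens. The function name and the assumption that text and image tokens simply add together are ours, not part of the Fireworks API.

```python
# Tokens billed per image, as stated in the pricing text.
IMAGE_PROMPT_TOKENS = 576


def billable_prompt_tokens(text_tokens: int, num_images: int) -> int:
    """Estimate billed prompt tokens for a multi-modal request.

    Assumption: image tokens are simply added to the text prompt tokens.
    """
    if text_tokens < 0 or num_images < 0:
        raise ValueError("counts must be non-negative")
    return text_tokens + num_images * IMAGE_PROMPT_TOKENS


# 120 text tokens plus two images: 120 + 2 * 576
print(billable_prompt_tokens(text_tokens=120, num_images=2))  # 1272
```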
For the Whisper speech recognition model, we bill per second of audio input at a rate of $0.004 per minute.
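The per-second billing at a per-minute rate can be worked through with a small sketch. It assumes the audio duration is already expressed in whole seconds; how partial seconds are rounded is not stated here, so the sketch does not model it. The function name is illustrative.

```python
from decimal import Decimal

# Rate from the pricing text: $0.004 per minute of audio.
WHISPER_RATE_PER_MINUTE = Decimal("0.004")


def whisper_cost(audio_seconds: int) -> Decimal:
    """Estimate the charge in dollars for a clip of the given length."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    # Convert the per-minute rate to a per-second charge.
    return Decimal(audio_seconds) * WHISPER_RATE_PER_MINUTE / Decimal(60)


print(whisper_cost(90))    # 90 seconds of audio -> 0.006
print(whisper_cost(3600))  # one hour of audio
```

Using `Decimal` rather than floats keeps small per-second amounts exact when many clips are summed.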
Embedding model pricing is based on the number of input tokens processed by the model.
Fine-tuning is billed by the total number of tokens processed during training (dataset size in tokens * number of epochs). Fireworks charges only for tuning itself: there is no additional cost to deploy fine-tuned models, and inference costs are the same as for the base model.
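The fine-tuning formula can be sketched as follows. The per-million-token price is a placeholder parameter, since no rate is given in this section, and the function name is hypothetical.

```python
def fine_tuning_cost(dataset_tokens: int, epochs: int,
                     price_per_million_tokens: float) -> tuple[int, float]:
    """Return (billed_tokens, estimated_cost_in_dollars).

    price_per_million_tokens is a placeholder; substitute the current rate.
    """
    if dataset_tokens < 0 or epochs < 1:
        raise ValueError("dataset_tokens must be >= 0 and epochs >= 1")
    # Billed tokens = dataset size * number of epochs.
    billed_tokens = dataset_tokens * epochs
    cost = billed_tokens / 1_000_000 * price_per_million_tokens
    return billed_tokens, cost


# A 2M-token dataset trained for 3 epochs at a made-up $0.50 per 1M tokens.
tokens, cost = fine_tuning_cost(2_000_000, 3, price_per_million_tokens=0.50)
print(tokens, cost)  # 6000000 3.0
```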
On-demand deployments are billed by GPU-second using the above rates. Pricing scales linearly when using multiple GPUs. Users do not pay for start-up times.
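A minimal sketch of GPU-second billing under the rules above: cost scales linearly with GPU count, and start-up time is excluded. The per-GPU-second rate is a hypothetical input (use the published rate for your GPU type), and the function and parameter names are illustrative.

```python
from decimal import Decimal


def on_demand_cost(rate_per_gpu_second: Decimal, num_gpus: int,
                   wall_seconds: int, startup_seconds: int = 0) -> Decimal:
    """Estimate the charge for an on-demand deployment.

    rate_per_gpu_second is a placeholder for the published GPU rate.
    Start-up time is subtracted because it is not billed.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    billed_seconds = max(wall_seconds - startup_seconds, 0)
    # Linear scaling: each GPU is billed for every billed second.
    return rate_per_gpu_second * num_gpus * billed_seconds


# Two GPUs, 3700 s total of which 100 s was start-up, at a made-up rate.
print(on_demand_cost(Decimal("0.001"), num_gpus=2,
                     wall_seconds=3700, startup_seconds=100))  # 7.200
```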
Spending limits restrict how much you can spend on the Fireworks platform per calendar month. The spending limit is determined by your total historical Fireworks spend. You can purchase prepaid credits to immediately increase your historical spend. Visit our FAQ for answers to common billing questions.
Note: Credits are counted against your spending limit, so it is possible to hit the spending limit before all of your current credits are depleted.
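To make the note above concrete, here is a toy model of an account whose credit-funded spending still counts toward the monthly limit. The `Account` class, the rule that credits are drawn down first, and the exact rejection behavior at the limit are all assumptions for illustration, not a description of how Fireworks implements billing.

```python
from dataclasses import dataclass


@dataclass
class Account:
    spending_limit: int       # monthly limit in dollars
    credit_balance: int = 0   # prepaid credits in dollars
    month_spend: int = 0      # all spend this month, credits included

    def charge(self, amount: int) -> bool:
        """Apply a charge; return False if it would exceed the limit."""
        if self.month_spend + amount > self.spending_limit:
            return False
        self.month_spend += amount
        # Assumption: credits are used before any other payment method.
        self.credit_balance -= min(amount, self.credit_balance)
        return True


acct = Account(spending_limit=50, credit_balance=100)
print(acct.charge(50))      # True: within the limit, paid from credits
print(acct.charge(1))       # False: limit reached
print(acct.credit_balance)  # 50 credits remain unspent
```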