To review current serverless pricing for our most popular models, including the Turbo and Priority tiers, please visit the Docs page.
Details for each model are also available on its listing page in the model library.
| Base model (parameter count or name) | $ / 1M input tokens |
|---|---|
| up to 150M | $0.008 |
| 150M - 350M | $0.016 |
| Qwen3 8B | $0.1 |
Prices in the following table are in $ per 1M training tokens.
| Base Model | LoRA SFT | LoRA DPO | Full Param SFT | Full Param DPO |
|---|---|---|---|---|
| Models up to 16B parameters | $0.50 | $1.00 | $1.00 | $2.00 |
| Models 16.1B - 80B | $3.00 | $6.00 | $6.00 | $12.00 |
| Models 80B - 300B (e.g. Qwen3-235B, gpt-oss-120B) | $6.00 | $12.00 | $12.00 | $24.00 |
| Models >300B (e.g. DeepSeek V3, Kimi K2) | $10.00 | $20.00 | $20.00 | $40.00 |
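As a rough illustration of how the per-token rates above turn into a job cost, here is a minimal Python sketch. The function name `fine_tuning_cost`, the method keys, and the choice to treat each tier's upper bound as inclusive are assumptions made for this example, not part of any official API or billing rule.

```python
# Prices in USD per 1M training tokens, copied from the table above.
# Each entry: (upper bound on base-model size in billions of parameters, prices by method).
# Treating each upper bound as inclusive is an assumption of this sketch.
FINE_TUNING_PRICES = [
    (16.0, {"lora_sft": 0.50, "lora_dpo": 1.00, "full_sft": 1.00, "full_dpo": 2.00}),
    (80.0, {"lora_sft": 3.00, "lora_dpo": 6.00, "full_sft": 6.00, "full_dpo": 12.00}),
    (300.0, {"lora_sft": 6.00, "lora_dpo": 12.00, "full_sft": 12.00, "full_dpo": 24.00}),
    (float("inf"), {"lora_sft": 10.00, "lora_dpo": 20.00, "full_sft": 20.00, "full_dpo": 40.00}),
]


def fine_tuning_cost(params_billions: float, training_tokens: int, method: str) -> float:
    """Estimate the cost in USD of a fine-tuning job (hypothetical helper)."""
    for upper_bound, prices in FINE_TUNING_PRICES:
        if params_billions <= upper_bound:
            if method not in prices:
                raise ValueError(f"unknown method: {method!r}")
            # Price is quoted per 1M training tokens.
            return prices[method] * training_tokens / 1_000_000
    raise ValueError("no matching price tier")


if __name__ == "__main__":
    # 50M training tokens, LoRA SFT on an 8B-parameter base model.
    print(f"${fine_tuning_cost(8, 50_000_000, 'lora_sft'):.2f}")
```

Under these assumptions, 50M training tokens of LoRA SFT on an 8B model comes to 50 x $0.50.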
Reinforcement fine-tuning (RFT) jobs are priced per GPU hour, billed per second, at the same rates as Fireworks on-demand deployments. See the GPU prices below for details on RFT pricing.
| GPU Type | Price ($) per hour |
|---|---|
| H100 80 GB GPU | $7.00 |
| H200 141 GB GPU | $7.00 |
| B200 180 GB GPU | $10.00 |
| B300 288 GB GPU | $12.00 |
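To show how an hourly rate billed per second translates into a charge, the sketch below prorates the table's prices over a run time given in seconds. The helper name `gpu_cost` and the short GPU keys are hypothetical. The sketch also assumes the listed price applies per GPU and does not model any rounding the billing system may apply.

```python
# Hourly prices in USD, copied from the table above.
# Assumption of this sketch: the price applies to each GPU individually.
GPU_HOURLY_PRICES = {
    "H100": 7.00,
    "H200": 7.00,
    "B200": 10.00,
    "B300": 12.00,
}

SECONDS_PER_HOUR = 3600


def gpu_cost(gpu_type: str, num_gpus: int, seconds: int) -> float:
    """Estimate the cost in USD of running num_gpus GPUs for a number of seconds."""
    if gpu_type not in GPU_HOURLY_PRICES:
        raise ValueError(f"unknown GPU type: {gpu_type!r}")
    hourly_price = GPU_HOURLY_PRICES[gpu_type]
    # Per-second billing: prorate the hourly rate over the elapsed seconds.
    return hourly_price * num_gpus * seconds / SECONDS_PER_HOUR


if __name__ == "__main__":
    # 8 H100 GPUs for 90 minutes (5400 seconds).
    print(f"${gpu_cost('H100', 8, 5400):.2f}")
```

Because billing is per second, a job that stops after 90 minutes is charged for 1.5 GPU hours per GPU rather than being rounded up to 2 hours.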