Latest Qwen3 state of the art model, FP8 version 8B Model
Fine-tuningDocs | Qwen3 8B can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model |
ServerlessDocs | Qwen3 8B is available via Fireworks' serverless API, where you pay per token. There are several ways to call the Fireworks API, including Fireworks' Python client, the REST API, or OpenAI's Python client. |
On-demand DeploymentDocs | On-demand deployments allow you to use Qwen3 8B on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits. |
Run queries immediately, pay only for usage