Introduction
There are several ways to query models on Fireworks:
- The Fireworks Python client library
- The web UI
- LangChain
- Directly invoking the REST API using your favorite tools or language
- The OpenAI Python client
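For example, here is a minimal sketch using the Fireworks Python client; the model id is an example, so substitute any serverless model from the model library:

```python
import os
from fireworks.client import Fireworks

# The Fireworks Python client follows the familiar chat-completions shape.
client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

response = client.chat.completions.create(
    # Example serverless model id; replace with any model from the library.
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

The OpenAI Python client works the same way if you point its `base_url` at `https://api.fireworks.ai/inference/v1` and pass your Fireworks API key.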
To see a list of available models, check out our model library.
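The list can also be fetched programmatically. This sketch assumes an OpenAI-style `/v1/models` route on the OpenAI-compatible endpoint; the model library page remains the authoritative catalog:

```python
import os
from openai import OpenAI

# Point the OpenAI client at Fireworks' OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

# Assumes an OpenAI-style /v1/models endpoint is available.
for model in client.models.list():
    print(model.id)
```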
Models on Fireworks can be hosted (“deployed”) via serverless or dedicated deployments. A serverless deployment is a shared public deployment of a model that is priced per-token and subject to rate limits. Not all models are available serverlessly. To confirm whether a model is available serverlessly, find it in the model library and look for the Serverless tag.
For workloads that exceed serverless rate limits or require models that are not available serverlessly, you can spin up dedicated model deployments. On-demand deployments are billed by the GPU-second and are subject to capacity constraints. Enterprise accounts can purchase reserved capacity deployments for guaranteed access to compute with a fixed commitment.
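Querying a dedicated deployment looks the same as querying a serverless model. The sketch below assumes the deployment is addressed by appending its id to the base model name with `#`; every identifier shown is a placeholder, so use the exact identifier reported for your own deployment:

```python
import os
from fireworks.client import Fireworks

client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])

response = client.chat.completions.create(
    # Placeholder ids: "base model" + "#<deployment>" addressing is
    # assumed here; substitute your account, model, and deployment.
    model="accounts/fireworks/models/llama-v3p1-8b-instruct"
          "#accounts/my-account/deployments/my-deployment",
    messages=[{"role": "user", "content": "Ping from a dedicated deployment."}],
)
print(response.choices[0].message.content)
```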
Test models using the model playground
All serverless models can be tested in the model playground, where you can evaluate how a model's behavior changes with different parameters.
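The same kind of parameter exploration can be scripted against the API. A minimal sketch (example model id, hypothetical prompt) that runs one prompt at a low and a high temperature:

```python
import os
from fireworks.client import Fireworks

client = Fireworks(api_key=os.environ["FIREWORKS_API_KEY"])
prompt = [{"role": "user", "content": "Name three uses for a paperclip."}]

# Run the same prompt at two temperatures to see how sampling
# parameters change the output, mirroring playground experiments.
for temperature in (0.1, 1.0):
    response = client.chat.completions.create(
        model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # example id
        messages=prompt,
        temperature=temperature,
        max_tokens=128,
    )
    print(f"temperature={temperature}:\n{response.choices[0].message.content}\n")
```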