Direct routing enables enterprise users to reduce latency to their deployments.
The --region parameter is required to specify the deployment region. To allow more than one API key, repeat the flag:
--direct-route-api-keys=<API_KEY_1> --direct-route-api-keys=<API_KEY_2>. These keys can be any alphanumeric string and
are a distinct concept from the API keys provisioned via the Fireworks console. A key provisioned in the console but
not specified in the list here will not be allowed when querying the model via direct routing.
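
As an illustration of the repeated-flag form, here is a minimal Python sketch that assembles the --region and --direct-route-api-keys flags and checks them before they are handed to the CLI. The helper name direct_route_flags is hypothetical, the region set is copied from the region identifiers listed in this document, and the rest of the deployment command (model name and any other required flags) is deliberately left to the caller.

```python
import re

# Region identifiers copied from the list in this document.
REGIONS = {
    "US_IOWA_1", "US_VIRGINIA_1", "US_ARIZONA_1", "US_ILLINOIS_1", "US_TEXAS_1",
    "US_ILLINOIS_2", "EU_FRANKFURT_1", "US_WASHINGTON_3", "US_WASHINGTON_1", "AP_TOKYO_1",
}


def direct_route_flags(region, api_keys):
    """Return the --region flag plus one --direct-route-api-keys flag per key.

    Hypothetical helper: it only formats and validates flags, it does not
    run any command.
    """
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    for key in api_keys:
        # Direct-route keys are described as alphanumeric strings.
        if not re.fullmatch(r"[A-Za-z0-9]+", key):
            raise ValueError("direct-route API keys must be alphanumeric")
    flags = [f"--region={region}"]
    flags += [f"--direct-route-api-keys={key}" for key in api_keys]
    return flags


print(" ".join(direct_route_flags("US_IOWA_1", ["key1abc", "key2def"])))
# prints: --region=US_IOWA_1 --direct-route-api-keys=key1abc --direct-route-api-keys=key2def
```

The returned list can be appended to the argument list of the deployment command, which keeps each key as its own flag instead of joining keys into a single value.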
Take note of the Direct Route Handle to get the inference endpoint. This is what you will use to access the deployment
instead of the global https://api.fireworks.ai/inference/ endpoint. For example:
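
A request against a direct-route endpoint could be sketched in Python as follows. Everything specific here is an assumption for illustration: the host name is a placeholder for the endpoint obtained from the Direct Route Handle, the /v1/completions path and the payload fields are hypothetical, and the Bearer authorization scheme is assumed. The key must be one of those passed via --direct-route-api-keys, not a key provisioned in the console. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_direct_route_request(endpoint_url, direct_route_key, payload):
    """Build (but do not send) a JSON POST request to a direct-route endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            # Assumed auth scheme; the key is one of the --direct-route-api-keys values.
            "Authorization": f"Bearer {direct_route_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host, path, and payload: substitute the endpoint from the
# Direct Route Handle and the fields your deployment expects.
request = build_direct_route_request(
    "https://direct-route.example.invalid/v1/completions",
    "key1abc",
    {"model": "<DEPLOYMENT_NAME>", "prompt": "The sky is"},
)
print(request.get_method(), request.full_url)
```

To send it, pass the request to urllib.request.urlopen(request) once the placeholders are replaced with real values.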
Regions:
US_IOWA_1
US_VIRGINIA_1
US_ARIZONA_1
US_ILLINOIS_1
US_TEXAS_1
US_ILLINOIS_2
EU_FRANKFURT_1
US_WASHINGTON_3
US_WASHINGTON_1
AP_TOKYO_1