Upload your own models from Hugging Face or elsewhere to deploy fine-tuned or custom-trained models optimized for your use case.
  • Multiple upload options – Upload from local files or directly from S3 buckets
  • Secure uploads – All uploads are encrypted and models remain private to your account by default

Requirements

Supported architectures

Fireworks supports most popular model architectures.

Required files

The files you need depend on the model architecture. In general, you will need the standard Hugging Face model files:
  • Model configuration: config.json
    Fireworks does not support the quantization_config option in config.json.
  • Model weights in one of the following formats:
    • *.safetensors
    • *.bin
  • Weights index: *.index.json
  • Tokenizer file(s), e.g.:
    • tokenizer.model
    • tokenizer.json
    • tokenizer_config.json
If the requisite files are not present, model deployment may fail.

Enabling chat completions: To enable the chat completions API for your custom base model, ensure your tokenizer_config.json contains a chat_template field. See the Hugging Face guide on Templates for Chat Models for details.
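As a quick sanity check before uploading, a short script can confirm the files listed above are present. This is a minimal sketch (the function name and error messages are illustrative, not part of any Fireworks tooling):

```python
import json
from pathlib import Path

def check_model_dir(model_dir: str) -> list[str]:
    """Return a list of problems found in a Hugging Face model directory."""
    d = Path(model_dir)
    problems = []

    config = d / "config.json"
    if not config.is_file():
        problems.append("missing config.json")
    else:
        cfg = json.loads(config.read_text())
        # Fireworks does not support the quantization_config option.
        if "quantization_config" in cfg:
            problems.append("config.json contains unsupported quantization_config")

    # Weights must be provided as *.safetensors or *.bin files.
    if not (list(d.glob("*.safetensors")) or list(d.glob("*.bin"))):
        problems.append("no *.safetensors or *.bin weight files")

    # Tokenizer file(s), e.g. tokenizer.model, tokenizer.json, tokenizer_config.json.
    if not any((d / name).is_file() for name in
               ("tokenizer.json", "tokenizer.model", "tokenizer_config.json")):
        problems.append("no tokenizer files found")

    return problems
```

Running this against your model directory before `firectl create model` can save a failed deployment round-trip.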

Uploading your model

You can upload from local files (CLI), from an S3 bucket (CLI), or via the REST API. To upload from your local machine:
firectl create model <MODEL_ID> /path/to/files/

Verifying your upload

After uploading, verify your model is ready to deploy:
firectl get model accounts/<ACCOUNT_ID>/models/<MODEL_NAME>
Look for State: READY in the output. Once ready, you can create a deployment.

Deploying your model

Once your model shows State: READY, create a deployment:
firectl create deployment accounts/<ACCOUNT_ID>/models/<MODEL_NAME> --wait
See the On-demand deployments guide for configuration options like GPU types, autoscaling, and quantization.

Publishing your model

By default, models are private to your account. Publish a model to make it available to other Fireworks users. When published:
  • Listed in the public model catalog
  • Deployable by anyone with a Fireworks account
  • Still hosted and controlled by your account
Publish a model:
firectl update model <MODEL_ID> --public
Unpublish a model:
firectl update model <MODEL_ID> --public=false

Importing fine-tuned models

In addition to models fine-tuned on the Fireworks platform, you can upload your own custom fine-tuned models as LoRA adapters.

Requirements

Your custom LoRA addon must contain the following files:
  • adapter_config.json - The Hugging Face adapter configuration file
  • adapter_model.bin or adapter_model.safetensors - The saved addon file
The adapter_config.json must contain the following fields:
r - The LoRA rank. Must be an integer between 4 and 64, inclusive
  • target_modules - A list of target modules. Currently the following target modules are supported:
    • q_proj
    • k_proj
    • v_proj
    • o_proj
    • up_proj or w1
    • down_proj or w2
    • gate_proj or w3
    • block_sparse_moe.gate
Additional fields may be specified but are ignored.
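The constraints above can be checked programmatically before upload. Here is a minimal sketch, assuming only the rules stated in this section (the helper name and messages are my own):

```python
import json
from pathlib import Path

# Supported target modules, per the list above.
SUPPORTED_TARGET_MODULES = {
    "q_proj", "k_proj", "v_proj", "o_proj",
    "up_proj", "w1", "down_proj", "w2",
    "gate_proj", "w3", "block_sparse_moe.gate",
}

def validate_adapter_config(path: str) -> list[str]:
    """Return problems found in an adapter_config.json for a LoRA addon."""
    cfg = json.loads(Path(path).read_text())
    problems = []

    # r must be an integer between 4 and 64, inclusive.
    r = cfg.get("r")
    if not isinstance(r, int) or not 4 <= r <= 64:
        problems.append(f"invalid LoRA rank r={r!r} (must be an integer in [4, 64])")

    # Every target module must be one of the supported names.
    unsupported = [m for m in cfg.get("target_modules", [])
                   if m not in SUPPORTED_TARGET_MODULES]
    if unsupported:
        problems.append(f"unsupported target_modules: {unsupported}")

    # Additional fields may be present; they are ignored by Fireworks.
    return problems
```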

Enabling chat completions

To enable the chat completions API for your LoRA addon, add a fireworks.json file to the directory containing:
{
  "conversation_config": {
    "style": "jinja",
    "args": {
      "template": "<YOUR_JINJA_TEMPLATE>"
    }
  }
}
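A small script can generate this file for you. The Jinja template below is purely hypothetical, for illustration; substitute the conversation format your model was actually trained on:

```python
import json
from pathlib import Path

# Hypothetical chat template for illustration only; replace with your
# model's real conversation format.
TEMPLATE = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
)

def write_fireworks_json(adapter_dir: str) -> None:
    """Write fireworks.json with the conversation_config structure shown above."""
    config = {
        "conversation_config": {
            "style": "jinja",
            "args": {"template": TEMPLATE},
        }
    }
    Path(adapter_dir, "fireworks.json").write_text(json.dumps(config, indent=2))
```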

Uploading the LoRA adapter

To upload a LoRA addon, run the following command. The MODEL_ID is an arbitrary resource ID to refer to the model within Fireworks.
Only some base models support LoRA addons.
firectl create model <MODEL_ID> /path/to/files/ --base-model "accounts/fireworks/models/<BASE_MODEL_ID>"

Next steps