- Multiple upload options – Upload from local files or directly from S3 buckets
- Secure uploads – All uploads are encrypted and models remain private to your account by default
Requirements
Supported architectures
Fireworks supports most popular model architectures, including:

- DeepSeek V1, V2 & V3
- Qwen, Qwen2, Qwen2.5, Qwen2.5-VL, Qwen3
- Kimi K2 family
- GLM 4.X family
- Llama 1, 2, 3, 3.1, 4
- Mistral & Mixtral
- Gemma
View all supported architectures
Required files
You’ll need standard Hugging Face model files: `config.json`, model weights (`.safetensors` or `.bin`), and tokenizer files.
View detailed file requirements
The model files you will need to provide depend on the model architecture. In general, you will need:

- Model configuration: `config.json`. Fireworks does not support the `quantization_config` option in `config.json`.
- Model weights, in one of the following formats:
  - `*.safetensors`
  - `*.bin`
- Weights index: `*.index.json`
- Tokenizer file(s), e.g.:
  - `tokenizer.model`
  - `tokenizer.json`
  - `tokenizer_config.json`

`tokenizer_config.json` contains a `chat_template` field. See the Hugging Face guide on Templates for Chat Models for details.

Uploading your model
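Before uploading, it can help to sanity-check the checkpoint directory against the file requirements above. A minimal sketch (the helper is illustrative and not part of any Fireworks tooling):

```python
import json
from pathlib import Path

TOKENIZER_FILES = {"tokenizer.model", "tokenizer.json", "tokenizer_config.json"}

def check_checkpoint_dir(path: str) -> list[str]:
    """Return a list of problems found in a Hugging Face checkpoint directory."""
    d = Path(path)
    problems = []
    config_path = d / "config.json"
    if not config_path.is_file():
        problems.append("missing config.json")
    elif "quantization_config" in json.loads(config_path.read_text()):
        # Fireworks does not support quantization_config in config.json.
        problems.append("config.json contains unsupported quantization_config")
    # At least one weight file in a supported format must be present.
    if not list(d.glob("*.safetensors")) and not list(d.glob("*.bin")):
        problems.append("no *.safetensors or *.bin weight files")
    # At least one tokenizer file must be present.
    if not any((d / name).is_file() for name in TOKENIZER_FILES):
        problems.append("no tokenizer files found")
    return problems
```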
- Local files (CLI)
- S3 bucket (CLI)
- REST API
Upload from your local machine:
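As a sketch, a local upload looks like the following (this assumes the `firectl` CLI is installed and authenticated; `my-custom-model` and the path are placeholders, and flags may vary by version — check `firectl create model --help`):

```shell
firectl create model my-custom-model /path/to/checkpoint/
```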
Verifying your upload
After uploading, verify that your model is ready to deploy by checking for `State: READY` in the output. Once ready, you can create a deployment.
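For example, assuming the `firectl` CLI and a placeholder model ID:

```shell
firectl get model my-custom-model
```

Look for `State: READY` in the command output before proceeding.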
Deploying your model
Once your model shows `State: READY`, create a deployment:
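A sketch with the `firectl` CLI (the model ID is a placeholder; depending on your version you may need the fully qualified name `accounts/<ACCOUNT_ID>/models/my-custom-model` — check `firectl create deployment --help`):

```shell
firectl create deployment my-custom-model
```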
Publishing your model
By default, models are private to your account. Publish a model to make it available to other Fireworks users. When published, your model is:

- Listed in the public model catalog
- Deployable by anyone with a Fireworks account
- Still hosted and controlled by your account
Importing fine-tuned models
In addition to models you fine-tune on the Fireworks platform, you can also upload your own custom fine-tuned models as LoRA adapters.

Requirements
Your custom LoRA addon must contain the following files:

- `adapter_config.json` - The Hugging Face adapter configuration file
- `adapter_model.bin` or `adapter_model.safetensors` - The saved addon file
`adapter_config.json` must contain the following fields:

- `r` - The number of LoRA ranks. Must be an integer between 4 and 64, inclusive.
- `target_modules` - A list of target modules. Currently the following target modules are supported:
  - `q_proj`
  - `k_proj`
  - `v_proj`
  - `o_proj`
  - `up_proj` or `w1`
  - `down_proj` or `w2`
  - `gate_proj` or `w3`
  - `block_sparse_moe.gate`
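These constraints can be expressed as a small validation sketch (illustrative only, not part of any Fireworks tooling):

```python
ALLOWED_TARGET_MODULES = {
    "q_proj", "k_proj", "v_proj", "o_proj",
    "up_proj", "w1", "down_proj", "w2",
    "gate_proj", "w3", "block_sparse_moe.gate",
}

def validate_adapter_config(config: dict) -> list[str]:
    """Check the r and target_modules fields of a parsed adapter_config.json."""
    errors = []
    r = config.get("r")
    if not isinstance(r, int) or not 4 <= r <= 64:
        errors.append("r must be an integer between 4 and 64, inclusive")
    modules = config.get("target_modules")
    if not isinstance(modules, list) or not modules:
        errors.append("target_modules must be a non-empty list")
    else:
        for m in modules:
            if m not in ALLOWED_TARGET_MODULES:
                errors.append(f"unsupported target module: {m}")
    return errors
```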
Enabling chat completions
To enable the chat completions API for your LoRA addon, add a `fireworks.json` file to the directory containing:
Uploading the LoRA adapter
To upload a LoRA addon, run the following command. The `MODEL_ID` is an arbitrary resource ID to refer to the model within Fireworks.

Only some base models support LoRA addons.
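As a sketch, the upload uses the same command shape as a base-model upload (assumes the `firectl` CLI is installed and authenticated; `my-lora-addon` and the path are placeholders):

```shell
firectl create model my-lora-addon /path/to/adapter/
```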