- Multiple upload options – Upload from local files or directly from S3 buckets
- Secure uploads – All uploads are encrypted and models remain private to your account by default
For LoRA adapters, see importing fine-tuned models.
## Requirements

### Supported architectures
Fireworks supports most popular model architectures, including:

- DeepSeek V1, V2 & V3
- Qwen, Qwen2, Qwen2.5, Qwen2.5-VL, Qwen3
- Kimi K2 family
- GLM 4.X family
- Llama 1, 2, 3, 3.1, 4
- Mistral & Mixtral
- Gemma
View all supported architectures
### Required files
You’ll need standard Hugging Face model files: `config.json`, model weights (`.safetensors` or `.bin`), and tokenizer files.
View detailed file requirements
The model files you will need to provide depend on the model architecture. In general, you will need:

- Model configuration: `config.json`. Fireworks does not support the `quantization_config` option in `config.json`.
- Model weights, in one of the following formats: `*.safetensors` or `*.bin`
- Weights index: `*.index.json`
- Tokenizer file(s), e.g. `tokenizer.model`, `tokenizer.json`, and `tokenizer_config.json`

`tokenizer_config.json` contains a `chat_template` field. See the Hugging Face guide on Templates for Chat Models for details.

## Uploading your model
- Local files (CLI)
- S3 bucket (CLI)
- REST API
Upload from your local machine:
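As a sketch, a local upload with the `firectl` CLI looks like the following. The model ID and file path are placeholders, and exact flags may differ by `firectl` version, so check `firectl create model --help` before running:

```shell
# Upload model files from a local directory to your Fireworks account.
# <MODEL_ID> is the name the model will have in your account (placeholder).
firectl create model <MODEL_ID> /path/to/model/files
```

The directory should contain the required files listed above (`config.json`, weights, and tokenizer files).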
## Verifying your upload
After uploading, verify your model is ready to deploy by looking for `State: READY` in the output. Once ready, you can create a deployment.
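A minimal way to check the state with `firectl` (the model ID is a placeholder):

```shell
# Inspect the uploaded model; it is deployable once State shows READY.
firectl get model <MODEL_ID>
```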
## Deploying your model
Once your model shows `State: READY`, create a deployment:
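For example, with `firectl` (model ID is a placeholder; consult `firectl create deployment --help` for hardware and scaling options):

```shell
# Create a dedicated deployment serving the uploaded model.
firectl create deployment <MODEL_ID>
```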
## Publishing your model

By default, models are private to your account. Publish a model to make it available to other Fireworks users. When published, your model is:

- Listed in the public model catalog
- Deployable by anyone with a Fireworks account
- Still hosted and controlled by your account