Single-LoRA deployment with live merge

Live merge deploys a LoRA fine-tuned model with a single command and delivers performance that matches the base model. It replaces the earlier two-step process and performs better than a multi-LoRA deployment.

Quick deployment

Deploy your LoRA fine-tuned model with one simple command:

firectl create deployment "accounts/fireworks/models/<MODEL_ID of lora model>"

The deployment is ready to use once creation completes, and its performance matches the base model.
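
If you script deployments, the command above can be assembled programmatically. The following is a minimal sketch, not an official client: the helper names `live_merge_command` and `run` and the `account` parameter are illustrative assumptions, and the only firectl syntax used is the `create deployment` invocation shown above. It defaults to a dry run so nothing is created by accident.

```python
import shlex
import subprocess


def live_merge_command(lora_model_id: str, account: str = "fireworks") -> list[str]:
    """Build the argv for the single-command live merge deployment.

    Mirrors the firectl invocation shown above. The helper name and the
    `account` parameter are illustrative and not part of firectl.
    """
    if not lora_model_id or "/" in lora_model_id:
        raise ValueError("expected a bare model ID, not a full resource name")
    resource = f"accounts/{account}/models/{lora_model_id}"
    return ["firectl", "create", "deployment", resource]


def run(argv: list[str], dry_run: bool = True) -> str:
    """Print the command and execute it only when dry_run is False."""
    rendered = shlex.join(argv)
    print(rendered)
    if not dry_run:
        # Assumes firectl is on PATH and already authenticated.
        subprocess.run(argv, check=True)
    return rendered


if __name__ == "__main__":
    # "my-lora-model" is a placeholder model ID.
    run(live_merge_command("my-lora-model"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems if a model ID ever contains unusual characters.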

Alternative deployment method

You can also deploy a single LoRA model with the two-step process below.

This two-step method is the standard approach for multi-LoRA deployments, where multiple LoRA models share the same base model. It also works for a single LoRA model, but it is slower than live merge and is not recommended for that case.

Create base model deployment

Deploy the base model with addons enabled:

firectl create deployment "accounts/fireworks/models/<MODEL_ID of base model>" --enable-addons

Load LoRA addon

Once the deployment is ready, load the LoRA model onto the deployment:

firectl load-lora <MODEL_ID> --deployment <DEPLOYMENT_ID>
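
The ordering of the two steps can be sketched in Python as follows. This is an illustration under stated assumptions: `two_step_commands` is a hypothetical helper, the deployment ID is assumed to be known to the caller (in practice you obtain it after step 1), and issuing one `load-lora` call per LoRA model is an assumption about the multi-LoRA case. The sketch only builds the commands. It does not wait for the deployment to become ready, which you must do before running step 2.

```python
import shlex


def two_step_commands(
    base_model_id: str,
    lora_model_ids: list[str],
    deployment_id: str,
) -> list[list[str]]:
    """Return the firectl invocations for the two-step method, in order."""
    if not lora_model_ids:
        raise ValueError("at least one LoRA model ID is required")

    # Step 1: deploy the base model with addons enabled.
    commands = [[
        "firectl", "create", "deployment",
        f"accounts/fireworks/models/{base_model_id}",
        "--enable-addons",
    ]]

    # Step 2: load each LoRA onto the deployment
    # (one call per model is an assumption for the multi-LoRA case).
    for lora_id in lora_model_ids:
        commands.append(
            ["firectl", "load-lora", lora_id, "--deployment", deployment_id]
        )
    return commands


if __name__ == "__main__":
    # All IDs below are placeholders.
    for argv in two_step_commands(
        "my-base-model", ["my-lora-a", "my-lora-b"], "my-deployment-id"
    ):
        print(shlex.join(argv))
```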

Deployment with the Build SDK

You can also deploy your LoRA fine-tuned model using the Build SDK:

from fireworks import LLM

# Deploy a fine-tuned model with on-demand deployment (live merge)
fine_tuned_llm = LLM(
    model="accounts/your-account/models/your-fine-tuned-model-id",
    deployment_type="on-demand",
    id="my-fine-tuned-deployment"  # Simple string identifier
)

# Apply the deployment to ensure it's ready
fine_tuned_llm.apply()

# Use the deployed model
response = fine_tuned_llm.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

# Track deployment in web dashboard
print(f"Track at: {fine_tuned_llm.deployment_url}")

The id parameter can be any simple string; it does not need to follow the format "accounts/account_id/deployments/model_id".

When to use live merge

Use live merge deployment when you:

Have a single fine-tuned model to serve
Need optimal performance that matches the base model
Want the simplest deployment process
Don’t require sharing a base model across multiple LoRA models

The live merge deployment method is designed for dedicated deployments with a single LoRA model. For multiple LoRA models sharing the same base model, consider using multi-LoRA deployment.
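
The guidance above can be expressed as a small planning helper. This is only a sketch of the decision rule as stated on this page: `plan_deployments` and its input shape (a mapping from LoRA model ID to base model ID) are hypothetical, and the output is a suggestion rather than anything the platform enforces.

```python
from collections import defaultdict


def plan_deployments(loras: dict[str, str]) -> dict[str, str]:
    """Suggest a deployment method for each LoRA model.

    `loras` maps a LoRA model ID to the ID of its base model.
    A LoRA that is alone on its base model gets "live merge";
    LoRAs that share a base model get "multi-LoRA".
    """
    by_base: dict[str, list[str]] = defaultdict(list)
    for lora_id, base_id in loras.items():
        by_base[base_id].append(lora_id)

    plan: dict[str, str] = {}
    for lora_ids in by_base.values():
        method = "live merge" if len(lora_ids) == 1 else "multi-LoRA"
        for lora_id in lora_ids:
            plan[lora_id] = method
    return plan


if __name__ == "__main__":
    # Placeholder IDs: two LoRAs share "base-y", one is alone on "base-x".
    print(plan_deployments({"a": "base-x", "b": "base-y", "c": "base-y"}))
```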
