Llama 4 Scout Instruct (Basic) API & Playground

The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.

Fireworks Features

Fine-tuning Docs	Llama 4 Scout Instruct (Basic) can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model
Serverless Docs	Immediately run model on pre-configured GPUs and pay-per-token
On-demand Deployment Docs	On-demand deployments give you dedicated GPUs for Llama 4 Scout Instruct (Basic) using Fireworks' reliable, high-performance system with no rate limits.

Available Serverless

Run queries immediately, pay only for usage

$0.15 / $0.60

Per 1M Tokens (input/output)

Metadata

State

Ready

Created on

4/5/2025

Kind

Base model

Provider

Specification

Calibrated

Yes

Mixture-of-Experts

Yes

Parameters

108.6B

Supported Functionality

Fine-tuning

Supported

Serverless

Supported

Serverless LoRA

Not supported

Context Length

1048.6k tokens

Function Calling

Supported

Embeddings

Not supported

Rerankers

Not supported

Support image input

Supported

Llama 4 Scout Instruct (Basic)

Fireworks Features

Fine-tuning

Serverless

On-demand Deployment

Available Serverless

Metadata

Specification

Supported Functionality