Training API · Beta

The lower-level training stack. With the safety off.

For teams who want full control — raw primitives, the Python SDK, and forty production recipes you can fork. Compose your own training loops on the same FireAttention runtime that powers Managed Training.

Recipes · 40+ on day one

Start from a recipe. Or write your own loop.

Every recipe is plain Python with a small surface area. Fork it, edit it, ship it — without rewriting the parts that work.

Llama-70B GRPO with tool rewards

Distributed GRPO on a 70B base, 256 rollout workers, model-graded reward. Used by Genspark.
★ 312 · forks 47
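
For readers new to GRPO, the core idea is that each rollout is scored relative to the other rollouts sampled for the same prompt. The sketch below shows only that normalization step in plain Python. It is an illustration under stated assumptions, not the recipe's code: the function name, the `eps` constant, and the toy reward values are all hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds the scores of all rollouts sampled for ONE prompt.
    `eps` (an assumed constant) avoids division by zero when every
    rollout in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group: four rollouts for one prompt, scored 1.0 when the
# (hypothetical) tool call succeeded and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Rollouts that beat their group's average get a positive advantage and the rest get a negative one, so no separate value model is needed for the baseline.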

SFT for code-edit completion

Two-epoch SFT on edit pairs with prefix masking. The recipe Cursor open-sourced.
★ 1,840 · forks 261
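
Prefix masking means the loss is computed only on the tokens the model should produce (the edit), not on the context it was given. A minimal sketch of one common way to do this follows. The function name and token IDs are made up for illustration, and this is not necessarily how the recipe implements it. The one fixed fact is that `-100` is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # PyTorch's cross-entropy loss skips this label by default


def mask_prefix(prefix_ids, target_ids):
    """Build (input_ids, labels) so that only target tokens contribute to loss.

    The model still sees the prefix as input; its label positions are
    set to IGNORE_INDEX so they are excluded from the loss.
    """
    input_ids = list(prefix_ids) + list(target_ids)
    labels = [IGNORE_INDEX] * len(prefix_ids) + list(target_ids)
    return input_ids, labels


# Hypothetical token IDs: three tokens of original code, two tokens of edit.
input_ids, labels = mask_prefix([11, 12, 13], [21, 22])
```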

DPO with model-graded preferences

Sample N completions per prompt, judge with a stronger model, train DPO end-to-end.
★ 720 · forks 94
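
The sample-then-judge step can be pictured as turning N scored completions into one (chosen, rejected) pair per prompt. The sketch below assumes a `judge(prompt, completion)` callable that returns a numeric score. That callable, the dictionary keys, and the toy judge are hypothetical stand-ins, not the recipe's interface.

```python
def build_preference_pair(prompt, completions, judge):
    """Score every completion once, then keep the best and worst as a pair.

    Returns None when the judge cannot separate the completions, since a
    tied pair carries no preference signal for DPO.
    """
    scored = [(judge(prompt, c), c) for c in completions]
    best = max(scored, key=lambda item: item[0])
    worst = min(scored, key=lambda item: item[0])
    if best[0] == worst[0]:
        return None
    return {"prompt": prompt, "chosen": best[1], "rejected": worst[1]}


def toy_judge(prompt, completion):
    # Stand-in for a stronger grading model: longer answers score higher.
    return len(completion)


pair = build_preference_pair("Explain DPO.", ["a", "abc", "ab"], toy_judge)
```

In practice the judge is the expensive call, which is why the sketch scores each completion exactly once.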

Continual pre-training for domain shift

★ 488 · forks 71

PPO with custom env via gRPC

★ 612 · forks 88

KTO on production traces

★ 224 · forks 28

PRIMITIVES

The six objects you'll use.

No frameworks. Six composable objects with strict types. Use them in our recipes, in your training loop, or in a Jupyter notebook.

Training API - Cluster

Request GPUs by shape, not by name.

Declare what you need; we schedule it. No quota tickets, no zone juggling.

Training API - Dataset

JSONL, Parquet, Hub, or streaming.

Type-checked schemas. Lazy. No 90-line data-loaders.
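
To make "type-checked" and "lazy" concrete, here is a small standard-library sketch of the idea: a generator that validates each JSONL record against a schema only when that record is requested. The schema format, field names, and function name are assumptions for illustration and do not describe the SDK's Dataset object.

```python
import io
import json

# Hypothetical schema: field name -> required Python type.
SCHEMA = {"prompt": str, "completion": str}


def iter_jsonl(lines, schema):
    """Lazily yield records from an iterable of JSONL lines.

    Each record is parsed and type-checked only when the consumer asks
    for it, so a large file is never loaded into memory at once.
    """
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for field, expected in schema.items():
            if not isinstance(record.get(field), expected):
                raise TypeError(
                    f"line {lineno}: field {field!r} must be {expected.__name__}"
                )
        yield record


source = io.StringIO(
    '{"prompt": "2+2?", "completion": "4"}\n'
    '{"prompt": "Capital of France?", "completion": "Paris"}\n'
)
records = list(iter_jsonl(source, SCHEMA))
```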

Training API - Trainer

SFT, DPO, GRPO, or your own.

One base class. Override step() for novel recipes.
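
The "override step()" pattern looks roughly like the sketch below. Every name in it is hypothetical: the `Trainer` class is a local stand-in written for this example, not the SDK's base class, and the one-parameter regression exists only to keep the example runnable without dependencies.

```python
class Trainer:
    """Stand-in base class (an assumption, not the SDK's real interface)."""

    def step(self, batch):
        raise NotImplementedError("subclasses define one optimization step")

    def fit(self, batches):
        # The base class owns the loop; subclasses only define step().
        return [self.step(batch) for batch in batches]


class ScalarRegressionTrainer(Trainer):
    """Fits y = weight * x with plain gradient descent on squared error."""

    def __init__(self, weight=0.0, lr=0.1):
        self.weight = weight
        self.lr = lr

    def step(self, batch):
        n = len(batch)
        # Loss is measured before the update so the caller can track progress.
        loss = sum((self.weight * x - y) ** 2 for x, y in batch) / n
        grad = sum(2 * (self.weight * x - y) * x for x, y in batch) / n
        self.weight -= self.lr * grad
        return loss


trainer = ScalarRegressionTrainer()
losses = trainer.fit([[(1.0, 2.0), (2.0, 4.0)]] * 20)
```

The design point is that the loop, logging, and scheduling live in one place, while a novel objective only needs a new `step()`.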

Training API - Rollout

Distributed sampling, batched.

Inference on the same kernel that serves prod. Scales horizontally.

Training API - Eval

Hook a metric. Block bad ckpts.

Model-graded, rubric, or your function. Gates promotion.
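
A promotion gate can be as simple as a metric function plus a threshold. The sketch below uses exact match as the metric. The function names and the 0.8 threshold are assumptions chosen for illustration, not values or interfaces from the SDK.

```python
def exact_match(predictions, references):
    """Fraction of predictions that equal their reference after trimming."""
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


def passes_gate(score, threshold):
    """A checkpoint is promoted only if its score meets the threshold."""
    return score >= threshold


score = exact_match(["4", "Paris ", "blue"], ["4", "Paris", "red"])
promote = passes_gate(score, threshold=0.8)  # 0.8 is an assumed bar
```

Here two of three predictions match, so the checkpoint falls short of the assumed bar and would be held back.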

Training API - Checkpoint

Save, resume, deploy — same object.

Versioned, content-addressed, one-call deploy to a live endpoint.
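
"Content-addressed" means a checkpoint's identifier is derived from its contents, so identical state always maps to the same ID. A minimal standard-library sketch of that idea follows. The function name and the JSON-serializable state are assumptions for illustration: real checkpoints hold tensors, and nothing here reflects how the platform actually hashes them.

```python
import hashlib
import json


def checkpoint_address(state):
    """Derive a stable ID from checkpoint contents.

    Keys are sorted before hashing so that two dictionaries with the same
    contents produce the same address regardless of insertion order.
    """
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


addr_a = checkpoint_address({"step": 100, "weights": [0.1, 0.2]})
addr_b = checkpoint_address({"weights": [0.1, 0.2], "step": 100})
addr_c = checkpoint_address({"step": 200, "weights": [0.1, 0.2]})
```

The first two addresses are equal because only key order differs. The third differs because the contents changed, which is what makes deduplication and exact rollbacks straightforward.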

HOW IT FITS TOGETHER

One loop, six primitives.

Your training script imports six objects. Fireworks fans out the rollouts, syncs gradients for the trainer, persists checkpoints, and ships the winning policy. You stay in Python.

Training API · Beta

Bring your own loop. Keep your sanity.

The Training API is in private beta. Request access and we'll get you a workspace, a $500 credit, and the engineer who wrote the SDK.