Dolphin 2.9.2 Qwen2 72B is a fine-tuned version of the Qwen2 72B large language model with a variety of instruction, conversational, and coding skills. It also supports function calling.
| Capability | Details |
| --- | --- |
| Fine-tuning (Docs) | Dolphin 2.9.2 Qwen2 72B can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
| On-demand Deployment (Docs) | On-demand deployments give you dedicated GPUs for Dolphin 2.9.2 Qwen2 72B using Fireworks' reliable, high-performance system with no rate limits. |
Dolphin 2.9.2 Qwen2 72B is a fine-tuned version of Qwen2-72B, developed by Cognitive Computations and hosted on Fireworks AI. The model was curated by Eric Hartford, Lucas Atkins, and Fernando Fernandes. It is designed for instruction-following, conversational tasks, and early agentic behaviors.
The model is optimized for instruction-following, conversational use, and coding, with early agentic behaviors. It supports initial function-calling logic, though tool integration is not built in.
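Since tool integration is not built in, the caller supplies tool definitions with each request. Below is a minimal sketch of an OpenAI-compatible chat payload offering one tool to the model; the model identifier string and the `get_weather` tool are assumptions for illustration, so check the Fireworks docs for the exact model id and tool-calling schema.

```python
import json

# Assumed model id, following Fireworks' usual naming convention.
MODEL_ID = "accounts/fireworks/models/dolphin-2-9-2-qwen2-72b"

def build_function_call_request(user_message: str) -> dict:
    """Build a chat-completions payload that offers one tool to the model."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical example tool
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

payload = build_function_call_request("What's the weather in Lisbon?")
print(json.dumps(payload, indent=2))
```

If the model decides to call the tool, the response will contain a structured tool-call rather than plain text, and your code is responsible for executing it and sending the result back.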
The model supports a context length of 131,072 tokens via RoPE scaling (YaRN), inherited from its Qwen2-72B base.
The full 131K token window is available in Fireworks on-demand deployments.
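When working near the 131K limit, it helps to pre-check prompt size before sending a request. The sketch below uses the common ~4 characters-per-token heuristic; real counts come from the model's tokenizer, so treat this only as a rough pre-flight estimate.

```python
# Context budget for Dolphin 2.9.2 Qwen2 72B on Fireworks.
CONTEXT_WINDOW = 131_072
CHARS_PER_TOKEN = 4  # heuristic ratio, not the actual tokenizer's

def estimated_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(text: str, reserve_for_output: int = 4_096) -> bool:
    """True if the estimated prompt leaves room for the model's reply."""
    return estimated_tokens(text) + reserve_for_output <= CONTEXT_WINDOW

print(fits_in_context("hello " * 1_000))   # small prompt -> True
print(fits_in_context("x" * 800_000))      # ~200K tokens -> False
```

Reserving headroom for the reply matters because input and output tokens share the same window.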
The model has 72.7 billion parameters.
Fireworks supports LoRA-based fine-tuning for this model, available via the platform's RFT offering.
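Fine-tuning data is typically supplied as JSONL, one conversation per line. The sketch below builds one such record in the common chat-completions `messages` format; the exact schema Fireworks expects may differ, so this is an assumption to verify against the fine-tuning docs.

```python
import json

def training_record(question: str, answer: str) -> str:
    """Serialize one chat-format training example as a JSONL line."""
    record = {
        "messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }
    return json.dumps(record)

# One line of the resulting JSONL training file.
line = training_record(
    "What is LoRA?",
    "A parameter-efficient fine-tuning method using low-rank adapters.",
)
print(line)
```

Writing one record per line keeps the file streamable, which is why JSONL is the standard upload format for fine-tuning datasets.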
The model is licensed under the Tongyi Qianwen license, inherited from its base (Qwen2-72B).