Yi 34B API

What is Yi 34B and who developed it?

Yi 34B is a 34.4-billion-parameter base language model developed by 01.AI. It is part of the Yi series, trained from scratch to support both English and Chinese. As of late 2023, it ranked first among open-source models (including Falcon-180B and Llama 2-70B) on benchmarks such as the Hugging Face Open LLM Leaderboard and C-Eval.

What applications and use cases does Yi 34B excel at?

The model is suitable for:

•Conversational AI
•Code assistance
•Agentic systems
•Enterprise RAG
•Search and multimedia reasoning

What is the maximum context length for Yi 34B?

Yi 34B supports a context length of 4,096 tokens.

What is the usable context window for Yi 34B?

The full 4,096-token window is available when running the model on Fireworks' on-demand infrastructure.

What is the maximum output length Fireworks allows for Yi 34B?

Outputs are constrained by the 4,096-token context length, which covers the prompt and completion combined.
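
Because the prompt and completion share one window, the room left for output shrinks as the prompt grows. The sketch below illustrates that arithmetic. The function name `completion_budget` and its parameters are hypothetical helpers, not part of any Fireworks SDK, and the prompt token count is assumed to come from whatever tokenizer you use.

```python
# Context limit taken from the FAQ above: prompt + completion combined.
CONTEXT_LIMIT = 4096


def completion_budget(prompt_tokens: int, requested: int,
                      context_limit: int = CONTEXT_LIMIT) -> int:
    """Return the largest completion length that fits next to the prompt.

    Hypothetical helper for illustration only.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = context_limit - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt alone fills or exceeds the context window")
    # Never ask for more tokens than the window has left.
    return min(requested, remaining)


if __name__ == "__main__":
    # A 3,000-token prompt leaves 1,096 tokens for the completion.
    print(completion_budget(prompt_tokens=3000, requested=2000))
```

A value computed this way could be passed as the maximum output length of a request, so that a long prompt does not cause the request to overrun the window.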

What are known failure modes of Yi 34B?

•No function calling, image input, or embeddings support
•Not safety-aligned: may generate unsafe or unmoderated outputs
•No RAG-specific tuning or tool use capabilities

How many parameters does Yi 34B have?

Yi 34B has 34.4 billion parameters.

Is fine-tuning supported for Yi 34B?

Standard fine-tuning is not supported, but LoRA (parameter-efficient fine-tuning) is supported via Fireworks' Serverless LoRA framework.

How are tokens counted (prompt vs completion)?

Token billing is based on total input + output token usage.
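
As a rough illustration of that rule, the sketch below adds prompt and completion tokens and applies a per-million-token rate. The rate is a made-up placeholder, since this page does not state a price, and both function names are hypothetical.

```python
def billed_tokens(prompt_tokens: int, completion_tokens: int) -> int:
    """Total billable tokens: input plus output, as described above."""
    return prompt_tokens + completion_tokens


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  price_per_million_usd: float) -> float:
    """Estimate cost in USD for one request.

    price_per_million_usd is an assumed, caller-supplied rate;
    no actual price is given in this document.
    """
    total = billed_tokens(prompt_tokens, completion_tokens)
    return total * price_per_million_usd / 1_000_000


if __name__ == "__main__":
    # Placeholder rate of 1.00 USD per million tokens, for illustration only.
    print(billed_tokens(1200, 300))
    print(estimate_cost(1200, 300, price_per_million_usd=1.0))
```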

What rate limits apply on the shared endpoint?

•Serverless: Not supported
•On-demand: Available with no rate limits on dedicated infrastructure
