GLM 5.2 is live! Opus-level intelligence at open-source rates. Pay per token on serverless. Try it today.

Model Library
/01.Ai/Yi-Large

Yi-Large is among the top LLMs, with performance on the LMSYS benchmark leaderboard closely trailing GPT-4, Gemini 1.5 Pro, and Claude 3 Opus. It excels in multilingual capabilities, especially in Spanish, Chinese, Japanese, German, and French. Yi-Large is user-friendly, sharing the same API definition as OpenAI for easy integration.

Yi-Large API Features

Fine-tuning

Docs

Yi-Large can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model

On-demand Deployment

Docs

On-demand deployments give you dedicated GPUs for Yi-Large using Fireworks' reliable, high-performance system with no rate limits.

Yi-Large FAQs

What is Yi-Large and who developed it?

Yi-Large is a 70B parameter dense language model developed by 01.AI. Yi-Large ranks among the top-performing open models on the LMSYS leaderboard, closely trailing GPT-4, Claude 3 Opus, and Gemini 1.5 Pro.

What applications and use cases does Yi-Large excel at?

Yi-Large is well-suited for:

  • Conversational AI
  • Code assistance
  • Agentic systems
  • Search
  • Enterprise RAG
  • Multilingual tasks (especially Spanish, Chinese, Japanese, German, and French)
What is the maximum context length for Yi-Large?

Yi-Large supports a context length of 32,800 tokens on Fireworks AI.

What is the usable context window for Yi-Large?

The maximum usable context window is 32.8K tokens, as defined by Fireworks AI's platform configuration.

How many parameters does Yi-Large have?

Yi-Large is a dense model with 70 billion parameters.

Is fine-tuning supported for Yi-Large?

Yes. Fireworks supports LoRA-based fine-tuning for this model.

What rate limits apply on the shared endpoint?

On Fireworks, on-demand deployments have no rate limits. Serverless access is not supported for this model.

Metadata

State
Ready
Created on
6/26/2024
Kind
Base model
Provider
01.Ai

Specification

Calibrated
No
Mixture-of-Experts
No
Parameters
70B

Supported Functionality

Fine-tuning
Supported
Serverless
Not supported
Context Length
32.7k tokens
Function Calling
Not supported
Embeddings
Not supported
Rerankers
Not supported
Support image input
Not supported