The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many available open-source and closed chat models on common industry benchmarks.
| Capability | Details |
| --- | --- |
| Fine-tuning (Docs) | Llama 3.1 70B Instruct can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
| On-demand deployment (Docs) | On-demand deployments give you dedicated GPUs for Llama 3.1 70B Instruct using Fireworks' reliable, high-performance system with no rate limits. |
Llama 3.1 70B Instruct is a multilingual instruction-tuned large language model developed by Meta AI. It is part of the Llama 3.1 family, which includes 8B, 70B, and 405B models. The 70B Instruct variant is fine-tuned with supervised data and RLHF for assistant-like use cases.
The model is optimized for multilingual dialogue and assistant-style use cases, with performance benchmarked across 8 languages.
Fireworks supports the model's full context length of 131,072 tokens; the entire window is usable on Fireworks AI's infrastructure.
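As a minimal sketch of querying the model, the payload below targets Fireworks' OpenAI-compatible chat completions API. The endpoint URL and model identifier shown here are assumptions for illustration; confirm both against the Fireworks documentation before use.

```python
import json

# Assumed values based on Fireworks' OpenAI-compatible API; verify in the docs.
FIREWORKS_CHAT_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
MODEL_ID = "accounts/fireworks/models/llama-v3p1-70b-instruct"  # assumed model id

def build_chat_request(user_message: str, max_tokens: int = 512) -> dict:
    """Build a chat-completions payload for Llama 3.1 70B Instruct."""
    return {
        "model": MODEL_ID,
        "max_tokens": max_tokens,
        "messages": [
            {"role": "system", "content": "You are a helpful multilingual assistant."},
            {"role": "user", "content": user_message},
        ],
    }

payload = build_chat_request("Summarize the Llama 3.1 model family.")
print(json.dumps(payload, indent=2))
# To send: POST this payload to FIREWORKS_CHAT_URL with an
# "Authorization: Bearer <your API key>" header.
```

Because the API is OpenAI-compatible, the same payload also works through standard OpenAI client libraries pointed at the Fireworks base URL.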
The model supports 4-bit and 8-bit quantized formats.
Meta reports residual risks in adversarial settings; refer to Meta's Responsible Use Guide and red-teaming reports for further details.
Function calling is supported for this model.
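Function calling uses the OpenAI-style `tools` schema in the request body. The sketch below builds such a request; the `get_weather` tool is a hypothetical example, and the model identifier is an assumption to be checked against the Fireworks docs.

```python
import json

# Hypothetical tool definition in the OpenAI-compatible "tools" format.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative tool, not a real API
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

request = {
    "model": "accounts/fireworks/models/llama-v3p1-70b-instruct",  # assumed id
    "messages": [{"role": "user", "content": "What's the weather in Lisbon?"}],
    "tools": [weather_tool],
}

# The request must serialize cleanly to JSON before being POSTed.
body = json.dumps(request)
print(len(body) > 0)
```

When the model decides to call a tool, the response contains a `tool_calls` entry with the function name and JSON-encoded arguments, which your application executes and feeds back as a `tool` message.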
The model has 70.6 billion parameters.
Fireworks supports LoRA-based fine-tuning for this model.
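A fine-tuning job starts from a training dataset. The sketch below assumes a JSONL format where each line is a `{"messages": [...]}` conversation, which is a common convention for chat-model fine-tuning; confirm the exact dataset schema in the Fireworks fine-tuning docs.

```python
import json
import os
import tempfile

# Assumed JSONL schema: one {"messages": [...]} conversation per line.
examples = [
    {"messages": [
        {"role": "user", "content": "Translate 'hello' to French."},
        {"role": "assistant", "content": "Bonjour."},
    ]},
]

path = os.path.join(tempfile.gettempdir(), "train.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for ex in examples:
        # Each line must be a standalone JSON object (no trailing commas,
        # no multi-line pretty-printing).
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")

# Sanity check: every line round-trips as valid JSON.
with open(path, encoding="utf-8") as f:
    lines = [json.loads(line) for line in f]
print(len(lines))
```

Once uploaded, the dataset is referenced when creating the LoRA fine-tuning job, and the resulting adapter can be deployed alongside the base 70B Instruct model.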
The model is governed by the Llama 3.1 Community License, a custom commercial license available via Meta’s GitHub.