The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
| Feature | Description |
| --- | --- |
| Fine-tuning (Docs) | Llama 3.1 8B Instruct can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
| On-demand deployment (Docs) | On-demand deployments give you dedicated GPUs for Llama 3.1 8B Instruct using Fireworks' reliable, high-performance system with no rate limits. |
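For reference, a minimal request sketch is shown below. It assumes Fireworks' OpenAI-compatible chat completions endpoint, a `FIREWORKS_API_KEY` environment variable, and the model identifier `accounts/fireworks/models/llama-v3p1-8b-instruct`; check the model page for the exact slug before relying on it.

```python
import os
import requests

# Hedged sketch: Fireworks exposes an OpenAI-compatible chat completions
# endpoint; the model identifier below follows Fireworks' usual slug format
# and should be verified against the model page before use.
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
MODEL_ID = "accounts/fireworks/models/llama-v3p1-8b-instruct"  # assumed slug

payload = {
    "model": MODEL_ID,
    "messages": [
        {"role": "system", "content": "You are a concise multilingual assistant."},
        {"role": "user", "content": "Summarize the benefits of LoRA fine-tuning in two sentences."},
    ],
    "max_tokens": 256,
    "temperature": 0.6,
}

response = requests.post(
    API_URL,
    headers={
        "Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}",  # assumed env var
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```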
Llama 3.1 8B Instruct is a multilingual, instruction-tuned large language model developed by Meta. It is part of the Llama 3.1 family, which includes models at 8B, 70B, and 405B parameter scales. The 8B Instruct variant is optimized for assistant-style chat and dialogue use cases across multiple languages.
Llama 3.1 8B Instruct is designed for multilingual, assistant-style dialogue and achieves strong results across benchmarks in reasoning, code, math, and multilingual understanding.
The model supports a maximum context length of 128k tokens.
Known limitations include a fixed training-data cutoff (December 2023 for the Llama 3.1 family) and, as with any large language model, the potential to produce inaccurate or biased outputs.
The model supports tool-use schemas and function calling via structured prompts and chat templates.
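A hedged sketch of a function-calling request follows. The `tools` field uses the common OpenAI-style JSON-schema format; the `get_weather` function is purely illustrative, and the endpoint and model slug are the same assumptions as in the earlier example.

```python
import os
import requests

# Hedged sketch of OpenAI-style function calling against Fireworks' chat
# completions endpoint; the tool itself ("get_weather") is hypothetical and
# the model slug is assumed, as in the earlier example.
payload = {
    "model": "accounts/fireworks/models/llama-v3p1-8b-instruct",
    "messages": [{"role": "user", "content": "What's the weather in Lisbon right now?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical function for illustration
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string", "description": "City name"}},
                "required": ["city"],
            },
        },
    }],
}

resp = requests.post(
    "https://api.fireworks.ai/inference/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}"},  # assumed env var
    json=payload,
    timeout=60,
)
resp.raise_for_status()
message = resp.json()["choices"][0]["message"]
# When the model decides to call the tool, "tool_calls" carries the function
# name and JSON-encoded arguments for your application to execute.
print(message.get("tool_calls") or message["content"])
```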
Llama 3.1 8B Instruct has 8.03 billion parameters.
Fireworks supports fine-tuning Llama 3.1 8B Instruct using LoRA adapters on serverless or on-demand infrastructure.
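For intuition about what LoRA-based fine-tuning involves, here is an illustrative adapter setup using Hugging Face `peft` and `transformers`. This is not Fireworks' managed pipeline; the base-model repository name, rank, and target modules are assumptions made for the sketch.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative LoRA setup with Hugging Face PEFT. Fireworks' managed
# fine-tuning handles this for you, so treat the repository name, rank, and
# target modules below as assumptions, not Fireworks' actual configuration.
base_id = "meta-llama/Llama-3.1-8B-Instruct"  # assumed Hugging Face repo name
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

lora_config = LoraConfig(
    r=16,                       # adapter rank (assumed)
    lora_alpha=32,              # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trained
```

Because only the low-rank adapter matrices are trained, the resulting artifact is small and can be swapped onto the base model at deployment time.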
The model is governed by the Llama 3.1 Community License, which allows commercial and research use.