GLM 5.2 is live! Opus-level intelligence at open-source rates. Pay per token on serverless. Try it today.

Model Library
/Meta/Llama 3 70B Instruct
Meta Mark

Llama 3 70B Instruct

Ready
model path:accounts/fireworks/models/llama-v3-70b-instruct

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks.

Llama 3 70B Instruct API Features

Fine-tuning

Docs

Llama 3 70B Instruct can be customized with your data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model

On-demand Deployment

Docs

On-demand deployments allow you to use Llama 3 70B Instruct on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits.

Llama 3 70B Instruct FAQs

What is Llama 3 70B Instruct and who developed it?

Llama 3 70B Instruct is an instruction-tuned large language model developed by Meta, part of the Llama 3 family released in April 2024. It is optimized for assistant-style dialogue and natural language generation tasks.

What applications and use cases does Llama 3 70B Instruct excel at?

Llama 3 70B Instruct excels at:

  • Chat-based assistants
  • Code generation (HumanEval: 81.7%)
  • Reasoning tasks (MMLU: 82.0%, GSM8K: 93.0%)
  • QA and comprehension (SQuAD, DROP, BoolQ)
What is the maximum context length for Llama 3 70B Instruct?

The model supports a context length of 8.2k tokens.

What is the usable context window for Llama 3 70B Instruct?

Usable context is up to 8.2k tokens, which matches the maximum context length.

Does Llama 3 70B Instruct support quantized formats (4-bit/8-bit)?

The model supports 48 quantized variants, confirming availability in 4-bit and 8-bit formats.

What are known failure modes of Llama 3 70B Instruct?

Despite extensive safety evaluations, the model may:

  • Produce inaccurate or biased outputs
  • Exhibit residual refusal or safety edge cases
  • Show English-language bias (training focus was English)
Does Llama 3 70B Instruct support function-calling schemas?

No, function calling is not supported.

How many parameters does Llama 3 70B Instruct have?

The model has 70.6 billion parameters.

Is fine-tuning supported for Llama 3 70B Instruct?

Llama 3 70B Instruct supports LoRA fine-tuning, full fine-tuning, and serverless LoRA.

How are tokens counted (prompt vs completion)?

Tokens are billed per 1M input/output tokens.

What rate limits apply on the shared endpoint?

On-demand deployment is supported with no rate limits.

What license governs commercial use of Llama 3 70B Instruct?

The model is released under the Llama 3 Community License, which allows commercial use. License details are hosted at llama.meta.com/license.

Metadata

State
Ready
Created on
4/18/2024
Kind
Base model
Provider
Meta

Specification

Calibrated
Yes
Mixture-of-Experts
No
Parameters
70.5B

Supported Functionality

Fine-tuning
Supported
Serverless
Not supported
Context Length
8.19k tokens
Function Calling
Not supported
Embeddings
Not supported
Rerankers
Not supported
Support image input
Not supported