

Llama 3 70B Instruct

fireworks/llama-v3-70b-instruct

    Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many available open-source chat models on common industry benchmarks.

    Llama 3 70B Instruct API Features

    Fine-tuning

    Docs

    Llama 3 70B Instruct can be customized with your data to improve responses. Fireworks uses LoRA to train and deploy your personalized model efficiently.
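    As a sketch, fine-tuning data for chat models like this one is typically supplied as JSONL, one conversation per line in a messages format. The example below is illustrative; see the linked docs for the exact schema Fireworks expects.

```python
import json

# Hypothetical training record in chat "messages" format
# (roles: system / user / assistant), one JSON object per JSONL line.
record = {
    "messages": [
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Open Settings > Account > Reset password."},
    ]
}

line = json.dumps(record)          # one line of the .jsonl training file
parsed = json.loads(line)          # round-trip check before uploading
roles = [m["role"] for m in parsed["messages"]]
print(roles)
```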

    On-demand Deployment

    Docs

    On-demand deployments give you dedicated GPUs for Llama 3 70B Instruct using Fireworks' reliable, high-performance system with no rate limits.
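    A minimal request sketch, assuming the OpenAI-compatible chat completions endpoint (`https://api.fireworks.ai/inference/v1/chat/completions`). The model id below is the serverless-style path shown on this page; an on-demand deployment's model path is account-specific.

```python
import json

# Illustrative request body for the chat completions endpoint.
# Authentication is a Bearer token in the Authorization header (not shown).
payload = {
    "model": "accounts/fireworks/models/llama-v3-70b-instruct",
    "messages": [
        {"role": "user", "content": "Summarize LoRA in one sentence."}
    ],
    "max_tokens": 256,
    "temperature": 0.2,
}

body = json.dumps(payload)  # what would be POSTed to the endpoint
print(payload["model"])
```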

    Llama 3 70B Instruct FAQs

    What is Llama 3 70B Instruct and who developed it?

    Llama 3 70B Instruct is an instruction-tuned large language model developed by Meta, part of the Llama 3 family released in April 2024. It is optimized for assistant-style dialogue and natural language generation tasks.

    What applications and use cases does Llama 3 70B Instruct excel at?

    Llama 3 70B Instruct excels at:

    • Chat-based assistants
    • Code generation (HumanEval: 81.7%)
    • Reasoning tasks (MMLU: 82.0%, GSM8K: 93.0%)
    • QA and comprehension (SQuAD, DROP, BoolQ)

    What is the maximum context length for Llama 3 70B Instruct?

    The model supports a maximum context length of 8,192 tokens (listed as 8.2k).

    What is the usable context window for Llama 3 70B Instruct?

    The usable context is the full 8,192 tokens (8.2k), matching the maximum context length.
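    Since the prompt and the completion share the same window, a simple budget check against the 8,192-token (8.2k) limit looks like this:

```python
# Prompt tokens + requested completion tokens must fit in the window.
CONTEXT_WINDOW = 8192

def max_completion_tokens(prompt_tokens: int, window: int = CONTEXT_WINDOW) -> int:
    """Largest max_tokens value that still fits in the context window."""
    return max(window - prompt_tokens, 0)

print(max_completion_tokens(6000))  # room left after a 6,000-token prompt
print(max_completion_tokens(9000))  # 0: the prompt alone already overflows
```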

    Does Llama 3 70B Instruct support quantized formats (4-bit/8-bit)?

    Yes. 48 quantized variants of the model are available, including 4-bit and 8-bit formats.

    What are known failure modes of Llama 3 70B Instruct?

    Despite extensive safety evaluations, the model may:

    • Produce inaccurate or biased outputs
    • Exhibit residual refusal or safety edge cases
    • Show English-language bias (training focus was English)

    Does Llama 3 70B Instruct support function-calling schemas?

    No, function calling is not supported.

    How many parameters does Llama 3 70B Instruct have?

    The model has 70.6 billion parameters.
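    For rough capacity planning, weights-only memory at common precisions can be estimated from the parameter count (activations, KV cache, and runtime overhead are extra):

```python
# Back-of-the-envelope weight memory for 70.6B parameters.
PARAMS = 70.6e9

def weight_gb(bits_per_param: float) -> float:
    """Weights-only footprint in decimal GB at the given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_gb(bits):.0f} GB")
```

    This is why the quantized 8-bit and 4-bit variants matter: they roughly halve or quarter the 16-bit footprint.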

    Is fine-tuning supported for Llama 3 70B Instruct?

    Llama 3 70B Instruct supports LoRA fine-tuning, full fine-tuning, and serverless LoRA.

    How are tokens counted (prompt vs completion)?

    Prompt (input) and completion (output) tokens are counted separately, and billing is per 1M input and output tokens.
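    A small helper for estimating a request's cost. The rates below are placeholders for illustration, not Fireworks' actual prices; check the pricing page for current numbers.

```python
# Hypothetical per-1M-token rates (placeholders only).
INPUT_RATE = 0.90   # $ per 1M input tokens
OUTPUT_RATE = 0.90  # $ per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the rates above."""
    return input_tokens / 1e6 * INPUT_RATE + output_tokens / 1e6 * OUTPUT_RATE

cost = request_cost(12_000, 800)
print(f"${cost:.4f}")
```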

    What rate limits apply on the shared endpoint?

    This model is not available on a shared serverless endpoint. On-demand deployments are supported and have no rate limits.

    What license governs commercial use of Llama 3 70B Instruct?

    The model is released under the Llama 3 Community License, which allows commercial use. License details are hosted at llama.meta.com/license.

    Metadata

    State
    Ready
    Created on
    4/18/2024
    Kind
    Base model
    Provider
    Meta
    Hugging Face
    Meta-Llama-3-70B-Instruct

    Specification

    Calibrated
    Yes
    Mixture-of-Experts
    No
    Parameters
    70.6B

    Supported Functionality

    Fine-tuning
    Supported
    Serverless
    Not supported
    Serverless LoRA
    Supported
    Context Length
    8.2k tokens
    Function Calling
    Not supported
    Embeddings
    Not supported
    Rerankers
    Not supported
    Support image input
    Not supported