Llama 3.3 70B Instruct is the December 2024 update of Llama 3.1 70B. It improves upon Llama 3.1 70B (released July 2024) with advances in tool calling, multilingual text support, math, and coding. The model achieves industry-leading results in reasoning, math, and instruction following, and delivers performance comparable to Llama 3.1 405B with significant speed and cost improvements.
| Capability | Details |
| --- | --- |
| Fine-tuning (Docs) | Llama 3.3 70B Instruct can be customized with your own data to improve responses. Fireworks uses LoRA to efficiently train and deploy your personalized model. |
| Serverless (Docs) | Run the model immediately on pre-configured GPUs and pay per token. |
| On-demand Deployment (Docs) | On-demand deployments give you dedicated GPUs for Llama 3.3 70B Instruct on Fireworks' reliable, high-performance infrastructure, with no rate limits. |
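Serverless access is exposed through an OpenAI-compatible chat-completions endpoint. The sketch below builds and (optionally) sends such a request; the endpoint URL and model slug follow Fireworks' usual conventions but should be verified against the current API docs.

```python
import json
import urllib.request

API_KEY = "YOUR_FIREWORKS_API_KEY"  # placeholder; set your real key

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    # OpenAI-compatible chat-completions payload; the model slug is
    # assumed from Fireworks' standard naming scheme.
    return {
        "model": "accounts/fireworks/models/llama-v3p3-70b-instruct",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send(payload: dict) -> dict:
    # POST to the serverless inference endpoint (assumed URL).
    req = urllib.request.Request(
        "https://api.fireworks.ai/inference/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("Summarize LoRA in one sentence.")
# response = send(payload)  # uncomment once a real API key is set
```

Because the endpoint is OpenAI-compatible, the official OpenAI SDK can be pointed at the same base URL instead of hand-rolling requests.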
Llama 3.3 70B Instruct is a multilingual, instruction-tuned large language model developed by Meta AI. It is the December 2024 update to Llama 3.1 70B, offering improvements in reasoning, tool use, math, code generation, and multilingual capabilities.
The model is optimized for:
- Multilingual dialogue and instruction following
- Reasoning and math
- Tool use
- Code generation
Fireworks supports a context length of 131,072 tokens.
The full 131,072-token context window is usable on Fireworks on-demand deployments.
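When filling a window this large, it helps to reserve part of the budget for the completion up front. A minimal sketch of that arithmetic, assuming the 131,072-token limit stated above:

```python
CONTEXT_LIMIT = 131072  # context length supported by Fireworks

def max_prompt_tokens(max_new_tokens: int) -> int:
    """Tokens left for the prompt after reserving the completion budget."""
    if not 0 < max_new_tokens < CONTEXT_LIMIT:
        raise ValueError("completion budget must fit inside the context window")
    return CONTEXT_LIMIT - max_new_tokens

# Reserving 4,096 tokens for output leaves 126,976 tokens for the prompt.
print(max_prompt_tokens(4096))
```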
Yes. The model can be served in quantized 4-bit and 8-bit formats.
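Quantization matters mainly for memory. A rough back-of-the-envelope sketch of the weight footprint at each precision, using the 70.6B parameter count from this page (weights only; KV cache and activations add more):

```python
PARAMS = 70.6e9  # parameter count from the model card

def weight_footprint_gb(bits_per_param: float) -> float:
    # bytes = params * bits / 8; "GB" here means 10**9 bytes
    return PARAMS * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_footprint_gb(bits):.1f} GB")
```

So 4-bit quantization cuts the weight memory to roughly a quarter of the 16-bit footprint (about 35 GB vs about 141 GB), which is what makes single-node serving of a 70B model practical.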
Meta outlines these limitations:
No, function calling is not currently supported for this model on Fireworks.
The model has 70.6 billion parameters.
Yes. Fireworks supports LoRA-based fine-tuning for this model.
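The efficiency of LoRA comes from training only small low-rank adapter matrices instead of all 70.6B weights. A simplified sketch of the parameter count, assuming Llama-class dimensions (hidden size 8192, 80 layers, four adapted projection matrices per layer; real Llama attention uses grouped-query projections with smaller key/value dims, so this overstates slightly):

```python
def lora_trainable_params(d_model: int, rank: int, n_layers: int,
                          adapted_mats: int = 4) -> int:
    # Each adapted weight matrix gains two low-rank factors:
    # A (d_model x rank) and B (rank x d_model).
    per_matrix = 2 * d_model * rank
    return per_matrix * adapted_mats * n_layers

FULL_PARAMS = 70_600_000_000        # total parameters, from the model card
lora = lora_trainable_params(d_model=8192, rank=16, n_layers=80)
print(lora, f"{lora / FULL_PARAMS:.4%} of the full model")
```

At rank 16 this comes to roughly 84M trainable parameters, about 0.12% of the full model, which is why LoRA adapters are cheap to train and swap at deployment time.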
The model is distributed under the Llama 3.3 Community License, a custom commercial license from Meta. Full license details are available here.