Llama 3.1 Nemotron 70B

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM-generated responses to user queries. It was trained with RLHF, starting from Llama-3.1-70B-Instruct. As of 1 Oct 2024, this model ranks #1 on three automatic alignment benchmarks, Arena Hard, AlpacaEval 2 LC (verified tab), and GPT-4-Turbo MT-Bench, edging out strong frontier models such as GPT-4o and Claude 3.5 Sonnet.

Fireworks Features

Fine-tuning

Llama 3.1 Nemotron 70B can be fine-tuned on your own data to improve its responses. Fireworks uses LoRA to train and deploy your customized model efficiently.

On-demand Deployment

On-demand deployments give you dedicated GPUs for Llama 3.1 Nemotron 70B using Fireworks' reliable, high-performance system with no rate limits.

Info

Model Type: LLM

Context Length: 131,072 tokens

Fine-Tuning: Available

Pricing Per 1M Tokens: $0.90
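The listed price and context length lend themselves to a quick back-of-the-envelope check. The sketch below is a minimal illustration, not part of any Fireworks SDK: the helper names are hypothetical, it assumes the $0.90 rate applies uniformly to every token billed (the listing does not distinguish input from output tokens), and it assumes prompt and generated tokens share the same 131,072-token window.

```python
# Values taken from the listing above.
PRICE_PER_MILLION_TOKENS_USD = 0.90
CONTEXT_LENGTH_TOKENS = 131_072


def estimate_cost_usd(total_tokens: int,
                      price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Estimate cost in USD for a token count.

    Assumption: one flat per-token rate for all billed tokens.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


def fits_in_context(prompt_tokens: int,
                    max_output_tokens: int,
                    context_length: int = CONTEXT_LENGTH_TOKENS) -> bool:
    """Check whether prompt plus requested output fits in the window.

    Assumption: prompt and generated tokens count against the same limit.
    """
    return prompt_tokens + max_output_tokens <= context_length


if __name__ == "__main__":
    # Cost of a single request that fills the entire context window.
    print(f"${estimate_cost_usd(CONTEXT_LENGTH_TOKENS):.4f}")
    print(fits_in_context(prompt_tokens=120_000, max_output_tokens=4_096))
```

Under these assumptions, a request that uses the full context window costs roughly $0.118, and one million tokens cost $0.90.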