Introducing Llama 3.1 inference endpoints in partnership with Meta
By Chenyu Zhao |7/23/2024
Meta's Llama 3.2 models—1B, 3B, 11B, and 90B - available now. Read more
By Chenyu Zhao |7/23/2024
We’re thrilled to introduce Llama 3.1 inference endpoints in partnership with Meta. With expanded context length, multilingual support, tool calling, and the inclusion of Llama 3.1 405B, Llama 3.1 represents a significant leap forward in AI capabilities. Fireworks is proud to be a launch partner, offering AI developers immediate access to Llama 3.1 for production use from day one. Llama 3.1 is available on Fireworks AI inference engine and optimized for performance, which means you benefit from the lowest latency and most-efficient deployment.
List of Llama 3.1 models available on Fireworks
Our mission is to provide the fastest and most efficient inference platform and tools for building compound AI systems, equipping developers with the essential building blocks to create custom, production-ready AI applications. Fireworks is at the forefront of the rapid shift towards compound AI systems, which integrate multiple models and tools to enhance performance, reliability, and control. With the addition of Llama 3.1 to our portfolio of over 100 state-of-the-art models, and its new tool-calling capabilities, we are advancing this mission even further.
This week, we are also introducing AMD Instinct MI300 accelerators alongside NVIDIA H100 to power serverless inference for Llama 3.1 405B Instruct.
To quickly get up and running using Llama 3.1 on the Fireworks AI visit fireworks.ai to sign up for an account. Pickup the API Key from Profile on top right -> API Keys.
pip install
--
upgrade fireworks-ai
Below code snippet instantiates Fireworks
client and uses chat completions API to call the Llama 3.1 listed at - accounts/fireworks/models/llama-v3p1-405b-instruct
.
The above API request results in the below response.
At Fireworks AI, we believe that openness leads to better, safer products, faster innovation, and a healthier market. We are dedicated to the responsible release of models with our partners and continuously work with them on developing tools to ensure safety and security in AI applications.
We’re excited to see how the community leverages Fireworks to create groundbreaking applications with Llama 3.1. For inference pricing and deployment options, visit fireworks.ai/pricing.
For more information and to get started with Llama 3.1, visit fireworks.ai.