
Meta's Llama 3.2 models (1B, 3B, 11B, and 90B) are available now.


Meet Llama 3.2

Meta's Llama 3.2 models enable seamless integration of models, tools, and modalities for tasks like image reasoning and multimodal applications. Start building with the 1B, 3B, 11B, and 90B models today.


Featured Models

These models are deployed at industry-leading speeds and excel at production tasks.

Language Models

The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks. The 405B model is the most capable in the Llama 3.1 family. It is served in FP8, closely matching the reference implementation.
accounts/fireworks/models/llama-v3p1-405b-instruct · Serverless · Context: 131,072

The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
accounts/fireworks/models/llama-v3p1-70b-instruct · Serverless · Context: 131,072

The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
accounts/fireworks/models/llama-v3p1-8b-instruct · Serverless · Context: 131,072
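
The serverless text models above are reachable through Fireworks' OpenAI-compatible chat completions endpoint, addressed by the model IDs listed with each entry. A minimal sketch, assuming a FIREWORKS_API_KEY environment variable and the standard openai Python client; check the Fireworks documentation for current endpoint and parameter details:

```python
# Minimal sketch: call a serverless model by its ID through the
# OpenAI-compatible endpoint. Assumes FIREWORKS_API_KEY is set.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Summarize the Llama 3.1 family in two sentences."},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```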

Llama 3.2 3B Instruct is a lightweight, multilingual model from Meta. The model is designed for efficiency and offers substantial latency and cost improvements compared to larger models. Example use cases include query and prompt rewriting and writing assistance.
accounts/fireworks/models/llama-v3p2-3b-instruct · Serverless · Context: 131,072

Mixtral MoE 8x22B Instruct v0.1 is the instruction-tuned version of Mixtral MoE 8x22B v0.1 and has the chat completions API enabled.
accounts/fireworks/models/mixtral-8x22b-instruct · Serverless · Context: 65,536

Fireworks' latest and most performant function-calling model. Firefunction-v2 is based on Llama 3 and trained to excel at function calling as well as chat and instruction following. See the launch blog post for details: https://fireworks.ai/blog/firefunction-v2-launch-post
accounts/fireworks/models/firefunction-v2 · Serverless · Context: Unknown
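
Firefunction-v2 follows the OpenAI-style tools interface for function calling. A hedged sketch using the same client setup as above; the get_weather tool schema is a made-up example, not a real API:

```python
# Hedged sketch: OpenAI-style function calling with firefunction-v2.
# The get_weather tool below is hypothetical and exists only for illustration.
import json
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="accounts/fireworks/models/firefunction-v2",
    messages=[{"role": "user", "content": "What's the weather in Oslo right now?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    # The model returns the tool name and JSON arguments; your code runs the tool.
    print(call.function.name, json.loads(call.function.arguments))
else:
    print(message.content)
```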

Instruction-tuned image reasoning model from Meta with 11B parameters. Optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The model can understand visual data such as charts and graphs, bridging the gap between vision and language by generating text that describes image details.
accounts/fireworks/models/llama-v3p2-11b-vision-instruct · Serverless · Context: 131,072
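
Vision models such as Llama 3.2 11B Vision Instruct accept OpenAI-style multimodal messages that mix text and image parts. A minimal sketch, assuming the same client setup as above; the image URL is a placeholder:

```python
# Hedged sketch: image reasoning with Llama 3.2 11B Vision Instruct using
# OpenAI-style multimodal content. The chart URL is a placeholder.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key=os.environ["FIREWORKS_API_KEY"],
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p2-11b-vision-instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What trend does this chart show?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```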

A 75/25 merge of chronos-13b-v2 and Nous-Hermes-Llama2-13b. It offers the imaginative writing style of Chronos while retaining coherence and capability. Outputs are long, with polished prose.
accounts/fireworks/models/chronos-hermes-13b-v2 · Context: 4,096

CodeGemma is a collection of lightweight open code models built on top of Gemma. CodeGemma models are text-to-text and text-to-code decoder-only models, available as a 7 billion parameter pretrained variant that specializes in code completion and generation, a 7 billion parameter instruction-tuned variant for code chat and instruction following, and a 2 billion parameter pretrained variant for fast code completion.
accounts/fireworks/models/codegemma-2b · Context: 8,192

CodeGemma is a collection of lightweight open code models built on top of Gemma. CodeGemma models are text-to-text and text-to-code decoder-only models, available as a 7 billion parameter pretrained variant that specializes in code completion and generation, a 7 billion parameter instruction-tuned variant for code chat and instruction following, and a 2 billion parameter pretrained variant for fast code completion.
accounts/fireworks/models/codegemma-7b · Context: 8,192

Code Llama is a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters, specializing in using both code and natural language prompts to generate code and natural language about code. This is the base 13B version.
accounts/fireworks/models/code-llama-13b · Context: 32,768

Code Llama is a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters, specializing in using both code and natural language prompts to generate code and natural language about code. This is the 13B instruction-tuned version.
accounts/fireworks/models/code-llama-13b-instruct · Context: 32,768

Code Llama is a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters, specializing in using both code and natural language prompts to generate code and natural language about code. This is the 13B Python specialist version.
accounts/fireworks/models/code-llama-13b-python · Context: 32,768

Code Llama is a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters, specializing in using both code and natural language prompts to generate code and natural language about code. This is the 34B base version.
accounts/fireworks/models/code-llama-34b · Context: 32,768

Code Llama is a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters, specializing in using both code and natural language prompts to generate code and natural language about code. This is the 34B instruction-tuned version.
accounts/fireworks/models/code-llama-34b-instruct · Context: 32,768

Code Llama is a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters, specializing in using both code and natural language prompts to generate code and natural language about code. This is the 34B Python specialist version.
accounts/fireworks/models/code-llama-34b-python · Context: 32,768

Code Llama is a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters, specializing in using both code and natural language prompts to generate code and natural language about code. This is the 70B base version.
accounts/fireworks/models/code-llama-70b · Context: 16,384

Code Llama is a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters, specializing in using both code and natural language prompts to generate code and natural language about code. This is the 70B instruction-tuned version.
accounts/fireworks/models/code-llama-70b-instruct · Context: 4,096

Code Llama is a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters, specializing in using both code and natural language prompts to generate code and natural language about code. This is the 70B Python specialist version.
accounts/fireworks/models/code-llama-70b-python · Context: 4,096

Code Llama is a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters, specializing in using both code and natural language prompts to generate code and natural language about code. This is the base 7B version.
accounts/fireworks/models/code-llama-7b · Context: 32,768

...

Image Models

All currently deployed image models.

Image generation model, produced by stability.ai.
accounts/fireworks/models/stable-diffusion-xl-1024-v1-0 · Serverless
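
Serverless image models are called through an image generation endpoint rather than chat completions. The sketch below is illustrative only: the endpoint path, headers, and payload fields are assumptions modeled on a generic REST image API, so verify them against the Fireworks image generation documentation before relying on them:

```python
# Hedged sketch of a text-to-image request to the serverless SDXL model.
# Endpoint path and payload fields are assumptions; check the Fireworks docs.
import os

import requests

url = (
    "https://api.fireworks.ai/inference/v1/image_generation/"
    "accounts/fireworks/models/stable-diffusion-xl-1024-v1-0"
)
headers = {
    "Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}",
    "Content-Type": "application/json",
    "Accept": "image/jpeg",  # request raw image bytes in the response
}
payload = {
    "prompt": "a watercolor painting of a lighthouse at dusk",
    "height": 1024,
    "width": 1024,
}

resp = requests.post(url, headers=headers, json=payload, timeout=120)
resp.raise_for_status()
with open("lighthouse.jpg", "wb") as f:
    f.write(resp.content)
```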

The most capable text-to-image model produced by stability.ai, with greatly improved performance in multi-subject prompts, image quality, and spelling abilities. The Stable Diffusion 3 API is provided by Stability and the model is powered by Fireworks. Unlike other models on the Fireworks playground, you'll need a Stability API key to use this model. To use the API directly, visit https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post
accounts/stability/models/sd3 · Serverless

A 2 billion parameter SD3 model with optimized performance that excels in areas where previous models struggled, such as photorealism and typography. The Stable Diffusion 3 API is provided by Stability and the model is powered by Fireworks. Unlike other models on the Fireworks playground, you'll need a Stability API key to use this model. To use the API directly, visit https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post
accounts/stability/models/sd3-medium · Serverless
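
Because the SD3 endpoints above are fronted by Stability's own API, requests use a Stability API key and Stability's v2beta stable-image route linked in each entry. A hedged sketch; the multipart encoding and field names should be double-checked against Stability's API reference:

```python
# Hedged sketch: calling Stability's hosted SD3 endpoint directly with a
# Stability API key, as the entries above require.
import os

import requests

resp = requests.post(
    "https://api.stability.ai/v2beta/stable-image/generate/sd3",
    headers={
        "Authorization": f"Bearer {os.environ['STABILITY_API_KEY']}",
        "Accept": "image/*",  # return raw image bytes rather than base64 JSON
    },
    files={"none": ""},  # force multipart/form-data encoding
    data={
        "prompt": "an isometric illustration of a tiny workshop",
        "output_format": "png",
    },
    timeout=120,
)
resp.raise_for_status()
with open("sd3_output.png", "wb") as f:
    f.write(resp.content)
```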

Playground v2 is a diffusion-based text-to-image generative model. The model was trained from scratch by the research team at playground.com.
accounts/fireworks/models/playground-v2-1024px-aesthetic · Serverless

Playground v2.5 is a diffusion-based text-to-image generative model and a successor to Playground v2.
accounts/fireworks/models/playground-v2-5-1024px-aesthetic · Serverless

Image generation model. Distilled from Stable Diffusion XL 1.0 and 50% smaller.
accounts/fireworks/models/SSD-1B · Serverless

Japanese Stable Diffusion XL (JSDXL) is a Japanese-specific SDXL model that accepts prompts in Japanese and generates Japanese-style images.
accounts/fireworks/models/japanese-stable-diffusion-xl · Serverless

Distilled, few-step version of Stable Diffusion 3, the newest image generation model from Stability AI, which matches or outperforms state-of-the-art text-to-image systems such as DALL-E 3 and Midjourney v6 in typography and prompt adherence, based on human preference evaluations. Stability AI has partnered with Fireworks AI, the fastest and most reliable API platform in the market, to deliver Stable Diffusion 3 and Stable Diffusion 3 Turbo. To use the API directly, visit https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post
accounts/stability/models/sd3-turbo · Serverless