

Meta Llama 3.1 405B

One of the largest open-source models. It features a 128K context length, support for eight languages, and powerful tool-calling capabilities, and is available as a production-grade API, offering a more efficient and customizable alternative to GPT-4o.
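Fireworks serves these models behind an OpenAI-compatible chat completions API. The following is a minimal sketch of a single-turn request to the 405B model; the endpoint URL and the `FIREWORKS_API_KEY` environment variable are assumptions based on that convention, so check the official docs for the authoritative values.

```python
"""Hedged sketch: one chat completion against a Fireworks-hosted model.
The endpoint URL and env-var name are assumptions."""
import json
import os
import urllib.request

API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"  # assumed

def build_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Assemble the JSON body for a single-turn chat completion."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request(
    "accounts/fireworks/models/llama-v3p1-405b-instruct",
    "Summarize the Llama 3.1 release in one sentence.",
)

api_key = os.environ.get("FIREWORKS_API_KEY")
if api_key:  # only attempt the network call when a key is configured
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request body works for any model ID in this catalog; only the `model` field changes.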


Featured Models

These models are deployed at industry-leading speeds and optimized for production tasks.

Language Models

The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
accounts/fireworks/models/llama-v3p1-405b-instruct | Serverless | Context: 131,072

The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
accounts/fireworks/models/llama-v3p1-70b-instruct | Serverless | Context: 131,072

The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
accounts/fireworks/models/llama-v3p1-8b-instruct | Serverless | Context: 131,072

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.
accounts/fireworks/models/llama-v3-70b-instruct | Serverless | Context: 8,192

Mistral MoE 8x22B Instruct v0.1 model with a sparse mixture-of-experts architecture, fine-tuned for instruction following.
accounts/fireworks/models/mixtral-8x22b-instruct | Serverless | Context: 65,536

Yi-Large is among the top LLMs, with performance on the LMSYS benchmark leaderboard closely trailing GPT-4, Gemini 1.5 Pro, and Claude 3 Opus. It excels in multilingual tasks, especially Spanish, Chinese, Japanese, German, and French. Yi-Large is user-friendly, sharing the same API definition as OpenAI for easy integration.
accounts/fireworks/models/yi-large | Context: 32,768
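Because the model advertises the same API definition as OpenAI, the official `openai` Python client can target it by overriding `base_url`. This is a hedged sketch; the Fireworks base URL and the `FIREWORKS_API_KEY` variable are assumptions, not documented values from this page.

```python
"""Sketch: pointing the official `openai` client at an
OpenAI-compatible host. Base URL and env-var name are assumptions."""
import os

def client_kwargs() -> dict:
    """Connection settings reused for any OpenAI-compatible host."""
    return {
        "base_url": "https://api.fireworks.ai/inference/v1",  # assumed
        "api_key": os.environ.get("FIREWORKS_API_KEY", "unset"),
    }

if os.environ.get("FIREWORKS_API_KEY"):
    from openai import OpenAI  # pip install openai

    client = OpenAI(**client_kwargs())
    resp = client.chat.completions.create(
        model="accounts/fireworks/models/yi-large",
        messages=[{"role": "user", "content": "Say hello in Japanese."}],
    )
    print(resp.choices[0].message.content)
```

The practical upside of API compatibility is that existing OpenAI-based code can switch providers by changing only the base URL, key, and model name.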

Mistral MoE 8x7B Instruct v0.1 model with a sparse mixture-of-experts architecture, fine-tuned for instruction following.
accounts/fireworks/models/mixtral-8x7b-instruct | Serverless | Context: 32,768

Fireworks' latest and most performant function-calling model. Firefunction-v2 is based on Llama 3 and trained to excel at function calling as well as chat and instruction following. See the blog post for details: https://fireworks.ai/blog/firefunction-v2-launch-post
accounts/fireworks/models/firefunction-v2 | Serverless | Context: 8,192
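Function calling with firefunction-v2 follows the OpenAI-style `tools` convention: the request declares callable functions as JSON Schema, and the model may respond with a structured tool call instead of plain text. The sketch below only builds such a request; `get_weather` is a hypothetical function, and the exact request shape should be checked against the linked blog post.

```python
"""Hedged sketch of an OpenAI-style function-calling request body for
firefunction-v2. The get_weather tool is hypothetical."""

def weather_tool() -> dict:
    """Declare a hypothetical get_weather function as JSON Schema."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }

request_body = {
    "model": "accounts/fireworks/models/firefunction-v2",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [weather_tool()],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
```

With `tool_choice` set to `"auto"`, the model picks between answering directly and emitting a `tool_calls` entry that your application then executes.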

Vision-language model that accepts both image and text inputs (a single image is recommended), trained on data generated by OSS models and open-sourced on Hugging Face at fireworks-ai/FireLLaVA-13b.
accounts/fireworks/models/firellava-13b | Serverless | Context: 4,096

A 75/25 merge of chronos-13b-v2 and Nous-Hermes-Llama2-13b. It offers the imaginative writing style of Chronos while retaining coherence and capability; outputs are long and feature exceptional prose.
accounts/fireworks/models/chronos-hermes-13b-v2 | Context: 4,096

CodeGemma is a collection of lightweight open code models built on top of Gemma. CodeGemma models are text-to-text and text-to-code decoder-only models, available as a 7-billion-parameter pretrained variant that specializes in code completion and code generation, a 7-billion-parameter instruction-tuned variant for code chat and instruction following, and a 2-billion-parameter pretrained variant for fast code completion. This is the 2B variant.
accounts/fireworks/models/codegemma-2b | Context: 8,192

CodeGemma is a collection of lightweight open code models built on top of Gemma. CodeGemma models are text-to-text and text-to-code decoder-only models, available as a 7-billion-parameter pretrained variant that specializes in code completion and code generation, a 7-billion-parameter instruction-tuned variant for code chat and instruction following, and a 2-billion-parameter pretrained variant for fast code completion. This is the 7B variant.
accounts/fireworks/models/codegemma-7b | Context: 8,192

Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the base 13B version, designed for general code synthesis and understanding.
accounts/fireworks/models/code-llama-13b | Context: 32,768

Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the 13B instruct-tuned version, designed for general code synthesis and understanding.
accounts/fireworks/models/code-llama-13b-instruct | Context: 32,768

Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the 13B Python specialist version, designed for general code synthesis and understanding.
accounts/fireworks/models/code-llama-13b-python | Context: 32,768

Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the base 34B version, designed for general code synthesis and understanding.
accounts/fireworks/models/code-llama-34b | Context: 32,768

Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the 34B instruct-tuned version, designed for general code synthesis and understanding.
accounts/fireworks/models/code-llama-34b-instruct | Context: 32,768

Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the 34B Python specialist version, designed for general code synthesis and understanding.
accounts/fireworks/models/code-llama-34b-python | Context: 32,768

Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the base 70B version, designed for general code synthesis and understanding.
accounts/fireworks/models/code-llama-70b | Context: 16,384

Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the 70B instruct-tuned version, designed for general code synthesis and understanding.
accounts/fireworks/models/code-llama-70b-instruct | Context: 4,096

Image Models

All currently deployed image models.

Image generation model, produced by stability.ai.
accounts/fireworks/models/stable-diffusion-xl-1024-v1-0 | Serverless
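Image models are served over a REST endpoint rather than the chat API. The sketch below builds a text-to-image request for the serverless SDXL deployment; the endpoint path, payload field names, and `Accept` header are all assumptions modeled on Fireworks' REST style, so verify them against the official image-generation docs before use.

```python
"""Hedged sketch of a text-to-image request for serverless SDXL.
Endpoint path, payload fields, and headers are assumptions."""
import json
import os
import urllib.request

MODEL = "accounts/fireworks/models/stable-diffusion-xl-1024-v1-0"
API_URL = f"https://api.fireworks.ai/inference/v1/image_generation/{MODEL}"  # assumed

def build_image_request(prompt: str, steps: int = 30) -> dict:
    """JSON body for one 1024x1024 generation (field names assumed)."""
    return {"prompt": prompt, "steps": steps, "height": 1024, "width": 1024}

body = build_image_request("a lighthouse at dusk, oil painting")

api_key = os.environ.get("FIREWORKS_API_KEY")
if api_key:  # only call out when a key is configured
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "image/jpeg",  # request raw image bytes (assumed)
        },
    )
    with urllib.request.urlopen(req) as resp:
        with open("out.jpg", "wb") as f:
            f.write(resp.read())
```

Note that the Stability-branded SD3 entries below route through Stability's own API and key, so this sketch applies only to the Fireworks-hosted image models.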

The most capable text-to-image model produced by stability.ai, with greatly improved performance in multi-subject prompts, image quality, and spelling abilities. The Stable Diffusion 3 API is provided by Stability and the model is powered by Fireworks. Unlike other models on the Fireworks playground, you'll need a Stability API key to use this model. To use the API directly, visit https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post
accounts/stability/models/sd3 | Serverless

A 2-billion-parameter SD3 model with optimized performance that excels in areas where previous models struggled, such as photorealism and typography. The Stable Diffusion 3 API is provided by Stability and the model is powered by Fireworks. Unlike other models on the Fireworks playground, you'll need a Stability API key to use this model. To use the API directly, visit https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post
accounts/stability/models/sd3-medium | Serverless

Playground v2 is a diffusion-based text-to-image generative model. The model was trained from scratch by the research team at playground.com.
accounts/fireworks/models/playground-v2-1024px-aesthetic | Serverless

Playground v2.5 is a diffusion-based text-to-image generative model, and a successor to Playground v2.
accounts/fireworks/models/playground-v2-5-1024px-aesthetic | Serverless

Image generation model distilled from Stable Diffusion XL 1.0, and 50% smaller.
accounts/fireworks/models/SSD-1B | Serverless

Japanese Stable Diffusion XL (JSDXL) is a Japanese-specific SDXL model that accepts prompts in Japanese and generates Japanese-style images.
accounts/fireworks/models/japanese-stable-diffusion-xl | Serverless

A distilled, few-step version of Stable Diffusion 3, the newest image generation model from Stability AI, which equals or outperforms state-of-the-art text-to-image systems such as DALL-E 3 and Midjourney v6 in typography and prompt adherence, based on human preference evaluations. Stability AI has partnered with Fireworks AI, the fastest and most reliable API platform in the market, to deliver Stable Diffusion 3 and Stable Diffusion 3 Turbo. To use the API directly, visit https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post
accounts/stability/models/sd3-turbo | Serverless