Stable Diffusion 3
Fireworks has partnered with Stability AI to provide blazing-fast image generation using SD3, the latest and most advanced generative image model yet.
Featured Models
These models are deployed at industry-leading speeds to excel at production tasks.
Image Models
All currently deployed image models.
Model Name | Model Description | Model ID
---|---|---
Stable Diffusion 3 (Serverless) | The most capable text-to-image model produced by Stability AI, with greatly improved performance in multi-subject prompts, image quality, and spelling. The Stable Diffusion 3 API is provided by Stability and the model is powered by Fireworks. Unlike other models on the Fireworks playground, you'll need a Stability API key to use this model. To use the API directly, see https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post | accounts/stability/models/sd3
Stable Diffusion XL (Serverless) | Image generation model produced by Stability AI. | accounts/fireworks/models/stable-diffusion-xl-1024-v1-0
Playground v2 1024 (Serverless) | Playground v2 is a diffusion-based text-to-image generative model, trained from scratch by the research team at playground.com. | accounts/fireworks/models/playground-v2-1024px-aesthetic
Playground v2.5 1024 (Serverless) | Playground v2.5 is a diffusion-based text-to-image generative model and the successor to Playground v2. | accounts/fireworks/models/playground-v2-5-1024px-aesthetic
Segmind Stable Diffusion 1B (SSD-1B) (Serverless) | Image generation model distilled from Stable Diffusion XL 1.0 and 50% smaller. | accounts/fireworks/models/SSD-1B
Japanese Stable Diffusion XL (Serverless) | Japanese Stable Diffusion XL (JSDXL) is a Japanese-specific SDXL model that accepts prompts in Japanese and generates Japanese-style images. | accounts/fireworks/models/japanese-stable-diffusion-xl
Stable Diffusion 3 Turbo (Serverless) | A distilled, few-step version of Stable Diffusion 3, the newest image generation model from Stability AI, which equals or outperforms state-of-the-art text-to-image systems such as DALL-E 3 and Midjourney v6 in typography and prompt adherence, based on human preference evaluations. Stability AI has partnered with Fireworks AI, the fastest and most reliable API platform in the market, to deliver Stable Diffusion 3 and Stable Diffusion 3 Turbo. To use the API directly, see https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post | accounts/stability/models/sd3-turbo
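As a rough sketch of what a direct call to the Stability endpoint linked above might look like: the snippet below builds the request headers and form fields, then posts them as multipart form data. Field names (`prompt`, `model`, `output_format`) and the `Bearer` auth scheme follow Stability's v2beta API reference, but treat them as assumptions and verify against the linked docs before relying on them.

```python
# Sketch of calling the Stability SD3 endpoint referenced above.
# Field names are assumptions based on Stability's v2beta API reference.
import os

API_URL = "https://api.stability.ai/v2beta/stable-image/generate/sd3"

def build_request(prompt: str, model: str = "sd3", output_format: str = "png"):
    """Assemble headers and form fields for one generation call."""
    headers = {
        # A Stability API key is required (see the note in the table above).
        "authorization": f"Bearer {os.environ.get('STABILITY_API_KEY', '')}",
        "accept": "image/*",  # ask for raw image bytes in the response
    }
    data = {"prompt": prompt, "model": model, "output_format": output_format}
    return headers, data

if __name__ == "__main__":
    import requests  # third-party; pip install requests

    headers, data = build_request("a lighthouse at dawn", model="sd3-turbo")
    # files={"none": ""} forces requests to encode multipart/form-data,
    # which this endpoint expects even for text-only generation.
    resp = requests.post(API_URL, headers=headers, files={"none": ""}, data=data)
    resp.raise_for_status()
    with open("lighthouse.png", "wb") as f:
        f.write(resp.content)
```

The same request shape should work for `sd3` and `sd3-turbo` by switching the `model` field.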
Language Models
Serverless models are hosted by Fireworks — no need to configure hardware or deploy models. Usage is billed per token.
Model Name | Model Description | Model ID | Context
---|---|---|---
FireLLaVA-13B (Serverless) | Vision-language model accepting both image and text inputs (a single image is recommended), trained on OSS-model-generated training data and open-sourced on Hugging Face at fireworks-ai/FireLLaVA-13b. | accounts/fireworks/models/firellava-13b | 4,096
FireFunction V1 (Serverless) | Fireworks' open-source function calling model. | accounts/fireworks/models/firefunction-v1 | 32,768
Mixtral MoE 8x7B Instruct (Serverless) | Mistral MoE 8x7B Instruct v0.1 model with Sparse Mixture of Experts, fine-tuned for instruction following. | accounts/fireworks/models/mixtral-8x7b-instruct | 32,768
Mixtral MoE 8x22B Instruct (Serverless) | Mistral MoE 8x22B Instruct v0.1 model with Sparse Mixture of Experts, fine-tuned for instruction following. | accounts/fireworks/models/mixtral-8x22b-instruct | 65,536
Llama 3 70B Instruct (Serverless) | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. | accounts/fireworks/models/llama-v3-70b-instruct | 8,192
Bleat (Serverless) | Bleat enables function calling in LLaMA 2 in a similar fashion to OpenAI's implementation for ChatGPT. | accounts/fireworks/models/bleat-adapter | 4,096
Chinese Llama 2 LoRA 7B (Serverless) | The LoRA version of Chinese-Llama-2, based on Llama-2-7b-hf. | accounts/fireworks/models/chinese-llama-2-lora-7b | 4,096
DBRX Instruct (Serverless) | DBRX Instruct is a mixture-of-experts (MoE) large language model trained from scratch by Databricks, specializing in few-turn interactions. DBRX is hosted as an experimental model; Fireworks only guarantees that it will be hosted serverless through April 2024, and future serverless availability will depend on overall usage. | accounts/fireworks/models/dbrx-instruct | 32,768
Gemma 7B Instruct (Serverless) | Gemma 7B Instruct from Google. Gemma is provided under and subject to the Gemma Terms of Use found at ai.google.dev/gemma/terms | accounts/fireworks/models/gemma-7b-it | 8,192
Hermes 2 Pro Mistral 7B (Serverless) | The latest version of Nous Research's Hermes series of models, using an updated and cleaned version of the Hermes 2 dataset and now trained on a diverse and rich set of function-calling and JSON-mode samples. | accounts/fireworks/models/hermes-2-pro-mistral-7b | Unknown
Japanese StableLM Instruct Beta 70B (Serverless) | japanese-stablelm-instruct-beta-70b is a 70B-parameter decoder-only language model based on japanese-stablelm-base-beta-70b, further fine-tuned on Databricks Dolly-15k, Anthropic HH, and other public data. | accounts/stability/models/japanese-stablelm-instruct-beta-70b | Unknown
Japanese Stable LM Instruct Gamma 7B (Serverless) | A 7B-parameter decoder-only Japanese language model fine-tuned on instruction-following datasets, built on top of the base model Japanese Stable LM Base Gamma 7B. | accounts/stability/models/japanese-stablelm-instruct-gamma-7b | Unknown
Llama 2 13B French (Serverless) | meta-llama/Llama-2-13b-chat-hf fine-tuned to answer French questions in French. | accounts/fireworks/models/llama-2-13b-fp16-french | 4,096
Llama2 13B Guanaco QLoRA GGML (Serverless) | This chatbot model was built via parameter-efficient QLoRA fine-tuning of llama-2-13b on all 9.85k rows of timdettmers/openassistant-guanaco (a subset of OpenAssistant/oasst1 containing the highest-rated conversation paths). Fine-tuning was executed on a single A6000 (48 GB) for roughly 3.7 hours on the Lambda Labs platform. | accounts/fireworks/models/llama-2-13b-guanaco-peft | 4,096
Llama 7B Summarize (Serverless) | Summarizes articles and conversations. | accounts/fireworks/models/llama2-7b-summarize | 4,096
Llama Guard v2 8B (Serverless) | Meta Llama Guard 2 is an 8B-parameter Llama 3-based LLM safeguard model. Like Llama Guard, it can classify content in both LLM inputs (prompt classification) and LLM responses (response classification). It acts as an LLM: it generates text indicating whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated. | accounts/fireworks/models/llama-guard-2-8b | 8,192
Llama 2 13B (Serverless) | A 13B-parameter Llama 2 model, trained on 2 trillion tokens with a context length of 4,096. | accounts/fireworks/models/llama-v2-13b | 4,096
Llama 2 13B Chat (Serverless) | A fine-tuned version of Llama 2 13B, optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF); it performs comparably to ChatGPT according to human evaluations. | accounts/fireworks/models/llama-v2-13b-chat | 4,096
Llama 2 13B Code (Serverless) | Code Llama is a family of large language models for code based on Llama 2, providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction-following ability for programming tasks. | accounts/fireworks/models/llama-v2-13b-code | 4,096
Llama 2 13B Code Instruct (Serverless) | The 13B-parameter Code Llama instruct model, fine-tuned for understanding natural language instructions. | accounts/fireworks/models/llama-v2-13b-code-instruct | 4,096
Llama 2 34B Code (Serverless) | Code Llama is a family of large language models for code based on Llama 2, providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction-following ability for programming tasks. | accounts/fireworks/models/llama-v2-34b-code | 4,096
Llama 2 34B Code Instruct (Serverless) | The 34B-parameter Code Llama instruct model, fine-tuned for understanding natural language instructions. | accounts/fireworks/models/llama-v2-34b-code-instruct | 4,096
Llama 2 70B Chat (Serverless) | A fine-tuned version of Llama 2 70B, optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF); it performs comparably to ChatGPT according to human evaluations. | accounts/fireworks/models/llama-v2-70b-chat | 4,096
Llama 2 70B Code Llama Instruct (Serverless) | An instruction-tuned version of Code Llama 70B, optimized for code generation. | accounts/fireworks/models/llama-v2-70b-code-instruct | 4,096
Llama 2 7B (Serverless) | A 7B-parameter Llama 2 model, trained on 2 trillion tokens with a context length of 4,096. | accounts/fireworks/models/llama-v2-7b | 4,096
Llama 2 7B Chat (Serverless) | A fine-tuned version of Llama 2 7B, optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF); it performs comparably to ChatGPT according to human evaluations. | accounts/fireworks/models/llama-v2-7b-chat | 4,096
Llama 3 70B Instruct (HF version) (Serverless) | Llama 3 70B Instruct (Hugging Face version). | accounts/fireworks/models/llama-v3-70b-instruct-hf | 8,192
Llama 3 8B Instruct (Serverless) | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. | accounts/fireworks/models/llama-v3-8b-instruct | 8,192
Llama 3 8B Instruct (HF version) (Serverless) | Llama 3 8B Instruct (Hugging Face version). | accounts/fireworks/models/llama-v3-8b-instruct-hf | 8,192
LLaVA V1.6 Yi 34B (Serverless) | Vision-language model LLaVA 1.6, accepting both image and text inputs. | accounts/fireworks/models/llava-yi-34b | 4,096
Mistral 7B (Serverless) | The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. | accounts/fireworks/models/mistral-7b | 32,768
Mistral 7B Instruct (Serverless) | The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.1 generative text model, trained using a variety of publicly available conversation datasets. | accounts/fireworks/models/mistral-7b-instruct-4k | 32,768
Mistral 7B Instruct v0.2 (Serverless) | The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1. | accounts/fireworks/models/mistral-7b-instruct-v0p2 | 32,768
Mixtral MoE 8x22B (Serverless) | Mixtral 8x22B base model that matches Hugging Face numerics. | accounts/fireworks/models/mixtral-8x22b-hf | 65,536
Mixtral MoE 8x22B Instruct (HF version) (Serverless) | Mixtral 8x22B Instruct that matches Hugging Face numerics. | accounts/fireworks/models/mixtral-8x22b-instruct-hf | 65,536
Mixtral MoE 8x7B (Serverless) | Mistral MoE model. Warning: unofficial implementation, as model code is not yet available. | accounts/fireworks/models/mixtral-8x7b | 32,768
Mixtral MoE 8x7B Instruct (HF version) (Serverless) | Mixtral MoE 8x7B Instruct, with numerics matching the Hugging Face implementation. | accounts/fireworks/models/mixtral-8x7b-instruct-hf | 32,768
MythoMax L2 13B (Serverless) | An improved, potentially even perfected variant of MythoMix. | accounts/fireworks/models/mythomax-l2-13b | 4,096
Nous Hermes 2 - Mixtral 8x7B - DPO (fp8) (Serverless) | Nous Hermes 2 Mixtral 8x7B DPO is the flagship Nous Research model trained over the Mixtral 8x7B MoE LLM. | accounts/fireworks/models/nous-hermes-2-mixtral-8x7b-dpo-fp8 | 32,768
Mistral 7B OpenOrca (Serverless) | Mistral 7B fine-tuned on the OpenOrca dataset, an attempt to reproduce the dataset generated for Microsoft Research's Orca paper, using OpenChat packing and trained with Axolotl. | accounts/fireworks/models/openorca-7b | 32,768
Qwen1.5 72B Chat (Serverless) | Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. | accounts/fireworks/models/qwen1p5-72b-chat | 32,768
StableLM 2 Zephyr 1.6B (Serverless) | Stable LM 2 Zephyr 1.6B is a 1.6-billion-parameter instruction-tuned language model inspired by HuggingFaceH4's Zephyr 7B training pipeline, trained on a mix of publicly available and synthetic datasets using Direct Preference Optimization (DPO). | accounts/stability/models/stablelm-2-zephyr-2b | 4,096
StableLM Zephyr 3B (Serverless) | StableLM Zephyr 3B is a 3-billion-parameter instruction-tuned model inspired by HuggingFaceH4's Zephyr 7B training pipeline. It was trained on a mix of publicly available and synthetic datasets using Direct Preference Optimization (DPO) and evaluated on MT-Bench and the Alpaca Benchmark. | accounts/stability/models/stablelm-zephyr-3b | 4,096
StarCoder 15.5B (Serverless) | A 15.5B-parameter model trained on 80+ programming languages from The Stack (v1.2), using Multi-Query Attention and the Fill-in-the-Middle objective. | accounts/fireworks/models/starcoder-16b | 8,192
StarCoder 7B (Serverless) | A 7B-parameter model trained on 80+ programming languages from The Stack (v1.2), using Multi-Query Attention and the Fill-in-the-Middle objective. | accounts/fireworks/models/starcoder-7b | 8,192
Traditional Chinese Llama2 (Serverless) | QLoRA fine-tuned Llama 2 model on a traditional Chinese Alpaca dataset. | accounts/fireworks/models/traditional-chinese-qlora-llama2 | 4,096
Capybara 34B (Serverless) | 34B chat model from NousResearch, based on Yi-34B-200k. | accounts/fireworks/models/yi-34b-200k-capybara | 200,000
Zephyr 7B Beta (Serverless) | Zephyr is a series of language models trained to act as helpful assistants. Zephyr-7B-β is the second model in the series, a fine-tuned version of mistralai/Mistral-7B-v0.1 trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). | accounts/fireworks/models/zephyr-7b-beta | 4,096
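The serverless language models above are billed per token and reachable through an OpenAI-style chat completions endpoint, passing the model ID from the table as the `model` field. The sketch below assumes the endpoint URL `https://api.fireworks.ai/inference/v1/chat/completions` and a `FIREWORKS_API_KEY` environment variable; check the Fireworks API documentation for the authoritative details.

```python
# Minimal sketch of querying a serverless model via an OpenAI-compatible
# chat completions endpoint. URL and payload shape are assumptions based
# on the OpenAI chat format; verify against the Fireworks API docs.
import json
import os
import urllib.request

API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

def build_payload(model: str, user_message: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,  # a model ID from the table above
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_message}],
    }

if __name__ == "__main__":
    payload = build_payload(
        "accounts/fireworks/models/llama-v3-8b-instruct",
        "Name three uses of a mixture-of-experts model.",
    )
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The assistant's reply sits at choices[0].message.content in this format.
    print(body["choices"][0]["message"]["content"])
```

Swapping in any other serverless model ID from the table (e.g. `accounts/fireworks/models/mixtral-8x7b-instruct`) changes only the `model` field; mind each model's context window from the Context column.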