FLUX.1 on Fireworks: Fast, Frugal, and Flexible
By Fireworks AI|10/22/2024
In partnership with Black Forest Labs, Fireworks is excited to announce commercially usable FLUX.1 [dev] and FLUX.1 [schnell] models on Fireworks.
Fireworks has been committed to fast, production-ready image generation, with milestones like an exclusive launch of Stable Diffusion 3 and serving SDXL and Playground v2.5 with generation times under 1 second. Today, Fireworks is partnering with Black Forest Labs, the original creators of the Stable Diffusion image generation models, to offer FLUX.1 [dev] and FLUX.1 [schnell]. Fireworks’ platform is designed to bring AI applications from prototype to production usage. By default, FLUX.1 models used outside of Fireworks have restrictions on commercial usage, but the partnership between Fireworks and Black Forest Labs enables commercial usage of both models on the Fireworks platform. Fireworks offers speed, cost-efficiency, and customizability, as described in the sections that follow.
The FLUX models are two of the highest-quality image models available, and Fireworks offers the most customizable, fastest, and most scalable services for running them. The FLUX models are available on Fireworks serverless, where you pay per image (per diffusion step) and do not need to configure GPUs, as well as on on-demand deployments.
FLUX on Fireworks is served with industry-leading speeds and prices. FLUX.1 [dev] and FLUX.1 [schnell] cost $0.0005 and $0.00035 per diffusion step, respectively. With default settings, this works out to $0.014 and $0.0014 per image, respectively, less than half the cost of other FLUX providers, which typically charge $0.03 and $0.003 per image.
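The per-image figures follow from multiplying the per-step price by the number of diffusion steps. The sketch below reproduces that arithmetic. The step counts (28 for [dev], 4 for [schnell]) are assumptions inferred from the quoted prices, not values stated in this post, and the dictionary keys are illustrative labels rather than official model identifiers.

```python
from decimal import Decimal

# Per-step prices in USD, as quoted in this post.
PRICE_PER_STEP = {
    "flux-1-dev": Decimal("0.0005"),
    "flux-1-schnell": Decimal("0.00035"),
}

# ASSUMPTION: default step counts inferred from the quoted per-image prices.
ASSUMED_DEFAULT_STEPS = {
    "flux-1-dev": 28,
    "flux-1-schnell": 4,
}


def cost_per_image(model, steps=None):
    """Return the cost in USD of one image: per-step price times step count."""
    n_steps = ASSUMED_DEFAULT_STEPS[model] if steps is None else steps
    if n_steps <= 0:
        raise ValueError("steps must be a positive integer")
    return PRICE_PER_STEP[model] * n_steps


if __name__ == "__main__":
    for model in PRICE_PER_STEP:
        print(model, cost_per_image(model))
```

Because billing is per step, passing a custom `steps` value shows how cost scales with the step count: under these assumptions, halving the steps for [dev] halves the per-image cost.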
Example images generated with FLUX.1 [dev] text-to-image and ControlNet on Fireworks (see example code)
While the FLUX models are powerful on their own, they can be even more useful as part of a broader compound AI system. Production usage of AI frequently requires the use of multiple components, API calls, and models. That’s why Fireworks is excited to offer the most customizable implementation of FLUX.
Fireworks supports customizations such as ControlNet adapters for FLUX.
These customizations are available for FLUX models served on our on-demand deployments, where you deploy private, auto-scaling GPU(s). Beyond offering additional support for customizations, the private GPUs that back on-demand deployments are well suited to production traffic and demand spikes. On-demand deployments are billed by GPU-second, and users pay nothing when GPUs aren’t in use. Both FLUX.1 [schnell] and FLUX.1 [dev] fit on a single A100 or H100.
Please note that these customizations are currently only available with BF16-precision versions of FLUX. Fireworks on-demand deployments support the user’s choice of BF16 precision or FP8 quantization for either FLUX model. Fireworks’ serverless implementation of FLUX models is FP8-quantized; with FP8 we’ve observed substantial speed improvements with negligible quality impact.
Fireworks’ FLUX.1 support has been built on an alpha version of Fireworks’ Flumina Server Apps framework. Multimedia models are frequently used alongside other models or code. For example, image generation models are often used with LoRAs or upscaling models. However, these multimedia apps can be hard to deploy and slow.
Flumina is a framework that solves this problem by enabling custom multimedia models and workloads to run on Fireworks infrastructure. Flumina unlocks:
Developers package models, pre/postprocessing logic, and business logic into an app, and Fireworks provides an API to optimized, scalable infrastructure. Get started today with FLUX customizations on Flumina. See the implementations of FLUX and the ControlNet-Union adapter for examples. Stay tuned for the full Flumina announcement! Fill out this form if you have a specific multimedia (audio, video, image, etc.) app that you want to deploy quickly and easily through guided, alpha usage of Flumina.
Deploy Flux FP8 models on-demand
Deploy Flux FP16 models on-demand
Use ControlNet with FP16 Flux models on-demand
With speed, cost-efficiency, and customizability, FLUX on Fireworks provides everything developers need to bring AI to production. The Fireworks platform offers the best building blocks for compound AI applications: a variety of customizable models and components, on top of Fireworks’ blazing-fast inference engine.
Ready to start building with FLUX?