Featured Blogs

Partnering with Meta to bring Llama 3 to Fireworks' inference and fine-tuning
Announcing Meta Llama 3 on Fireworks AI blazing-fast inference stack
Getting Started with Stability's API Powered by Fireworks (4/17/2024)
Optimizing Retrieval Augmented Generation (RAG) with MongoDB Atlas and Fireworks AI (3/21/2024)
Fireworks launches fine-tuning service: rapidly iterate on quality and scale to production through Fireworks inference (3/8/2024)
Fireworks Platform Spring 2024 Updates (3/1/2024)
FireFunction V1: Fireworks' GPT-4-level function calling model, 4x faster than GPT-4 and open weights (2/20/2024)
Why do all LLMs need structured output modes? (2/20/2024)
FireLLaVA: the first commercially permissive OSS LLaVA model (1/18/2024)
FireAttention: Serving Open Source Models 4x faster than vLLM by quantizing with ~no tradeoffs (1/8/2024)
Fireworks Raises the Quality Bar with Function Calling Model and API Release (12/20/2023)
Mixtral 8x7B on Fireworks: faster, cheaper, even before the official release (12/14/2023)
LLM Inference Performance Benchmarking (Part 1) (11/3/2023)
New in Fireworks: Image-to-Image and ControlNet support for SSD-1B and SDXL! (11/2/2023)
Fireworks.ai Achieves SOC 2 Type II and HIPAA Compliance (10/27/2023)
Accelerating Code Completion with Fireworks Fast LLM Inference (10/11/2023)
Fireworks.ai Now Available on LangChain Prompt Playground (10/2/2023)
Simplifying Code Infilling with Code Llama and Fireworks.ai (9/12/2023)
Speed, Python: Pick Two. How CUDA Graphs Enable Fast Python Code for Deep Learning (8/29/2023)
Fireworks.ai: Fast, Affordable, Customizable Gen AI Platform (8/17/2023)
Multi-Query Attention is All You Need (7/12/2023)

© 2024 Fireworks AI. All rights reserved.