
How Notion Cuts Latency 4x and Scales Enterprise AI Workflows with Fireworks AI

Notion’s journey from individual users to enterprise powerhouse showcases how Fireworks AI enables scalable, reliable, and efficient AI experiences for more than 100 million users, including teams at nearly 70% of Fortune 100 companies.

Challenge: Scaling AI Beyond Chat to Enterprise-Grade Agentic Workflows

“Not everyone at Notion is an AI expert, but every engineer needs to be fluent in how to work in this AI landscape,” explains Sarah Sachs, Head of AI Engineering at Notion. With an expanding enterprise customer base, Notion needed AI that could do more than answer questions. They needed sophisticated AI agents to reliably integrate with complex workflows across tools like Slack, Jira, and GitHub.

“Our users expect AI that helps them move naturally from meetings to tasks, not just a chat experience,” Sarah adds. “That transition from Q&A to agentic workflows is essential for ‘vibe working’ — our vision for how work should flow.”

The stakes were high: delivering seamless AI at scale meant overcoming latency, cost, and reliability challenges. “Latency is perceived as search quality,” Sarah notes, underscoring the need for speed: sluggish responses read as poor results, while instant responses transform the user experience.

Solution: Fine-Tuned, Efficient Models and Scalable AI Agents

Partnering with Fireworks AI, Notion fine-tuned smaller models that run faster and more accurately than standard open-source models.

“By fine-tuning models, we reduced latency from about 2 seconds to 350 milliseconds, significantly improving performance and enabling us to launch AI features at scale,” says Sarah.
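For readers curious what querying a fine-tuned model on Fireworks looks like in practice, here is a minimal sketch using Fireworks AI’s OpenAI-compatible API. The model ID and prompt are hypothetical placeholders, not Notion’s actual deployment.

```python
# Minimal sketch: calling a fine-tuned model hosted on Fireworks AI through its
# OpenAI-compatible endpoint. The model ID below is a hypothetical placeholder,
# not Notion's actual fine-tuned model.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # Fireworks' OpenAI-compatible API
    api_key=os.environ["FIREWORKS_API_KEY"],
)

response = client.chat.completions.create(
    model="accounts/your-account/models/your-fine-tuned-model",  # placeholder model ID
    messages=[
        {"role": "user", "content": "Summarize the action items from this meeting transcript: ..."},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```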

Impact: Powering Notion’s “Vibe Working” with Speed and Scale

Notion’s engineers dogfood the product daily, creating a rapid feedback loop. “We have a leaderboard tracking meeting minutes recorded internally,” Sarah shares, “and I was trying to beat my boss on usage and bug reports.” This hands-on culture accelerates iteration and ensures the AI truly meets user needs.

Today, Notion delivers an AI-powered experience that feels natural, intuitive, and deeply integrated into workflows.

Fireworks AI’s platform gave Notion the power to:

  • Deliver AI responses 4x faster
  • Serve 100+ million users with reliable, low-latency workflows
  • Iterate quickly with robust tooling and monitoring

“That improvement is a game changer for delivering reliable, enterprise-scale AI powered by Fireworks,” Sarah adds.

This partnership brings Notion’s vision of “vibe working” to life, with intuitive AI that helps users move from meetings to action.

Looking Ahead: Scaling AI Engineering Across the Organization

With AI core to their SaaS product, Notion must scale AI engineering beyond a few ML experts. “Our working model of 10 ML engineers owning all AI doesn’t work anymore,” Sarah admits.

Fireworks AI’s scalable infrastructure and tooling help Notion empower hundreds of engineers to build and own AI-powered workflows. As Sarah puts it:

“We’re not just building for today’s use cases. Fireworks helps us stay ahead by enabling fast iteration, reliable scaling, and the flexibility to adapt as AI agents evolve.”

Watch the full fireside chat to hear Notion’s AI leaders share their insights and journey.