
Fireworks AI has acquired Hathora, and we're thrilled to bring their team and technology into the Fireworks family.
Lin Qiao shared her excitement about the acquisition, noting, “Hathora’s intense focus on every millisecond and every routing decision is precisely the discipline required for cutting-edge AI inference.”
Since the first multiplayer games appeared on the internet, lag has been the enemy. In gaming, milliseconds determine whether you win or lose. Speed isn’t a feature; it’s survival.
AI inference is entering that same era.
Solving that requires a particular kind of team: engineers who obsess over systems, performance, and reliability at a global scale.
From the beginning, Fireworks has set out to build an elite group of infrastructure engineers. People who care deeply about kernel performance, scheduling decisions, networking paths, and the invisible layers that make intelligent systems instantaneous. The Hathora team fits that ethos perfectly.
Over four years, Harsh, Sid, and their team built a global container orchestration platform designed for latency-sensitive, real-time workloads. Their system routes containers across 14 regions, multiple bare-metal providers, and four clouds with the performance guarantees demanded by multiplayer gaming. If orchestration adds even 20 milliseconds of latency, players notice immediately.
That discipline, an obsession with every millisecond and every routing decision, is exactly what AI inference needs.
With Hathora’s expertise, we are accelerating our ability to deliver faster inference, smarter routing, and globally resilient infrastructure. Their orchestration technology strengthens the core layer that determines how quickly requests reach the optimal GPU and how seamlessly applications scale under real-world demand. Integrating it into Fireworks’ inference cloud makes the AI applications we power more reliable: whether your users are in Tokyo or New York, they deserve consistent, sub-second response times, made possible by automatically routing each request to the most available capacity in real time.
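The routing idea behind that last sentence can be sketched in a few lines: prefer the lowest-latency region, but fall back when the nearest one has no free capacity. This is a hedged toy illustration, not Fireworks’ or Hathora’s actual scheduler; the `Region` type, its fields, and the region names are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Region:
    name: str
    latency_ms: float  # measured round-trip time from the client (illustrative)
    free_gpus: int     # currently available capacity (illustrative)

def route_request(regions: list[Region]) -> Region:
    """Pick the lowest-latency region that still has free capacity."""
    candidates = [r for r in regions if r.free_gpus > 0]
    if not candidates:
        raise RuntimeError("no capacity available in any region")
    return min(candidates, key=lambda r: r.latency_ms)

# Tokyo is closest but full, so the request falls back to the next-best region.
regions = [
    Region("tokyo", latency_ms=8.0, free_gpus=0),
    Region("osaka", latency_ms=15.0, free_gpus=4),
    Region("us-east", latency_ms=160.0, free_gpus=12),
]
print(route_request(regions).name)
```

A production router would fold in many more signals than this sketch (queue depth, model placement, failure domains), but the core trade-off it makes, latency versus available capacity, is the same.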
Just as importantly, this acquisition reflects how we plan to build Fireworks going forward. We are building a world-class team of infrastructure obsessives. Engineers who care about scheduling algorithms, kernel bypass networking, GPU memory pressure, placement strategies, and failure domains. Builders who want to work at the layer where distributed systems, hardware, and AI models intersect. The layer where milliseconds matter.
The next generation of AI products will not tolerate slow infrastructure. Real-time agents, multimodal copilots, and production-grade reasoning systems require orchestration that feels invisible. That requires a team that treats latency as a bug and reliability as table stakes.
If you want to build the fastest AI inference platform in the world, and you care about the hard systems problems that make that possible, we are hiring.