Molmo2 is a family of open vision-language models developed by the Allen Institute for AI (Ai2) that support image, video and multi-image understanding and grounding. Molmo 2 (4B) is Qwen 3-based – optimized for efficiency.
On-demand DeploymentDocs | On-demand deployments allow you to use Molmo2-4B on dedicated GPUs with Fireworks' high-performance serving stack with high reliability and no rate limits. |