Molmo2 is a family of open vision-language models developed by the Allen Institute for AI (Ai2) that support image, multi-image, and video understanding and grounding. Molmo2-4B is the Qwen3-based variant, optimized for efficiency.
On-demand Deployment | On-demand deployments give you dedicated GPUs for Molmo2-4B using Fireworks' reliable, high-performance system with no rate limits. |