The InternVL3 models are multimodal large language models that combine strong vision and language understanding. Built on a ViT-MLP-LLM architecture, they handle multimodal reasoning, document analysis, video understanding, and complex visual tasks, with support for dynamic-resolution processing and extended context.
On-demand deployments give you dedicated GPUs for InternVL3 78B using Fireworks' reliable, high-performance system with no rate limits.
Fireworks AI
16384
$0.9