Llama 3.2 11B Vision Instruct API & Playground

Instruction-tuned image reasoning model from Meta with 11B parameters. Optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The model can understand visual data, such as charts and graphs and also bridge the gap between vision and language by generating text to describe images details

Fireworks Features

On-demand Deployment

Docs

On-demand deployments give you dedicated GPUs for Llama 3.2 11B Vision Instruct using Fireworks' reliable, high-performance system with no rate limits.

Metadata

State

Ready

Created on

9/24/2024

Kind

Base model

Provider

Specification

Calibrated

Mixture-of-Experts

Parameters

10.7B

Supported Functionality

Fine-tuning

Not supported

Serverless

Not supported

Serverless LoRA

Supported

Context Length

131.1k tokens

Function Calling

Not supported

Embeddings

Not supported

Rerankers

Not supported

Support image input

Supported