Mistral: Mistral Small 3.2 24B
Mistral Small 3.2 24B is a multimodal model from Mistral AI that accepts both image and text inputs and generates text responses. It supports a 128,000-token context window with a maximum of 16,384 output tokens per completion, and it supports tool use. It does not include a dedicated reasoning mode, and structured output support is unconfirmed. At $0.075 per million input tokens and $0.20 per million output tokens, this model sits at the budget end of the multimodal market. Its blended benchmark score of 59.6 comes from a single benchmark, so that figure should be treated as a limited data point rather than a comprehensive picture of capability. Buyers who need image-plus-text processing and tool calling at low cost may find it worth evaluating, but those requiring proven reasoning or structured output should look elsewhere until broader benchmark coverage is available.
- Model ID
- mistralai/mistral-small-3.2-24b-instruct
- Vendor
- mistralai
- Tokenizer
- Mistral
- Input Modalities
- image, text
- Output Modalities
- text
- Max Output
- 16,384 tokens
- Tool Calling
- ✓ supported
- Structured Output
- ✓ supported
- Reasoning Mode
- not supported
- Vision
- ✓ accepts images
- Audio
- no
- Moderated
- no