Qwen: Qwen3 VL 235B A22B Thinking
Qwen3 VL 235B A22B Thinking is a multimodal model from Qwen that accepts both text and image inputs, with a 131,072-token context window and a 32,768-token output limit. It supports tool use and reasoning, making it suited for multi-step tasks that involve external function calls or extended chains of thought. Structured output support is unconfirmed based on available data. On price, it sits at $0.26 per million input tokens and $2.60 per million output tokens, which places it in a mid-range tier worth comparing against similarly priced alternatives. The practical catch is that no independent benchmark coverage currently exists, so performance claims cannot be verified against standardized tests. Buyers who need vision plus reasoning and tool use in a single model may find it worth testing, but anyone relying on benchmark data to make a decision should treat this model as unproven until independent scores become available.
- Model ID
- qwen/qwen3-vl-235b-a22b-thinking
- Vendor
- qwen
- Tokenizer
- Qwen3
- Input Modalities
- text, image
- Output Modalities
- text
- Max Output
- 32,768 tokens
- Tool Calling
- ✓ supported
- Structured Output
- ✓ supported
- Reasoning Mode
- ✓ supported
- Vision
- ✓ accepts images
- Audio
- no
- Moderated
- no