Qwen: Qwen3.5-9B
Qwen3.5-9B is a multimodal model from Qwen that accepts text, image, and video inputs and produces text. It supports tool calling and a reasoning mode, which suits it to agentic workflows in which the model decides what to do next. Its context window is 262,144 tokens, enough for long documents and extended conversations, and the maximum completion length is the same 262,144 tokens, so output is not capped below the context size. At $0.10 per million input tokens and $0.15 per million output tokens, it sits at the budget end of the pricing spectrum. Its blended benchmark score of 50.1 is based on only 3 benchmarks, so performance claims should be treated as preliminary. The agentic score of 61.7 is its strongest result, which makes it a reasonable candidate for cost-sensitive teams building tool-calling pipelines; buyers who need confidence in coding or general reasoning should weigh the limited benchmark coverage before committing.
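To make the pricing concrete, here is a minimal sketch of a cost estimate built from the listed rates. The function names `estimate_cost` and `fits_context` are hypothetical, and the context check assumes that prompt and completion tokens share the 262,144-token window, which the listing does not state explicitly.

```python
# Rates and limits taken from the listing above.
INPUT_PRICE_PER_MILLION = 0.10   # USD per 1M input tokens
OUTPUT_PRICE_PER_MILLION = 0.15  # USD per 1M output tokens
CONTEXT_WINDOW = 262_144         # tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Assumption: prompt and completion together must fit in the window."""
    return input_tokens + output_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    # Example: a 200,000-token document summarized into 2,000 tokens.
    cost = estimate_cost(200_000, 2_000)
    print(f"estimated cost: ${cost:.4f}")
    print("fits in context:", fits_context(200_000, 2_000))
```

At these rates, one million input tokens plus one million output tokens come to about $0.25, so per-request cost is usually dominated by how much input is sent rather than by the completion.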
- Model ID: qwen/qwen3.5-9b
- Vendor: qwen
- Tokenizer: Qwen3
- Input Modalities: text, image, video
- Output Modalities: text
- Max Output: 262,144 tokens
- Tool Calling: ✓ supported
- Structured Output: ✓ supported
- Reasoning Mode: ✓ supported
- Vision: ✓ accepts images
- Audio: no
- Moderated: no
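
The capability list above can be encoded as data so that a pipeline can reject unsupported requests before sending them. The following is a rough sketch only; the `MODEL_SPEC` dictionary layout and the `unsupported_inputs` helper are hypothetical and do not correspond to any provider's API.

```python
# Capabilities transcribed from the listing; the dictionary layout is illustrative.
MODEL_SPEC = {
    "model_id": "qwen/qwen3.5-9b",
    "input_modalities": {"text", "image", "video"},
    "output_modalities": {"text"},
    "max_output_tokens": 262_144,
    "tool_calling": True,
    "structured_output": True,
    "reasoning_mode": True,
    "audio": False,
    "moderated": False,
}


def unsupported_inputs(requested: list[str]) -> list[str]:
    """Return the requested input modalities the model does not accept."""
    return sorted(set(requested) - MODEL_SPEC["input_modalities"])


if __name__ == "__main__":
    # Audio is not among the listed input modalities, so it is flagged.
    print(unsupported_inputs(["text", "audio"]))
    # Image and video are both listed, so nothing is flagged.
    print(unsupported_inputs(["image", "video"]))
```

Checking modalities up front keeps a failed request from consuming a network round trip, which matters most in tool-calling loops that issue many calls.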