Qwen: Qwen3.5-35B-A3B
Qwen3.5-35B-A3B is a multimodal model from Qwen that accepts text, image, and video inputs. It supports tool use and reasoning, making it usable for agentic workflows and multi-step tasks. Its context window of 262,144 tokens accommodates long documents or extended conversations, and it can generate up to 81,920 tokens per response. At $0.14 per million input tokens and $1.00 per million output tokens, it sits on the affordable end for a reasoning-capable multimodal model. Its blended benchmark score of 59.4 is based on only three benchmarks, so treat that figure as a limited signal rather than a comprehensive verdict. The agentic benchmark score of 72.8 is its strongest result, which suggests it may be worth shortlisting for tool-calling and agent-oriented use cases where video and long-context support are also useful. Teams needing broader performance validation should wait for wider benchmark coverage before committing.
- Model ID
- qwen/qwen3.5-35b-a3b
- Vendor
- qwen
- Tokenizer
- Qwen3
- Input Modalities
- text, image, video
- Output Modalities
- text
- Max Output
- 81,920 tokens
- Tool Calling
- ✓ supported
- Structured Output
- ✓ supported
- Reasoning Mode
- ✓ supported
- Vision
- ✓ accepts images
- Audio
- no
- Moderated
- no
Category rankings
Where Qwen: Qwen3.5-35B-A3B places across the 8 categories it ranks in. How we rank →
| # | Category | Score |
|---|---|---|
| #16 | Video SummarizationVideo · of 25 ranked | 148 |
| #17 | Image CaptioningVision · of 25 ranked | 120 |
| #19 | Social Media PostsWriting · of 25 ranked | 119 |
| #19 | Voice Assistant BackendVoice · of 25 ranked | 123 |
| #19 | Video Auto-TaggingVideo · of 25 ranked | 123 |
| #20 | Self-Hosted / LocalCost · of 25 ranked | 117 |
| #20 | Real-Time ChatLatency · of 25 ranked | 117 |
| #24 | Bulk Data LabelingData · of 25 ranked | 132 |