Qwen: Qwen3.5-35B-A3B

Qwen3.5-35B-A3B is a multimodal model from Qwen that accepts text, image, and video inputs. It supports tool use and reasoning, making it usable for agentic workflows and multi-step tasks. Its context window of 262,144 tokens accommodates long documents or extended conversations, and it can generate up to 81,920 tokens per response. At $0.14 per million input tokens and $1.00 per million output tokens, it sits on the affordable end for a reasoning-capable multimodal model. Its blended benchmark score of 59.4 is based on only three benchmarks, so treat that figure as a limited signal rather than a comprehensive verdict. The agentic benchmark score of 72.8 is its strongest result, which suggests it may be worth shortlisting for tool-calling and agent-oriented use cases where video and long-context support are also useful. Teams needing broader performance validation should wait for wider benchmark coverage before committing.
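As a rough illustration of the pricing above, the following sketch estimates the cost of a single request from its token counts. The two prices come from this listing. The function name and the example token counts are hypothetical, and the sketch assumes flat per-token pricing with no caching discounts, tiering, or separate rates for image or video input.

```python
# Prices taken from this listing, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.14
OUTPUT_PRICE_PER_M = 1.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming flat per-token pricing."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical example: a 200,000-token document summarized in 4,000 tokens.
print(f"{estimate_cost_usd(200_000, 4_000):.4f}")  # prints 0.0320
```

Because output tokens cost roughly seven times as much as input tokens here, long reasoning traces or long generations can dominate the bill even when the prompt is large.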

Quality Score: 100/100 (price + capability + benchmarks)
Input Price: $0.14 per 1M tokens
Output Price: $1.00 per 1M tokens
Context Window: 262,144 tokens
Model ID: qwen/qwen3.5-35b-a3b
Vendor: qwen
Tokenizer: Qwen3
Input Modalities: text, image, video
Output Modalities: text
Max Output: 81,920 tokens
Tool Calling: supported
Structured Output: supported
Reasoning Mode: supported
Vision: accepts images
Audio: no
Moderated: no
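
The context window and max output limits above can be turned into a simple pre-flight check. This is a minimal sketch, assuming that prompt tokens and generated tokens share the 262,144-token context window; the listing does not state this, so treat it as an assumption. The function names are hypothetical.

```python
# Limits taken from this listing.
CONTEXT_WINDOW = 262_144
MAX_OUTPUT = 81_920


def fits_budget(prompt_tokens: int, requested_output: int) -> bool:
    """Check a request against both limits.

    Assumes prompt and output tokens share one context window.
    """
    if requested_output > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output <= CONTEXT_WINDOW


def max_prompt_tokens(requested_output: int) -> int:
    """Largest prompt that still leaves room for the requested output."""
    return CONTEXT_WINDOW - min(requested_output, MAX_OUTPUT)


print(fits_budget(200_000, 50_000))   # prints True
print(fits_budget(200_000, 81_920))   # prints False
print(max_prompt_tokens(81_920))      # prints 180224
```

Under that assumption, reserving the full 81,920-token output leaves 180,224 tokens for the prompt.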

Category rankings

Where Qwen: Qwen3.5-35B-A3B places across the 8 categories it ranks in.

Rank (of 25)   Category                  Group     Score
#16            Video Summarization       Video     148
#17            Image Captioning          Vision    120
#19            Social Media Posts        Writing   119
#19            Voice Assistant Backend   Voice     123
#19            Video Auto-Tagging        Video     123
#20            Self-Hosted / Local       Cost      117
#20            Real-Time Chat            Latency   117
#24            Bulk Data Labeling        Data      132

Similar models