Best AI model for Screenshot Debugging (2026)
Models for diagnosing UI bugs from a screenshot, ranked from 346 live models in the OpenRouter catalog and weighted for vision input and reasoning quality.
| # | Model | Model ID | Score | Input / 1M tokens | Output / 1M tokens | Context (tokens) |
|---|---|---|---|---|---|---|
| 1 | MoonshotAI: Kimi K2.6 | moonshotai/kimi-k2.6 | 123 | $0.80 | $3.50 | 262,144 |
| 2 | Google: Gemma 4 26B A4B (free) | google/gemma-4-26b-a4b-it:free | 123 | Free | Free | 262,144 |
| 3 | Google: Gemma 4 26B A4B | google/gemma-4-26b-a4b-it | 123 | $0.07 | $0.35 | 262,144 |
| 4 | Google: Gemma 4 31B (free) | google/gemma-4-31b-it:free | 123 | Free | Free | 262,144 |
| 5 | Google: Gemma 4 31B | google/gemma-4-31b-it | 123 | $0.13 | $0.38 | 262,144 |
| 6 | Qwen: Qwen3.6 Plus | qwen/qwen3.6-plus | 123 | $0.33 | $1.95 | 1,000,000 |
| 7 | Z.ai: GLM 5V Turbo | z-ai/glm-5v-turbo | 123 | $1.20 | $4.00 | 202,752 |
| 8 | xAI: Grok 4.20 | x-ai/grok-4.20 | 123 | $2.00 | $6.00 | 2,000,000 |
| 9 | Xiaomi: MiMo-V2-Omni | xiaomi/mimo-v2-omni | 123 | $0.40 | $2.00 | 262,144 |
| 10 | OpenAI: GPT-5.4 Nano | openai/gpt-5.4-nano | 123 | $0.20 | $1.25 | 400,000 |
| 11 | OpenAI: GPT-5.4 Mini | openai/gpt-5.4-mini | 123 | $0.75 | $4.50 | 400,000 |
| 12 | Mistral: Mistral Small 4 | mistralai/mistral-small-2603 | 123 | $0.15 | $0.60 | 262,144 |
| 13 | ByteDance Seed: Seed-2.0-Lite | bytedance-seed/seed-2.0-lite | 123 | $0.25 | $2.00 | 262,144 |
| 14 | Qwen: Qwen3.5-9B | qwen/qwen3.5-9b | 123 | $0.10 | $0.15 | 262,144 |
| 15 | OpenAI: GPT-5.4 | openai/gpt-5.4 | 123 | $2.50 | $15.00 | 1,050,000 |
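Because all fifteen models share the same score, price is often the deciding factor. The sketch below shows how the per-1M-token prices in the table translate into a per-request cost. The prices are copied from the table; the function name `request_cost` and the token counts in the usage note are illustrative assumptions, since the number of tokens a screenshot consumes varies by model and image size.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "moonshotai/kimi-k2.6": (0.80, 3.50),
    "google/gemma-4-26b-a4b-it": (0.07, 0.35),
    "openai/gpt-5.4": (2.50, 15.00),
}


def request_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request.

    input_tokens should include whatever the model charges for the
    screenshot itself; that figure is model-specific and is an
    assumption you have to supply.
    """
    in_price, out_price = PRICES[model_id]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens (prompt plus screenshot)
    # and 500 output tokens (the diagnosis).
    for model_id in PRICES:
        print(f"{model_id}: ${request_cost(model_id, 2000, 500):.6f}")
```

With those hypothetical token counts, the gap between the cheapest and most expensive paid models in the table is well over an order of magnitude per request, which matters if you debug screenshots in bulk.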
How we ranked these
For Screenshot Debugging, we weight models on vision input and reasoning quality. A higher score is better. Scores combine OpenRouter's model metadata (context length, modality support, tool calling, structured output, reasoning capability) with public pricing.
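To make the idea of a weighted score concrete, here is a minimal sketch of how metadata and pricing could be combined. It is not the site's actual formula: the `ModelMeta` fields, the weights, and the context and price bonuses are all hypothetical, chosen only so that vision input and reasoning dominate as the description says. It does not attempt to reproduce the score of 123 shown in the table.

```python
from dataclasses import dataclass


@dataclass
class ModelMeta:
    """Hypothetical subset of catalog metadata used for scoring."""
    model_id: str
    context_length: int
    accepts_images: bool
    supports_reasoning: bool
    supports_tools: bool
    supports_structured_output: bool
    input_price_per_m: float   # USD per 1M input tokens
    output_price_per_m: float  # USD per 1M output tokens


# Hypothetical weights: vision and reasoning carry most of the score.
WEIGHTS = {
    "vision": 40.0,
    "reasoning": 30.0,
    "tools": 10.0,
    "structured_output": 10.0,
    "context": 5.0,
    "price": 5.0,
}


def score(m: ModelMeta) -> float:
    """Return a higher-is-better score for screenshot debugging."""
    total = 0.0
    total += WEIGHTS["vision"] if m.accepts_images else 0.0
    total += WEIGHTS["reasoning"] if m.supports_reasoning else 0.0
    total += WEIGHTS["tools"] if m.supports_tools else 0.0
    total += WEIGHTS["structured_output"] if m.supports_structured_output else 0.0
    # Context bonus grows linearly and is capped at 1M tokens.
    total += WEIGHTS["context"] * min(m.context_length / 1_000_000, 1.0)
    # Price bonus shrinks as the combined per-1M price rises; free models get the full bonus.
    total += WEIGHTS["price"] / (1.0 + m.input_price_per_m + m.output_price_per_m)
    return total


def rank(models: list[ModelMeta]) -> list[ModelMeta]:
    """Sort models from best to worst score."""
    return sorted(models, key=score, reverse=True)
```

A scheme like this explains why many models can tie: once every candidate supports the heavily weighted features, only the small context and price terms separate them, and a coarser formula would not separate them at all.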
Related tasks
Vision
Best for Image Captioning: Accessible alt text and detailed image descriptions.
Best for Image Generation: Models that produce images, not just read them.
Best for Diagram Extraction: Reading flowcharts, org charts, architecture diagrams.
Best for Chart & Graph Reading: Pulling numbers off charts in research papers and reports.