Best AI model for Agent Workflows (2026)
Multi-step, tool-using agents with planning. Ranked from 346 live models in the OpenRouter catalog, weighted for tool calling, reasoning quality, and context window.
| # | Model | Model ID | Score | Input / 1M tokens | Output / 1M tokens | Context (tokens) |
|---|---|---|---|---|---|---|
| 1 | Qwen: Qwen3.6 Plus | qwen/qwen3.6-plus | 140 | $0.33 | $1.95 | 1,000,000 |
| 2 | xAI: Grok 4.20 | x-ai/grok-4.20 | 140 | $2.00 | $6.00 | 2,000,000 |
| 3 | OpenAI: GPT-5.4 Nano | openai/gpt-5.4-nano | 140 | $0.20 | $1.25 | 400,000 |
| 4 | OpenAI: GPT-5.4 Mini | openai/gpt-5.4-mini | 140 | $0.75 | $4.50 | 400,000 |
| 5 | OpenAI: GPT-5.4 | openai/gpt-5.4 | 140 | $2.50 | $15.00 | 1,050,000 |
| 6 | Google: Gemini 3.1 Flash Lite Preview | google/gemini-3.1-flash-lite-preview | 140 | $0.25 | $1.50 | 1,048,576 |
| 7 | Qwen: Qwen3.5-Flash | qwen/qwen3.5-flash-02-23 | 140 | $0.07 | $0.26 | 1,000,000 |
| 8 | Google: Gemini 3.1 Pro Preview Custom Tools | google/gemini-3.1-pro-preview-customtools | 140 | $2.00 | $12.00 | 1,048,576 |
| 9 | OpenAI: GPT-5.3-Codex | openai/gpt-5.3-codex | 140 | $1.75 | $14.00 | 400,000 |
| 10 | Google: Gemini 3.1 Pro Preview | google/gemini-3.1-pro-preview | 140 | $2.00 | $12.00 | 1,048,576 |
| 11 | Qwen: Qwen3.5 Plus 2026-02-15 | qwen/qwen3.5-plus-02-15 | 140 | $0.26 | $1.56 | 1,000,000 |
| 12 | Google: Gemini 3 Flash Preview | google/gemini-3-flash-preview | 140 | $0.50 | $3.00 | 1,048,576 |
| 13 | OpenAI: GPT-5.2 | openai/gpt-5.2 | 140 | $1.75 | $14.00 | 400,000 |
| 14 | Amazon: Nova 2 Lite | amazon/nova-2-lite-v1 | 140 | $0.30 | $2.50 | 1,000,000 |
| 15 | xAI: Grok 4.1 Fast | x-ai/grok-4.1-fast | 140 | $0.20 | $0.50 | 2,000,000 |
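All fifteen models share the same score, so per-token pricing is one practical way to separate them. The sketch below estimates the cost of a single agent run from the input and output prices listed in the table. The prices are copied from the table for a handful of rows; the function names and the example workload (200,000 input tokens, 20,000 output tokens) are illustrative assumptions, not part of the ranking.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
# Only a few rows are included to keep the example short.
PRICES = {
    "qwen/qwen3.6-plus": (0.33, 1.95),
    "x-ai/grok-4.20": (2.00, 6.00),
    "openai/gpt-5.4-nano": (0.20, 1.25),
    "qwen/qwen3.5-flash-02-23": (0.07, 0.26),
    "x-ai/grok-4.1-fast": (0.20, 0.50),
}


def run_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one run, given token counts for that run."""
    in_price, out_price = PRICES[model_id]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def cheapest_first(input_tokens: int, output_tokens: int) -> list[str]:
    """Model IDs sorted from cheapest to most expensive for this workload."""
    return sorted(PRICES, key=lambda m: run_cost(m, input_tokens, output_tokens))


if __name__ == "__main__":
    # Hypothetical workload: an agent loop that reads far more than it writes.
    for model_id in cheapest_first(200_000, 20_000):
        print(f"{model_id}: ${run_cost(model_id, 200_000, 20_000):.4f}")
```

Agent loops tend to resend the growing conversation on every step, so input tokens usually dominate the bill. Adjust the workload numbers to match your own traces before comparing models.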
How we ranked these
For Agent Workflows, we weight models on tool calling, reasoning quality, and context window. A higher score is better. Scores combine OpenRouter's model metadata (context length, modality support, tool calling, structured output, reasoning capability) with public pricing. See the full methodology for details.
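To make the idea of a weighted score concrete, here is a minimal sketch of how metadata flags and context length could be combined into a single number. It is not the formula behind the scores in the table: the weights, the 0 to 100 scale, the 1,000,000-token cap, and the `ModelMeta` type are all hypothetical, and the sketch leaves pricing out entirely.

```python
from dataclasses import dataclass


@dataclass
class ModelMeta:
    """Hypothetical subset of catalog metadata used for scoring."""
    model_id: str
    context_length: int
    tool_calling: bool
    structured_output: bool
    reasoning: bool


# Hypothetical weights (they sum to 1.0); the real weighting is not published here.
WEIGHTS = {
    "tool_calling": 0.4,
    "reasoning": 0.3,
    "context": 0.2,
    "structured_output": 0.1,
}

CONTEXT_CAP = 1_000_000  # assumed saturation point for the context component


def agent_score(meta: ModelMeta) -> float:
    """Weighted score on a 0-100 scale; higher is better."""
    context_fraction = min(meta.context_length, CONTEXT_CAP) / CONTEXT_CAP
    total = (
        WEIGHTS["tool_calling"] * float(meta.tool_calling)
        + WEIGHTS["reasoning"] * float(meta.reasoning)
        + WEIGHTS["structured_output"] * float(meta.structured_output)
        + WEIGHTS["context"] * context_fraction
    )
    return round(100 * total, 2)


if __name__ == "__main__":
    # Context length is taken from the table; the capability flags are assumed.
    nano = ModelMeta("openai/gpt-5.4-nano", 400_000, True, True, True)
    print(nano.model_id, agent_score(nano))
```

A scheme like this saturates quickly: any model with every capability flag set and a context window at or above the cap gets the maximum score, which is one plausible reason many models can end up tied.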
Related tasks
Agents
Best for Browser Automation: models that drive headless browsers reliably.
Best for Function / Tool Calling: reliable JSON tool-call generation.
Best for RAG Pipelines: retrieval-augmented question answering.
Best for Long-Context Q&A: answering questions over 100K+ token docs.
Best for Coding Agents: models that operate codebases end-to-end.