head-to-head

Google: Gemma 4 31B vs xAI: Grok 4.20

Side-by-side comparison of specs, pricing, benchmark scores, and task rankings. Updated 2026-06-16.

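| Spec | Google: Gemma 4 31B | xAI: Grok 4.20 |
| --- | --- | --- |
| Vendor | google | x-ai |
| Quality Score | 100 | 100 |
| Benchmark Score | 68.5 | 74.7 |
| Input Price | $0.12/M | $1.25/M |
| Output Price | $0.35/M | $2.50/M |
| Context Window | 262,144 | 2,000,000 |
| Max Output | 262,144 | - |
| Tool Calling | | |
| Structured Output | | |
| Reasoning Mode | | |
| Vision | | |
| Audio | - | - |

Benchmark Scores

| Benchmark | Google: Gemma 4 31B | xAI: Grok 4.20 |
| --- | --- | --- |
| ai_index | 64.7 | 81.4 |
| ai_index_agentic | 67.6 | 88.9 |
| ai_index_coding | 63.9 | 69.6 |
| eqbench | 70.8 | 55.8 |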
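
To make the price gap concrete, here is a minimal sketch that turns the Input Price and Output Price rows above into a per-request cost. It assumes "/M" means per million tokens; the function name `request_cost` and the 10,000-input / 1,000-output workload are hypothetical and chosen only for illustration.

```python
# Prices in USD per million tokens, copied from the spec table.
PRICES = {
    "Google: Gemma 4 31B": {"input": 0.12, "output": 0.35},
    "xAI: Grok 4.20": {"input": 1.25, "output": 2.50},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000 input tokens and 1,000 output tokens.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 10_000, 1_000):.5f}")
```

Swapping in your own token counts shows how the difference scales with volume; the sketch ignores any caching, batching, or tiered discounts, none of which are listed on this page.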

Who wins by task?

| Task | Google: Gemma 4 31B | xAI: Grok 4.20 |
| --- | --- | --- |
| SQL Generation | 164 | 171 |
| Code Review | 161 | 170 |
| Code Completion | 132 | 122 |
| Code Refactoring | 157 | 168 |
| Bug Fixing | 173 | 185 |
| Unit Test Generation | 148 | 154 |
| Code Documentation | 141 | 147 |
| Regex Writing | 135 | 137 |
| CI/CD Pipelines | 140 | 146 |
| Frontend Component Design | 143 | 146 |
| Data Analysis | 163 | 170 |
| CSV / Spreadsheet Cleanup | 146 | 153 |
| ETL Scripting | 149 | 157 |
| JSON Extraction | 143 | 136 |
| Bulk Data Labeling | 133 | 125 |
| OCR / Document Parsing | 141 | 145 |
| Table Extraction from PDFs | 141 | 145 |
| Long-Document Summarization | 155 | 166 |
| Short-Form Summarization | 131 | 124 |
| Blog Post Writing | 138 | 143 |

Scores reflect capability match, benchmark data, and pricing for each task.
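
The task table can be summarized by tallying which model has the higher score on each row. The sketch below does only that; the scores are copied from the table, while the helper name `tally_wins` and the tie handling are assumptions made for illustration, not part of the site's methodology.

```python
# (Gemma 4 31B score, Grok 4.20 score) per task, copied from the task table.
TASK_SCORES = {
    "SQL Generation": (164, 171),
    "Code Review": (161, 170),
    "Code Completion": (132, 122),
    "Code Refactoring": (157, 168),
    "Bug Fixing": (173, 185),
    "Unit Test Generation": (148, 154),
    "Code Documentation": (141, 147),
    "Regex Writing": (135, 137),
    "CI/CD Pipelines": (140, 146),
    "Frontend Component Design": (143, 146),
    "Data Analysis": (163, 170),
    "CSV / Spreadsheet Cleanup": (146, 153),
    "ETL Scripting": (149, 157),
    "JSON Extraction": (143, 136),
    "Bulk Data Labeling": (133, 125),
    "OCR / Document Parsing": (141, 145),
    "Table Extraction from PDFs": (141, 145),
    "Long-Document Summarization": (155, 166),
    "Short-Form Summarization": (131, 124),
    "Blog Post Writing": (138, 143),
}


def tally_wins(scores: dict[str, tuple[int, int]]) -> dict[str, list[str]]:
    """Group task names by the model with the higher score (ties kept separately)."""
    wins: dict[str, list[str]] = {"Gemma 4 31B": [], "Grok 4.20": [], "Tie": []}
    for task, (gemma, grok) in scores.items():
        if gemma > grok:
            wins["Gemma 4 31B"].append(task)
        elif grok > gemma:
            wins["Grok 4.20"].append(task)
        else:
            wins["Tie"].append(task)
    return wins


wins = tally_wins(TASK_SCORES)

if __name__ == "__main__":
    for model, tasks in wins.items():
        print(f"{model}: {len(tasks)} tasks")
        for task in tasks:
            print(f"  {task}")
```

On these numbers Gemma 4 31B leads on Code Completion, JSON Extraction, Bulk Data Labeling, and Short-Form Summarization, and Grok 4.20 leads on the remaining 16 tasks. Because the scores already fold in pricing, a win here is not the same as a win on raw benchmark quality.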

Related comparisons

MoonshotAI: Kimi K2.7 Code vs Google: Gemma 4 31B
MoonshotAI: Kimi K2.7 Code vs xAI: Grok 4.20
Qwen: Qwen3.7 Plus vs Google: Gemma 4 31B
Qwen: Qwen3.7 Plus vs xAI: Grok 4.20
MiniMax: MiniMax M3 vs Google: Gemma 4 31B
MiniMax: MiniMax M3 vs xAI: Grok 4.20
StepFun: Step 3.7 Flash vs Google: Gemma 4 31B
StepFun: Step 3.7 Flash vs xAI: Grok 4.20