
OpenAI: GPT-5.3-Codex vs Anthropic: Claude Sonnet 4.6

Side-by-side comparison of specs, pricing, benchmark scores, and task rankings. Updated 2026-05-12.

Who wins by task?

Task	OpenAI: GPT-5.3-Codex	Anthropic: Claude Sonnet 4.6
SQL Generation	132	181
Code Review	132	177
Code Completion	116	118
Code Refactoring	136	172
Bug Fixing	136	194
Unit Test Generation	124	163
Code Documentation	128	144
Regex Writing	116	139
CI/CD Pipelines	120	152
Frontend Component Design	122	153
Data Analysis	124	184
CSV / Spreadsheet Cleanup	132	158
ETL Scripting	128	162
JSON Extraction	120	141
Bulk Data Labeling	117	123
OCR / Document Parsing	131	150
Table Extraction from PDFs	131	150
Long-Document Summarization	136	166
Short-Form Summarization	112	123
Blog Post Writing	120	145

Scores combine capability match, benchmark data, and pricing for each task.
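To answer the heading's question directly, the table can be tallied in a few lines. The following is a minimal sketch, not part of the site's methodology: the names `SCORES` and `tally` are hypothetical, and the numbers are copied by hand from the table above. Because the scores are composites that include pricing, the margins describe overall task fit rather than raw benchmark gaps.

```python
# Each row: (task, GPT-5.3-Codex score, Claude Sonnet 4.6 score),
# transcribed from the table above. SCORES and tally are illustrative names.
SCORES = [
    ("SQL Generation", 132, 181),
    ("Code Review", 132, 177),
    ("Code Completion", 116, 118),
    ("Code Refactoring", 136, 172),
    ("Bug Fixing", 136, 194),
    ("Unit Test Generation", 124, 163),
    ("Code Documentation", 128, 144),
    ("Regex Writing", 116, 139),
    ("CI/CD Pipelines", 120, 152),
    ("Frontend Component Design", 122, 153),
    ("Data Analysis", 124, 184),
    ("CSV / Spreadsheet Cleanup", 132, 158),
    ("ETL Scripting", 128, 162),
    ("JSON Extraction", 120, 141),
    ("Bulk Data Labeling", 117, 123),
    ("OCR / Document Parsing", 131, 150),
    ("Table Extraction from PDFs", 131, 150),
    ("Long-Document Summarization", 136, 166),
    ("Short-Form Summarization", 112, 123),
    ("Blog Post Writing", 120, 145),
]


def tally(scores):
    """Count wins per model and record each task's margin (Claude minus GPT)."""
    wins = {"GPT-5.3-Codex": 0, "Claude Sonnet 4.6": 0, "tie": 0}
    margins = {}
    for task, gpt, claude in scores:
        margins[task] = claude - gpt
        if claude > gpt:
            wins["Claude Sonnet 4.6"] += 1
        elif gpt > claude:
            wins["GPT-5.3-Codex"] += 1
        else:
            wins["tie"] += 1
    return wins, margins


wins, margins = tally(SCORES)

# Tasks with the smallest and largest absolute score gap.
closest = min(margins, key=lambda task: abs(margins[task]))
widest = max(margins, key=lambda task: abs(margins[task]))

print(wins)
print("closest:", closest, margins[closest])
print("widest:", widest, margins[widest])
```

On these numbers, Claude Sonnet 4.6 scores higher on all 20 tasks. The narrowest gap is Code Completion (2 points) and the widest is Data Analysis (60 points).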