StepFun: Step 3.7 Flash vs Mistral: Mistral Medium 3.5

Side-by-side comparison of specs, pricing, benchmark scores, and task rankings. Updated 2026-06-12.

Spec                StepFun: Step 3.7 Flash   Mistral: Mistral Medium 3.5
Vendor              stepfun                   mistralai
Quality Score       100                       100
Benchmark Score     74.4                      67.9
Input Price         $0.20/M                   $1.50/M
Output Price        $1.15/M                   $7.50/M
Context Window      256,000                   262,144
Max Output          256,000                   -
Tool Calling
Structured Output
Reasoning Mode
Vision
Audio               -                         -
Benchmark Scores
ai_index            70.3                      64.7
ai_index_agentic    98.2                      87.7
ai_index_coding     61.2                      58.4
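
To make the price gap concrete, the listed rates can be turned into a per-request cost. The sketch below assumes that "/M" means per million tokens and uses a hypothetical workload of 10,000 input tokens and 2,000 output tokens; the function name and the workload are illustrative and do not come from either vendor.

```python
# Prices in USD, copied from the spec table above.
# Assumption: "/M" means "per one million tokens".
PRICES = {
    "StepFun: Step 3.7 Flash": {"input": 0.20, "output": 1.15},
    "Mistral: Mistral Medium 3.5": {"input": 1.50, "output": 7.50},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million rates."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000 input tokens, 2,000 output tokens.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 10_000, 2_000):.4f}")
```

Under these assumptions the request costs about $0.0043 on Step 3.7 Flash and $0.0300 on Mistral Medium 3.5, roughly a sevenfold difference. The ratio shifts with the input/output mix, so it is worth rerunning with your own token counts.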

Who wins by task?

Task                          StepFun: Step 3.7 Flash   Mistral: Mistral Medium 3.5
SQL Generation                163                       160
Code Review                   156                       154
Code Completion               130                       116
Code Refactoring              151                       149
Bug Fixing                    171                       168
Unit Test Generation          146                       144
Code Documentation            136                       134
Regex Writing                 135                       132
CI/CD Pipelines               137                       136
Frontend Component Design     142                       141
Data Analysis                 166                       162
CSV / Spreadsheet Cleanup     143                       141
ETL Scripting                 144                       142
JSON Extraction               143                       133
Bulk Data Labeling            133                       123
OCR / Document Parsing        139                       139
Table Extraction from PDFs    139                       139
Long-Document Summarization   148                       146
Short-Form Summarization      131                       121
Blog Post Writing             135                       133

Scores reflect capability match, benchmark data, and pricing for each task.
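
The task table can be summarized by counting which model scores higher on each row. The sketch below is a simple tally over the scores copied from the table; the `tally` and `largest_gap` helpers are illustrative names, and the comparison says nothing about how the page computes the scores themselves.

```python
# (task, Step 3.7 Flash score, Mistral Medium 3.5 score), copied from the table above.
SCORES = [
    ("SQL Generation", 163, 160),
    ("Code Review", 156, 154),
    ("Code Completion", 130, 116),
    ("Code Refactoring", 151, 149),
    ("Bug Fixing", 171, 168),
    ("Unit Test Generation", 146, 144),
    ("Code Documentation", 136, 134),
    ("Regex Writing", 135, 132),
    ("CI/CD Pipelines", 137, 136),
    ("Frontend Component Design", 142, 141),
    ("Data Analysis", 166, 162),
    ("CSV / Spreadsheet Cleanup", 143, 141),
    ("ETL Scripting", 144, 142),
    ("JSON Extraction", 143, 133),
    ("Bulk Data Labeling", 133, 123),
    ("OCR / Document Parsing", 139, 139),
    ("Table Extraction from PDFs", 139, 139),
    ("Long-Document Summarization", 148, 146),
    ("Short-Form Summarization", 131, 121),
    ("Blog Post Writing", 135, 133),
]


def tally(rows):
    """Count per-task wins for each model, plus ties."""
    counts = {"step": 0, "mistral": 0, "tie": 0}
    for _task, step, mistral in rows:
        if step > mistral:
            counts["step"] += 1
        elif mistral > step:
            counts["mistral"] += 1
        else:
            counts["tie"] += 1
    return counts


def largest_gap(rows):
    """Return (task, gap) for the row where Step leads Mistral by the most."""
    task, step, mistral = max(rows, key=lambda r: r[1] - r[2])
    return task, step - mistral


if __name__ == "__main__":
    print(tally(SCORES))        # {'step': 18, 'mistral': 0, 'tie': 2}
    print(largest_gap(SCORES))  # ('Code Completion', 14)
```

On these numbers Step 3.7 Flash leads on 18 of the 20 tasks and ties on the two document-parsing tasks. Most margins are one to four points, with Code Completion the widest at 14.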
