StepFun: Step 3.7 Flash vs Mistral: Mistral Small 4

Side-by-side comparison of specs, pricing, benchmark scores, and task rankings. Updated 2026-06-12.

                      StepFun: Step 3.7 Flash    Mistral: Mistral Small 4
Vendor                stepfun                    mistralai
Quality Score         100                        100
Benchmark Score       74.4                       7.1
Input Price           $0.20/M                    $0.15/M
Output Price          $1.15/M                    $0.60/M
Context Window        256,000                    262,144
Max Output            256,000                    -
Tool Calling
Structured Output
Reasoning Mode
Vision
Audio                 -                          -

Benchmark Scores
ai_index              70.3                       16.8
ai_index_agentic      98.2                       -
ai_index_coding       61.2                       -
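
The listed prices can be turned into a per-workload cost estimate. The sketch below assumes that "/M" means USD per million tokens and that input and output tokens are billed linearly at the listed rates. The function name `request_cost` and the example workload (2,000,000 input tokens, 500,000 output tokens) are hypothetical and chosen only for illustration.

```python
# Prices in USD per million tokens, copied from the comparison above.
# Assumption: "/M" means "per 1,000,000 tokens" and billing is linear.
PRICES = {
    "StepFun: Step 3.7 Flash": {"input": 0.20, "output": 1.15},
    "Mistral: Mistral Small 4": {"input": 0.15, "output": 0.60},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 0.5M output tokens.
    for model in PRICES:
        cost = request_cost(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:.3f}")
```

Because the output price gap ($1.15/M vs $0.60/M) is much larger than the input price gap, the difference in total cost grows with the share of output tokens in the workload.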

Who wins by task?

Task                           StepFun: Step 3.7 Flash    Mistral: Mistral Small 4
SQL Generation                 163                        133
Code Review                    156                        129
Code Completion                130                        129
Code Refactoring               151                        130
Bug Fixing                     171                        133
Unit Test Generation           146                        123
Code Documentation             136                        127
Regex Writing                  135                        121
CI/CD Pipelines                137                        119
Frontend Component Design      142                        124
Data Analysis                  166                        126
CSV / Spreadsheet Cleanup      143                        129
ETL Scripting                  144                        124
JSON Extraction                143                        131
Bulk Data Labeling             133                        129
OCR / Document Parsing         139                        129
Table Extraction from PDFs     139                        129
Long-Document Summarization    148                        132
Short-Form Summarization       131                        124
Blog Post Writing              135                        121

Scores combine capability match, benchmark data, and pricing for each task.
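
The per-task scores can be summarized by counting wins and looking at the margin on each task. The sketch below copies the scores from the task table (first value is StepFun: Step 3.7 Flash, second is Mistral: Mistral Small 4). The helper `summarize` is a hypothetical name, and the margin is simply the difference between the two scores. It does not reproduce the site's scoring method.

```python
# Task scores copied from the table: (Step 3.7 Flash, Mistral Small 4).
TASK_SCORES = {
    "SQL Generation": (163, 133),
    "Code Review": (156, 129),
    "Code Completion": (130, 129),
    "Code Refactoring": (151, 130),
    "Bug Fixing": (171, 133),
    "Unit Test Generation": (146, 123),
    "Code Documentation": (136, 127),
    "Regex Writing": (135, 121),
    "CI/CD Pipelines": (137, 119),
    "Frontend Component Design": (142, 124),
    "Data Analysis": (166, 126),
    "CSV / Spreadsheet Cleanup": (143, 129),
    "ETL Scripting": (144, 124),
    "JSON Extraction": (143, 131),
    "Bulk Data Labeling": (133, 129),
    "OCR / Document Parsing": (139, 129),
    "Table Extraction from PDFs": (139, 129),
    "Long-Document Summarization": (148, 132),
    "Short-Form Summarization": (131, 124),
    "Blog Post Writing": (135, 121),
}


def summarize(scores):
    """Return (step_wins, widest_task, narrowest_task, margins)."""
    # Positive margin means Step 3.7 Flash scored higher on that task.
    margins = {task: step - mistral for task, (step, mistral) in scores.items()}
    step_wins = sum(1 for m in margins.values() if m > 0)
    widest = max(margins, key=margins.get)
    narrowest = min(margins, key=margins.get)
    return step_wins, widest, narrowest, margins


if __name__ == "__main__":
    wins, widest, narrowest, margins = summarize(TASK_SCORES)
    print(f"Step 3.7 Flash leads on {wins} of {len(TASK_SCORES)} tasks")
    print(f"Widest margin: {widest} (+{margins[widest]})")
    print(f"Narrowest margin: {narrowest} (+{margins[narrowest]})")
```

On these numbers, Step 3.7 Flash leads on all 20 tasks, with the widest gap on Data Analysis (40 points) and the narrowest on Code Completion (1 point).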
