StepFun: Step 3.7 Flash

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

Quality Score: 100/100 (price + capability + benchmarks)
Input Price: $0.20 per 1M tokens
Output Price: $1.15 per 1M tokens
Context Window: 256,000 tokens
Model ID: stepfun/step-3.7-flash
Vendor: stepfun
Tokenizer: Other
Input Modalities: text, image, video
Output Modalities: text
Max Output: 256,000 tokens
Tool Calling: ✓ supported
Structured Output: ✓ supported
Reasoning Mode: ✓ supported
Vision: ✓ accepts images
Audio: no
Moderated: no
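
The per-token prices listed above can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the helper name `estimate_cost` is hypothetical, and it assumes billing is a simple linear function of input and output token counts at the listed rates, with no caching discounts, no separate rate for reasoning tokens, and no per-image or per-video charges. The listing does not confirm any of this.

```python
# Prices copied from the listing, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 1.15


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes purely linear pricing at the listed rates; actual billing
    may differ.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost(100_000, 2_000):.4f}")
```

Under these assumptions, output tokens cost 5.75 times as much as input tokens, so long generations tend to dominate the bill even when the prompt is large.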