ByteDance Seed: Seed-2.0-Mini

Seed-2.0-Mini is a multimodal model from ByteDance that accepts text, image, and video inputs, giving it broader modality coverage than many models in its price tier. It supports tool use and reasoning, offers a 262,144-token context window, and can produce up to 131,072 output tokens in a single completion. The spec sheet lists structured output as supported, but that support is unconfirmed. Pricing is $0.10 per million input tokens and $0.40 per million output tokens.

The case for shortlisting it is mostly economic and exploratory. At that price it undercuts many comparable multimodal models, which makes it worth testing for cost-sensitive workloads involving long documents or video content. The significant caveat is that there is currently no independent benchmark coverage, so its performance relative to peers is unproven. Buyers who need verified quality benchmarks before committing should wait for third-party evaluations; those comfortable running their own evals may find the pricing worth a trial.
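To make the pricing concrete, the following is a minimal sketch of a per-request cost estimate using the listed rates of $0.10 per million input tokens and $0.40 per million output tokens. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the sketch assumes billing is strictly linear in token count with no minimums, caching discounts, or separate rates for image or video input, none of which the listing confirms.

```python
from decimal import Decimal

# Rates taken from the listing (USD per 1M tokens).
INPUT_USD_PER_MILLION = Decimal("0.10")
OUTPUT_USD_PER_MILLION = Decimal("0.40")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request, assuming purely linear per-token billing.

    Hypothetical helper: real invoices may differ if the provider applies
    discounts, minimums, or modality-specific rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a long document (200,000 input tokens) with a 4,000-token answer.
    print(estimate_cost_usd(200_000, 4_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up exactly when summed over many requests.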

Quality Score: 100/100 (price + capability + benchmarks)
Input Price: $0.10 per 1M tokens
Output Price: $0.40 per 1M tokens
Context Window: 262,144 tokens
Model ID: bytedance-seed/seed-2.0-mini
Vendor: bytedance-seed
Tokenizer: Other
Input Modalities: text, image, video
Output Modalities: text
Max Output: 131,072 tokens
Tool Calling: ✓ supported
Structured Output: ✓ supported
Reasoning Mode: ✓ supported
Vision: ✓ accepts images
Audio: no
Moderated: no
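
The context window and maximum output figures interact when planning long-document requests. The sketch below computes how large a completion can be requested for a given prompt size. It assumes that prompt and completion tokens share the 262,144-token context window, which is common but not confirmed by this listing; the helper name `max_completion_budget` is hypothetical.

```python
# Limits taken from the listing.
CONTEXT_WINDOW_TOKENS = 262_144
MAX_OUTPUT_TOKENS = 131_072


def max_completion_budget(prompt_tokens: int) -> int:
    """Return the largest completion size that fits for a given prompt.

    Assumption (not confirmed by the listing): prompt and completion tokens
    together must fit inside the context window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    # Cap at the per-completion limit and never return a negative budget.
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))


if __name__ == "__main__":
    for prompt in (10_000, 200_000, 262_144):
        print(prompt, max_completion_budget(prompt))
```

Under this assumption, the full 131,072-token output is only available while the prompt stays at or below 131,072 tokens; beyond that, the completion budget shrinks one for one with prompt length.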