ByteDance Seed: Seed-2.0-Mini
Seed-2.0-Mini is a multimodal model from ByteDance that accepts text, image, and video inputs and returns text, giving it broader modality coverage than many models in its price tier. It supports tool use and reasoning, offers a 262,144-token context window, and can produce up to 131,072 output tokens in a single completion. Structured output is listed as supported, but that support has not been independently confirmed. Pricing is $0.10 per million input tokens and $0.40 per million output tokens.

The case for shortlisting it is mostly economic and exploratory. At that price it undercuts many comparable multimodal models, which makes it worth testing for cost-sensitive workloads involving long documents or video content. The significant caveat is that there is currently no independent benchmark coverage, so its performance relative to peers is unproven. Buyers who need verified quality benchmarks before committing should wait for third-party evaluations; those comfortable running their own evals may find the pricing worth a trial.
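To make the pricing concrete, the per-request cost can be estimated from token counts. The sketch below uses only the two prices quoted above. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes every token is billed at the flat listed rate; how reasoning tokens or image and video inputs are metered is not stated here and may differ.

```python
from decimal import Decimal

# Prices quoted in this listing, in USD per one million tokens.
INPUT_PRICE_PER_MILLION = Decimal("0.10")
OUTPUT_PRICE_PER_MILLION = Decimal("0.40")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD.

    Hypothetical helper: assumes flat per-token billing at the listed
    rates, with no caching discounts or modality-specific surcharges.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a long-document request with a short answer.
    print(estimate_cost_usd(input_tokens=200_000, output_tokens=4_000))
```

Because output tokens cost four times as much as input tokens, workloads that generate long completions are dominated by the output side, while long-document or video summarization is dominated by the input side.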
- Model ID: bytedance-seed/seed-2.0-mini
- Vendor: bytedance-seed
- Tokenizer: Other
- Input Modalities: text, image, video
- Output Modalities: text
- Max Output: 131,072 tokens
- Tool Calling: ✓ supported
- Structured Output: ✓ supported
- Reasoning Mode: ✓ supported
- Vision: ✓ accepts images
- Audio: no
- Moderated: no