ByteDance Seed: Seed-2.0-Lite
Seed-2.0-Lite is a multimodal model from ByteDance that accepts text, image, and video inputs, making it one of the few models in its price tier that can process video alongside other content types. It supports tool use and reasoning, has a 262,144-token context window, and allows up to 131,072 completion tokens. The specification list below marks structured output as supported, but it is worth confirming against your own schemas before building pipelines that depend on it. At $0.25 per million input tokens and $2.00 per million output tokens, the pricing is competitive, particularly for high-volume or long-context workloads. The core tradeoff is thin independent benchmark coverage: the model ranks in only one category here, so its performance on standard tasks is largely unverified. Buyers who need video understanding at a low input cost may find it worth evaluating, but teams that require validated accuracy before deployment should treat Seed-2.0-Lite as unproven until more third-party results are available.
- Model ID: bytedance-seed/seed-2.0-lite
- Vendor: bytedance-seed
- Tokenizer: Other
- Input Modalities: text, image, video
- Output Modalities: text
- Max Output: 131,072 tokens
- Tool Calling: ✓ supported
- Structured Output: ✓ supported
- Reasoning Mode: ✓ supported
- Vision: ✓ accepts images
- Audio: no
- Moderated: no
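To make the pricing and limits quoted above concrete, here is a minimal Python sketch that estimates the cost of a single request and checks it against the listed token limits. The function names `estimate_cost_usd` and `fits_limits` are hypothetical helpers, not part of any vendor SDK. The sketch also rests on two assumptions: that completion tokens count against the 262,144-token context window together with the prompt, and that image and video inputs are billed at the same per-token input rate as text. Neither is confirmed by this listing.

```python
# Figures quoted in this listing.
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 2.00
CONTEXT_WINDOW_TOKENS = 262_144
MAX_OUTPUT_TOKENS = 131_072


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from the listed per-million-token rates.

    Assumption: every input token (text, image, or video) is billed at the
    same input rate; the listing does not confirm this.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


def fits_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the listed output cap and context window.

    Assumption: prompt and completion tokens share the context window.
    """
    return (output_tokens <= MAX_OUTPUT_TOKENS
            and input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS)


if __name__ == "__main__":
    # A long-context request: 200,000 prompt tokens, 4,000 completion tokens.
    print(f"${estimate_cost_usd(200_000, 4_000):.3f}")  # $0.058
    print(fits_limits(200_000, 4_000))                  # True
```

Because output tokens cost eight times as much as input tokens, workloads with long prompts and short completions (such as tagging or classification) benefit most from this pricing.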
Category rankings
Where ByteDance Seed: Seed-2.0-Lite places across the one category it ranks in.
| Rank | Category | Score |
|---|---|---|
| #15 of 25 | Video Auto-Tagging (Video) | 123 |