
ByteDance Seed: Seed-2.0-Lite

Seed-2.0-Lite is a multimodal model from ByteDance that accepts text, image, and video inputs and produces text output, making it one of the few models in its price tier that can process video alongside other content types. It supports tool use and reasoning, has a 262,144-token context window, and allows up to 131,072 completion tokens. Structured output is marked as supported in the listing, but that support is unconfirmed, so verify it before building pipelines that depend on it. At $0.25 per million input tokens and $2.00 per million output tokens, the pricing is competitive, particularly for high-volume or long-context workloads where input tokens dominate the bill. The core tradeoff is that there is currently no independent benchmark coverage, so performance on standard tasks is unverified. Buyers who need video understanding at a low input cost may find it worth evaluating, but teams that require validated accuracy before deployment should treat Seed-2.0-Lite as unproven until third-party results are available.
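To make the pricing concrete, here is a minimal sketch of a per-request cost estimate at the listed rates. The function name and the example workload are hypothetical, and the sketch assumes every input token (including any tokens derived from images or video) is billed at the flat input rate, which the listing does not state.

```python
# Listed prices in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 2.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed prices.

    Assumption: all input tokens are billed at the same flat rate.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical long-context request: 200,000 prompt tokens, 4,000 reply tokens.
print(f"${estimate_cost_usd(200_000, 4_000):.4f}")  # prints $0.0580
```

In this example the 200,000 input tokens cost $0.05 and the 4,000 output tokens cost $0.008. Because output tokens are priced eight times higher than input tokens, the balance shifts quickly for workloads with long completions.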

Quality Score: 100/100 (price + capability + benchmarks)
Input Price: $0.25 per 1M tokens
Output Price: $2.00 per 1M tokens
Context Window: 262,144 tokens
Model ID: bytedance-seed/seed-2.0-lite
Vendor: bytedance-seed
Tokenizer: Other
Input Modalities: text, image, video
Output Modalities: text
Max Output: 131,072 tokens
Tool Calling: ✓ supported
Structured Output: ✓ supported
Reasoning Mode: ✓ supported
Vision: ✓ accepts images
Audio: no
Moderated: no
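
The context window and max output figures above can be turned into a quick pre-flight check before a request is sent. This is only a sketch: the function name is hypothetical, and it assumes the prompt and the completion share the 262,144-token context window, which is a common convention but is not confirmed by the listing.

```python
# Limits taken from the listing.
CONTEXT_WINDOW = 262_144
MAX_OUTPUT = 131_072


def fits_limits(prompt_tokens: int, max_completion_tokens: int) -> bool:
    """Return True if a request stays within the listed limits.

    Assumption: prompt and completion tokens together must fit in the
    context window; the listing does not say how the limits interact.
    """
    if prompt_tokens < 0 or max_completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if max_completion_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + max_completion_tokens <= CONTEXT_WINDOW


print(fits_limits(131_072, 131_072))  # True: exactly fills the window
print(fits_limits(200_000, 100_000))  # False: 300,000 exceeds 262,144
```

Under that assumption, a request that asks for the full 131,072 completion tokens leaves room for a prompt of at most 131,072 tokens.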

Category rankings

Where ByteDance Seed: Seed-2.0-Lite places across the 1 category it ranks in.

#     Category                                     Score
#15   Video Auto-Tagging (Video · of 25 ranked)    123

Similar models