Polylabs

ByteDance

UI-TARS 7B

Updated: June 2026

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Developed by ByteDance, it extends the UI-TARS framework with reinforcement...

Specifications

Context: 128K
Input: $0.1/M
Output: $0.2/M
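
As a rough illustration of what these prices mean per request, the sketch below estimates the cost of a single call. It assumes that "/M" means per one million tokens, that prices are in US dollars, and that input and output tokens are billed separately at the listed rates. The function name estimate_cost_usd and the example token counts are hypothetical and not part of any provider API.

```python
# Listed prices, assumed to be USD per one million tokens.
INPUT_PRICE_PER_M = 0.1
OUTPUT_PRICE_PER_M = 0.2


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 50,000-token prompt (e.g. screenshots plus history) and a 2,000-token reply.
    print(f"${estimate_cost_usd(50_000, 2_000):.4f}")
```

Under these assumptions, input tokens dominate the bill for screenshot-heavy agent workloads, since the prompt is usually far longer than the reply.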

Capabilities

VISION, TEXT, CODING, THINKING, WRITING

Similarly Priced Models

Model | Provider | Context | Input Price | Output Price
Qwen3.5-9B | Qwen | 262K | $0.1/M | $0.15/M
ByteDance Seed: Seed-2.0-Mini | ByteDance | 262K | $0.1/M | $0.4/M
Qwen3.5 Flash | Qwen | 1M | $0.1/M | $0.4/M
Ministral 3 3B 2512 | MistralAI | 131K | $0.1/M | $0.1/M
Voxtral Small 24B 2507 | MistralAI | 32K | $0.1/M | $0.3/M
