Introducing TraceVLA: a fully open-source Vision-Language-Action model reimagining spatial-temporal awareness:
✨ 3.5x gains on real robots, SOTA in simulation
💡 Fine-tunes on just 150K trajectories
⚡ Compact 4B model = 7B performance

39,378 views • 1 year ago • via X (Twitter)

11 Comments

Yongyuan Liang • 1 year ago

We introduce visual trace prompting:
🔹 Track robot's movement via point-tracking (Co-Tracker)
🔹 Overlay traces on observations

Model processes:
1️⃣ Original view (preserve full info)
2️⃣ View with traces as prompts

A simple yet powerful technique to boost VLA's spatial understanding
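For readers who want a concrete picture of the overlay step, here is a minimal Python sketch. It is not the TraceVLA implementation: it assumes the point tracks have already been produced by a tracker such as Co-Tracker (no tracker is called here), and the names `overlay_traces` and `build_model_inputs` are hypothetical. Frames are plain nested lists of RGB tuples to keep the example dependency-free; a real pipeline would more likely use image arrays.

```python
import math

def overlay_traces(frame, tracks, color=(255, 0, 0)):
    """Return a copy of `frame` with each track drawn as a polyline.

    frame:  list of rows, each row a list of (r, g, b) tuples.
    tracks: list of tracks; each track is a list of (x, y) pixel
            positions of one tracked point over time (assumed input).
    """
    out = [list(row) for row in frame]  # copy rows so the original stays intact
    height, width = len(out), len(out[0])
    for track in tracks:
        # Connect consecutive positions of the same point.
        for (x0, y0), (x1, y1) in zip(track, track[1:]):
            steps = max(1, math.ceil(max(abs(x1 - x0), abs(y1 - y0))))
            for i in range(steps + 1):
                t = i / steps
                x = round(x0 + t * (x1 - x0))
                y = round(y0 + t * (y1 - y0))
                if 0 <= x < width and 0 <= y < height:
                    out[y][x] = color
    return out

def build_model_inputs(frame, tracks):
    """Pair the untouched observation with its trace-overlaid copy."""
    return frame, overlay_traces(frame, tracks)

# Usage: one tracked point moving right, then down, on a blank 8x8 frame.
frame = [[(0, 0, 0)] * 8 for _ in range(8)]
tracks = [[(1, 1), (5, 1), (5, 4)]]
original, traced = build_model_inputs(frame, tracks)
```

The sketch returns both images because the thread describes the model as receiving the original view alongside the view with traces, so no information is lost to the overlay.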

Yongyuan Liang • 1 year ago

TraceVLA in action: Watch it excel at diverse manipulation tasks on a real WidowX-250 robot! From soft-object handling to precision pick-and-place, TraceVLA consistently outperforms OpenVLA in both in-distribution and out-of-distribution tasks.

Yongyuan Liang • 1 year ago

Superior simulation results: On Google’s SimplerEnv robot tasks, TraceVLA outshines OpenVLA across all metrics in both 7B and 4B versions!

🚀 20% boost in handling:
▪️ Camera changes
▪️ Distractors
▪️ Varied visual backgrounds

Yongyuan Liang • 1 year ago

Efficient and lightweight:
🔸 TraceVLA requires <10GB memory on 8 H100 GPUs
🔸 Adds only 0.036s per timestep

A powerful VLA upgrade with minimal overhead!

Yongyuan Liang • 1 year ago

Available resources include:
▫️ 7B TraceVLA checkpoints
▫️ Lightweight 4B Phi3V-OpenVLA & TraceVLA models
▫️ Fine-tuned TraceVLA models

💻 Code:
🤗 Models:

Try TraceVLA family models today!

Yongyuan Liang • 1 year ago

Check out our project page:
ArXiv:

Joint work with @ruijie_zheng12 @ShuaiyiH @JianfengGao0217 @haldaume3 @Andrey__Kolobov @furongh @jw2yang4ai

Mu Cai @ Industry Job Market • 1 year ago

Congratulations! Really interesting work on applying visual prompts on VLA tasks!

Yongyuan Liang • 1 year ago

Thanks!!!

Dmytro Kuzmenko • 1 year ago

thank you very much for sharing, great idea and rather impressive results!
