Introducing TAPIR & RoboTAP, our latest research from Google DeepMind. It focuses on spatial intelligence via point tracking, outlining how it enables applications from robotics to video generation to augmented reality, and more!

47,833 views • 2 years ago • via X (Twitter)

9 comments

Carl Doersch • 2 years ago

Our robotic system can learn industry-relevant tasks from 4-6 demonstrations. Above, at each moment, the system automatically identifies which points must move (red) and where they must move to (cyan) to complete the task. Below, we show points as discovered from demos.

Carl Doersch • 2 years ago

In video generation, we demonstrate a system which first generates motions and then generates pixels to match those motions, leading to generated videos containing complex motions while keeping textures consistent over time.

Carl Doersch • 2 years ago

Powering it all is TAPIR, our open-source model which can track with high quality and in real time. Newly-released is our unsupervised clustering code, which lets you segment moving objects automatically from videos. Try it at:

Carl Doersch • 2 years ago

Joint work with @yangyi02, Mel Vecerik, @joaocarreira @tdavchev, @JonathanScholz2, Andrew Zisserman, @yusufaytar, Stannis Zhou, @dilaragoekay, Ankush Gupta, @LourdesAgapito, @RaiaHadsell

Lucas Beyer (bl16) • 2 years ago

@GoogleDeepMind This « points need to move » is a pretty cool way of formalizing the task, congrats!

Get off X! @ChuckBaggett Chuck Baggett • 2 years ago

@GoogleDeepMind

Marcel Hussing • 2 years ago

@GoogleDeepMind This is a great video visualization! The moving points immediately made me think about algorithms classes. 😁

We'llmakeitbrahs • 2 years ago

@GoogleDeepMind Will the code for RoboTAP be open-sourced as well?

Sam • 2 years ago

@DynamicWebPaige @GoogleDeepMind New GPU architecture when? Lol
