Diffusion has shown great promise for generating robot **actions**. Can it also act as a **world model** that generates the future conditioned on actions? In our work, led by Han Qi and Haocheng Yin in collaboration with Yilun Du, we show that a **controllable** action-conditioned video diffusion model can produce photorealistic...
38,390 views • 1 year ago • via X (Twitter)
Comments: 9

@hanqi359246 @hcy1n @du_yilun This is a huge step forward! Using diffusion models as world models for action-conditioned predictions could revolutionize robotics. Excited to see how this improves policy learning and control.

@hanqi359246 @hcy1n @du_yilun 😮

@hanqi359246 @hcy1n @du_yilun When I see this I think 3D printer control.

@hanqi359246 @hcy1n @du_yilun Melt the glaciers

@hanqi359246 @hcy1n @du_yilun 😯

@hanqi359246 @hcy1n @du_yilun cool work!

@hanqi359246 @hcy1n @du_yilun Seems a bit wasteful (for compute) to plan in image space. Could we adapt this with V-JEPA, which gives us video prediction in a latent space? Or is there a benefit to images?

@hanqi359246 @hcy1n @du_yilun Great comment. Prediction in latent space should definitely be the way forward. Perhaps not just latent space, but more structured representations that are object-centric/semantic. Images may be just a showcase of what is possible and a first step.
