Presenting DemoDiffusion: an extremely simple approach that enables a pre-trained 'generalist' diffusion policy to follow a human demonstration for a novel task during inference. One-shot human imitation *without* requiring any paired human-robot data or online RL 🙂 1/n
32,830 views • 11 months ago • via X (Twitter)
Comments: 8

The key insight of DemoDiffusion is to start the denoising process for the diffusion policy from the re-targeted human hand trajectory (instead of starting from pure noise). This simple approach doesn't require fine-tuning/updating the diffusion policy in any way! 2/n
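
To make the warm-start idea above concrete, here is a minimal toy sketch, not the authors' implementation. Everything in it is an assumption for illustration: the names `warm_start_denoise` and `toy_denoise_step`, the linear noise interpolation, and the toy "policy" that simply shrinks actions toward zero. A real pre-trained policy would replace `toy_denoise_step`, and the thread does not say how the starting noise level is chosen.

```python
import numpy as np

def toy_denoise_step(actions, t):
    """Hypothetical stand-in for one reverse step of a frozen policy.

    It pulls the sample 10% of the way toward a zero-mean action prior.
    The step index t is unused in this toy, but a real policy would
    condition on it (and on observations).
    """
    return 0.9 * actions

def warm_start_denoise(human_traj, denoise_step, start_step, total_steps, rng):
    """Noise the retargeted human trajectory, then denoise from there.

    start_step == total_steps reduces to starting from pure noise;
    start_step == 0 returns the human trajectory unchanged.
    """
    # Assumed linear schedule: fraction of the demo kept at start_step.
    alpha = 1.0 - start_step / total_steps
    noise = rng.standard_normal(human_traj.shape)
    actions = alpha * human_traj + (1.0 - alpha) * noise
    # Run only the remaining reverse steps; the policy is never updated.
    for t in range(start_step, 0, -1):
        actions = denoise_step(actions, t)
    return actions

# Retargeted hand trajectory: 10 timesteps of a 2-D action (made-up data).
demo = np.linspace(0.0, 1.0, 20).reshape(10, 2)
rng = np.random.default_rng(0)
robot_actions = warm_start_denoise(demo, toy_denoise_step, 4, 10, rng)
```

In this sketch, a smaller `start_step` keeps the output closer to the human demonstration, while a larger one gives the policy more freedom to reshape it.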

Results show that DemoDiffusion can perform tasks that the pre-trained diffusion policy (pi-0) fails at zero-shot, just from one human demonstration of the task! 3/n

We even see zero-shot generalization to objects different from the ones used in the human demonstration! This suggests DemoDiffusion is able to exploit the semantic/spatial generalization of the pre-trained diffusion policy, while guiding it based on the human demo. 4/n

DemoDiffusion is made possible by @sungj1026's amazing lead and @shubhtuls's precise insights on diffusion models @CMU_Robotics. Code, Videos, Paper: (finally, thanks to @physical_int for pi0 and @geopavlakos @JitendraMalikCV et al. for HaMeR) n/n

@shubhtuls @CMU_Robotics @physical_int @geopavlakos @JitendraMalikCV Also check out this alternate thread from @sungj1026 on DemoDiffusion (n+1)/n

Nice work! Warm-starting the denoising process with a human prior is very smart.

Perhaps true mastery lies in effortless adaptation, not rigid programming.

That's clever, skipping the fine-tuning part is a flex.
