
Yining Hong
@yining_hong • 4,227 subscribers
💻Postdoc in CS AI @stanford | 🤖3D-LLMs | embodied world models | Test-Time Training | Musician -🎸Multi-Instrumentalist & Composer | Metalhead 🤘🏼
Videos

Excited to share ESI-BENCH, a benchmark for Embodied Spatial Intelligence! Most spatial reasoning benchmarks assume an oracle observer: the agent is given the right image, view, or 3D scene. But in the real world, the observer is also an actor. To understand space, agents must decide where to look, how to move, and when to interact, to reveal what is hidden: occlusions, containment, contact, dynamics, and functionality. In many cases, the hard part is not perception itself, but choosing the right action to make informative perception possible. ESI-BENCH tests this perception-action loop. Agents receive an egocentric observation and a spatial question, then must actively gather evidence through perception, locomotion, and manipulation before answering. The benchmark spans 10 task categories, 29 subcategories, and 3,081 instances, built in BEHAVIOR-1K across realistic interactive scenes. 🌍Webpage: 💻Code & data: Thanks to my collaborators: Jiageng, Han, Manling Li, Leonidas Guibas, Fei-Fei Li, Jiajun Wu, Yejin Choi
Yining Hong • 48,170 views • 29 days ago

Meet Embodied Web Agents, which bridge the physical and digital realms. Imagine embodied agents that can search for online recipes, shop for ingredients, and cook for you. Embodied web agents search the internet for the information they need to carry out real-world embodied tasks. All data, code, and web environments are available at Paper link:
Yining Hong • 53,882 views • 1 year ago