Imitation learning has a data scarcity problem. Introducing EgoDex from Apple, the largest and most diverse dataset of dexterous human manipulation to date: 829 hours of egocentric video + paired 3D hand poses across 194 tasks. Now on arXiv: (1/4)
113,815 views • 1 year ago • via X (Twitter)
11 Comments

Unlike teleoperation, egocentric video is passively scalable - like text and images on the Internet. We use Apple Vision Pro to collect video + precise pose annotations (unlike Ego4D, which lacks native pose data). This unlocks 5x the scale of existing large datasets like DROID.

We also propose new benchmarks and train imitation learning policies for dexterous trajectory prediction. Below are 30 Hz wrist and fingertip trajectories on the test set, where blue = ground truth, red = model predictions, and points get lighter up to 2 seconds in the future.
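For anyone who wants to reproduce that kind of plot, here is a minimal sketch of the windowing and fade logic only, not the authors' code. It assumes a trajectory is a plain list of per-frame points; the function name `future_window`, the linear fade, and the 0.2 opacity floor are all hypothetical choices made for illustration.

```python
def future_window(trajectory, start, hz=30, horizon_s=2.0):
    """Return (points, alphas) for the frames that follow `start`.

    trajectory: list of per-frame points (e.g. (x, y, z) tuples); assumed layout.
    hz, horizon_s: 30 Hz and 2 seconds, as stated in the post.
    """
    steps = int(round(hz * horizon_s))  # 30 Hz * 2 s = 60 future frames
    points = trajectory[start + 1 : start + 1 + steps]
    # Opacity fades linearly from 1.0 (next frame) to 0.2 (end of the horizon).
    # Both the linear schedule and the 0.2 floor are assumptions.
    alphas = [1.0 - 0.8 * i / max(steps - 1, 1) for i in range(len(points))]
    return points, alphas


# Usage: a dummy 100-frame trajectory moving along x.
dummy = [(0.01 * t, 0.0, 0.0) for t in range(100)]
points, alphas = future_window(dummy, start=0)
```

The `alphas` list can then be handed to whatever plotting tool you use as per-point opacity, once for the ground truth in blue and once for the predictions in red.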

The full dataset is now publicly available to the community; access details are in the paper. Sample code for data loading is coming soon. Enjoy!

Perfect for Optimus to learn new skills

Fyi the dataset links don't work: "NoSuchKey: The specified key does not exist. datasets/egodex/[filename].zip 9C2FBJJ7FJHKHDT3Uk3l1oHoR9NeaNJdC7gInDjt5u8slFtW5lRt9wFR0MQIWNXIk4sTWiLGEYF22KUPQQ9X6CVC+UU="

Looks great. You mention that it’s now public but I can’t find the link anywhere.

excellent work, congrats!

Congrats Ryan! Awesome work as always!!

Impressive! Interesting use of the Apple Vision Pro.

So hype

