TidyBot: Personalized Robot Assistance with Large Language Models

approach enables fast adaptation and achieves 91.2% accuracy on unseen objects in our benchmark dataset. We also demonstrate our approach on a real-world mobile manipulator called TidyBot, which successfully puts away 85.0% of objects in real-world test scenarios

abs: project page:...

325,986 views • 3 years ago • via X (Twitter)

7 comments

Jimmy Wu • 3 years ago

Thanks @_akhaliq for sharing our work! I wrote a thread with more details here:

hardmaru • 3 years ago

This is the easy part :) I need a robot that can clean the bits and pieces of jam, bread crumbs, diapers, and occasionally pieces of poo around various corners of the room, under the tables, and hidden in the kids' play area of the house.

hgtp:// Alkimi $ADS $QNT Cat 🐈‍⬛ • 3 years ago

How do I order one??

Defend Intelligence (Anis Ayari) • 3 years ago

Really nice! Thank you for demonstrating this capability. An LLM could then indeed be used as the "reasoning" block to achieve unseen-world reasoning and allow tasks to generalize better when accomplishing them.

Dan Rockwell • 3 years ago

I felt like it was missing something..

St. Clair Newbern IV • 3 years ago

Should be the standard upsell on all children. 😂

Astral Turf • 3 years ago

@ericjang11 Not very impressive really.

Related videos

DisCo: Disentangled Control for Referring Human Dance Generation in Real World

paper page:

Generative AI has made significant strides in computer vision, particularly in image/video synthesis conditioned on text descriptions. Despite the advancements, it remains challenging especially in the generation of human-centric content such as dance synthesis. Existing dance synthesis methods struggle with the gap between synthesized content and real-world dance scenarios. In this paper, we define a new problem setting: Referring Human Dance Generation, which focuses on real-world dance scenarios with three important properties: (i) Faithfulness: the synthesis should retain the appearance of both human subject foreground and background from the reference image, and precisely follow the target pose; (ii) Generalizability: the model should generalize to unseen human subjects, backgrounds, and poses; (iii) Compositionality: it should allow for composition of seen/unseen subjects, backgrounds, and poses from different sources. To address these challenges, we introduce a novel approach, DISCO, which includes a novel model architecture with disentangled control to improve the faithfulness and compositionality of dance synthesis, and an effective human attribute pre-training for better generalizability to unseen humans. Extensive qualitative and quantitative results demonstrate that DISCO can generate high-quality human dance images and videos with diverse appearances and flexible motions.

AK

161,453 views • 2 years ago