🚨 ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning

RGB-D images -> 3D scene graphs -> enhanced with vision-language features.
Enables a wide range of perception and planning abilities. Deployed in many real-world use cases.

75,593 views • 2 years ago • via X (Twitter)

11 comments

Krishna Murthy · 2 years ago

We designed a 3D mapping pipeline that takes in posed RGB-D images and builds an object-based map. Each object in the map is augmented with multi-view fused CLIP features and language descriptions. We then build a scene graph, which can easily be parsed by LLMs.
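To make the object-based map a bit more concrete, here is a minimal sketch of one way such a map could be represented. It is not the ConceptGraphs code: the `ObjectNode` class, its field names, the running-average fusion rule, and the JSON layout are all assumptions made for illustration, and the 3-dimensional vectors are toy stand-ins for real CLIP embeddings.

```python
import json
import math
from dataclasses import dataclass, field


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


@dataclass
class ObjectNode:
    """One object in the map (hypothetical structure, for illustration only)."""
    object_id: int
    caption: str       # language description of the object
    centroid: tuple    # (x, y, z) position in the map frame
    feature: list = field(default_factory=list)  # fused embedding
    num_views: int = 0

    def fuse_view(self, view_feature):
        """Fold one per-view embedding into a running average, renormalized each update."""
        view_feature = l2_normalize(view_feature)
        if self.num_views == 0:
            self.feature = view_feature
        else:
            n = self.num_views
            mean = [(f * n + v) / (n + 1) for f, v in zip(self.feature, view_feature)]
            self.feature = l2_normalize(mean)
        self.num_views += 1


def scene_graph_to_json(nodes, edges):
    """Serialize captions, positions, and relations as text that fits in an LLM prompt."""
    return json.dumps(
        {
            "objects": [
                {"id": n.object_id, "caption": n.caption, "centroid": list(n.centroid)}
                for n in nodes
            ],
            "edges": [
                {"subject": s, "relation": r, "object": o} for s, r, o in edges
            ],
        },
        indent=2,
    )


# Toy data: positions, embeddings, and the "on" relation are made up.
sneakers = ObjectNode(0, "red and white sneakers", (1.2, 0.4, 0.0))
sneakers.fuse_view([1.0, 0.0, 0.0])  # embedding from view 1
sneakers.fuse_view([0.0, 1.0, 0.0])  # embedding from view 2
rack = ObjectNode(1, "shoe rack", (1.3, 0.5, 0.0))
graph_json = scene_graph_to_json([sneakers, rack], [(0, "on", 1)])
print(graph_json)
```

In this sketch the embeddings stay out of the JSON on purpose: the text that goes to an LLM carries only captions, positions, and relations, while the fused vectors remain available for similarity search.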

Krishna Murthy · 2 years ago

Text queries can be abstract or complex. We ask our robot to find "something that goes well with a Ronald McDonald outfit", and it finds a pair of "red and white sneakers".

Krishna Murthy · 2 years ago

We can also identify misplaced objects in the scene graph and search for them. Here, we move the shoes away from their earlier location. The robot then begins searching plausible locations for the red-and-white sneakers, and finds them at a nearby shoe rack.

Krishna Murthy · 2 years ago

Another query: "Something to wear for a space party". The robot finds a t-shirt that has NASA written on it.

Krishna Murthy · 2 years ago

Queries can involve both text and image context. Show the robot a photograph of Michael Jordan and ask it to go to "something this guy would play with". The robot finds a nearby basketball.

Krishna Murthy · 2 years ago

Another example: here, a handwritten note that reads "go to the laundry bag" tells the robot its target location.

Krishna Murthy · 2 years ago

This time, a mobile manipulation task: a Spot mini has to pick up "something healthy to eat". It picks up a plastic mango.

Krishna Murthy · 2 years ago

"My wrist hurts from using my screwdriver all day. Anything to help?" The robot finds a power drill.

Krishna Murthy · 2 years ago

We implement a particle-filter-based localization approach using per-object CLIP features. This helps us relocalize against a prebuilt map, identify and add new objects to the map, and detect objects that no longer exist in the scene.
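For readers unfamiliar with the idea, the following is a rough sketch of how a particle filter might weight pose hypotheses using per-object features. It is not the ConceptGraphs implementation: the 2D position-only pose, the Gaussian distance term, the `sigma` value, the function names, and the 2-dimensional toy embeddings are all assumptions, and the sketch covers relocalization only (not adding new objects or detecting removed ones).

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def particle_weight(particle, observations, map_objects, sigma=0.5):
    """Score a pose hypothesis against the map.

    particle:     (x, y) position hypothesis (orientation omitted for brevity)
    observations: list of ((dx, dy), feature), offsets in the robot frame
    map_objects:  list of ((x, y), feature) from the prebuilt map
    """
    weight = 1.0
    for (dx, dy), obs_feat in observations:
        # Where this observed object would sit in the map if the hypothesis were right.
        px, py = particle[0] + dx, particle[1] + dy
        best = 0.0
        for (mx, my), map_feat in map_objects:
            d2 = (px - mx) ** 2 + (py - my) ** 2
            geometric = math.exp(-d2 / (2 * sigma ** 2))
            semantic = max(cosine(obs_feat, map_feat), 0.0)
            best = max(best, geometric * semantic)
        weight *= best + 1e-9  # small floor so no particle gets exactly zero weight
    return weight


def resample(particles, weights, rng):
    """Draw a new particle set with probability proportional to the weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


# Toy scene: two mapped objects with orthogonal (made-up) embeddings.
rng = random.Random(0)
map_objects = [((2.0, 1.0), [1.0, 0.0]), ((0.0, 3.0), [0.0, 1.0])]
true_pose = (1.0, 1.0)
observations = [((1.0, 0.0), [1.0, 0.0]), ((-1.0, 2.0), [0.0, 1.0])]

particles = [(rng.uniform(0, 4), rng.uniform(0, 4)) for _ in range(500)]
for _ in range(5):
    # A real system would use a fresh observation and a motion update each step;
    # here the same observation is reused to keep the example short.
    weights = [particle_weight(p, observations, map_objects) for p in particles]
    particles = resample(particles, weights, rng)
    particles = [(x + rng.gauss(0, 0.05), y + rng.gauss(0, 0.05)) for x, y in particles]

estimate = (
    sum(p[0] for p in particles) / len(particles),
    sum(p[1] for p in particles) / len(particles),
)
print("estimated position:", estimate)
```

Multiplying a geometric term by a semantic term means a hypothesis scores well only when an observed object lands near a mapped object that also looks similar in feature space, which is the intuition behind using per-object features for relocalization.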

Krishna Murthy · 2 years ago

We're excited about the prospects ConceptGraphs brings about! Great fun collab: HUGE shoutout to @qiaogu1997 @alihkw_ @SachaMori without whom this work wouldn't have taken the shape it has w/ @bipashasen31 @skymanaditya1 @duckietown_coo @florian_shkurti and many others

Krishna Murthy · 2 years ago

ConceptGraphs significantly improves upon ConceptFusion, making it possible to scale to larger scenes, run inference on on-board CPUs, and handle more complex queries. Find Ali's short explainer video at:
