🚨 ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
RGB-D images -> 3D scene graphs -> enhance with vision-language features
Enables a wide range of perception and planning abilities. Deployed in many real-world use cases.

We designed a 3D mapping pipeline that takes in posed RGB-D images and builds an object-based map. Each object in the map is augmented with multi-view fused CLIP features and language descriptions. We then build a scene graph, which can easily be parsed by LLMs.
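A minimal sketch of the idea, not the authors' implementation: each object accumulates unit-normalized per-view features into one fused vector, and the graph is dumped as JSON text that an LLM prompt could include. The `ObjectNode` class, the `graph_to_json` helper, the `"on"` relation, and the 3-dimensional toy vectors (standing in for real CLIP embeddings) are all assumptions made for illustration.

```python
import json
import math
from dataclasses import dataclass, field


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


@dataclass
class ObjectNode:
    """One object in the map (hypothetical structure for illustration)."""
    object_id: int
    caption: str
    feature_sum: list = field(default_factory=list)  # running sum of per-view unit features
    num_views: int = 0

    def add_view(self, feature):
        """Fuse one more per-view feature by accumulating its unit vector."""
        unit = normalize(feature)
        if not self.feature_sum:
            self.feature_sum = [0.0] * len(unit)
        self.feature_sum = [a + b for a, b in zip(self.feature_sum, unit)]
        self.num_views += 1

    @property
    def fused_feature(self):
        """Unit-length mean direction of all views seen so far."""
        return normalize(self.feature_sum)


def graph_to_json(nodes, edges):
    """Serialize captions and relations as JSON text for an LLM prompt."""
    return json.dumps({
        "objects": [
            {"id": n.object_id, "caption": n.caption, "num_views": n.num_views}
            for n in nodes
        ],
        "relations": [
            {"subject": s, "relation": r, "object": o} for s, r, o in edges
        ],
    }, indent=2)


# Toy usage: two views of the sneakers, one view of the rack.
sneakers = ObjectNode(0, "red and white sneakers")
sneakers.add_view([1.0, 0.0, 0.0])
sneakers.add_view([0.0, 1.0, 0.0])
rack = ObjectNode(1, "shoe rack")
rack.add_view([0.0, 0.0, 2.0])
graph_text = graph_to_json([sneakers, rack], [(0, "on", 1)])
```

Only captions and relations go into the JSON here; the fused vectors stay on the nodes for similarity lookups.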

Text queries can be abstract or complex. We ask our robot to find "something that goes well with a Ronald McDonald outfit", and it finds a pair of "red and white sneakers".
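One simple way such a lookup could work is to rank objects by cosine similarity between the query embedding and each object's fused feature. This is a sketch under that assumption; the thread does not say how abstract queries are resolved, and the `rank_objects` helper and the toy vectors (stand-ins for CLIP embeddings) are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def rank_objects(query_feature, objects):
    """Return (caption, score) pairs sorted from best to worst match."""
    scored = [(caption, cosine(query_feature, feature)) for caption, feature in objects]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy map: (caption, fused feature) pairs with made-up 3-d features.
objects = [
    ("red and white sneakers", [0.9, 0.1, 0.0]),
    ("power drill", [0.0, 0.2, 0.9]),
    ("basketball", [0.1, 0.9, 0.1]),
]
query = [0.8, 0.2, 0.1]  # stand-in for the embedded text query
best_caption, best_score = rank_objects(query, objects)[0]
```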

We can also identify misplaced objects in the scene graph and search for them. Here, we move the shoes away from their earlier location. The robot then searches plausible locations for the red and white sneakers and finds them on a nearby shoe rack.

Another query: "Something to wear for a space party". The robot finds a t-shirt that has NASA written on it.

Queries can involve both text and image context. Show the robot a photograph of Michael Jordan and ask it to go to "something this guy would play with". The robot finds a nearby basketball.

Another example: here, a handwritten note that reads "go to the laundry bag" tells the robot its target location.

This time, a mobile manipulation task, where a Spot mini has to pick up "something healthy to eat". It picks up a plastic mango

"My wrist hurts from using my screwdriver all day. Anything to help?" Robot finds a power drill

We implement a particle-filter-based localization approach using per-object CLIP features. This helps us relocalize against a prebuilt map, identify and add new objects to the map, and detect objects that no longer exist in the scene.
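A rough sketch of the relocalization part only, under simplifying assumptions that are mine rather than the authors': the robot pose is a 2D position with no heading, each detection is an offset from the robot plus a feature vector, and a particle's weight combines geometric agreement with feature similarity against the prebuilt map. All function names, `sigma`, `jitter`, and the toy data are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def particle_weight(particle, observations, map_objects, sigma=0.5):
    """Likelihood of the observations if the robot were at `particle` (x, y)."""
    weight = 1.0
    for offset, feature in observations:
        # Where this detection would sit in the map frame for this particle.
        px, py = particle[0] + offset[0], particle[1] + offset[1]
        best = 0.0
        for obj_pos, obj_feature in map_objects:
            d2 = (px - obj_pos[0]) ** 2 + (py - obj_pos[1]) ** 2
            geometric = math.exp(-d2 / (2 * sigma ** 2))
            semantic = max(cosine(feature, obj_feature), 0.0)
            best = max(best, geometric * semantic)
        weight *= best + 1e-9  # small floor so no weight is exactly zero
    return weight


def update(particles, observations, map_objects, rng, jitter=0.05):
    """One measurement update: weight, resample, then add small noise."""
    weights = [particle_weight(p, observations, map_objects) for p in particles]
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    return [(x + rng.gauss(0, jitter), y + rng.gauss(0, jitter)) for x, y in resampled]


def estimate(particles):
    """Mean position of the particle set."""
    n = len(particles)
    return (sum(p[0] for p in particles) / n, sum(p[1] for p in particles) / n)


# Toy setup: two mapped objects; the robot is actually at (1.0, 1.0).
rng = random.Random(0)
map_objects = [((2.0, 0.0), [1.0, 0.0, 0.0]), ((0.0, 3.0), [0.0, 1.0, 0.0])]
observations = [((1.0, -1.0), [1.0, 0.0, 0.0]), ((-1.0, 2.0), [0.0, 1.0, 0.0])]
particles = [(rng.uniform(-1, 4), rng.uniform(-1, 4)) for _ in range(500)]
for _ in range(5):
    particles = update(particles, observations, map_objects, rng)
pose = estimate(particles)
```

In the same spirit, a detection whose best match score stays low for every particle would be a candidate new object, and a mapped object that is never matched while in view would be a candidate for removal; neither step is implemented in this sketch.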

We're excited about the prospects ConceptGraphs brings about! Great fun collab: HUGE shoutout to @qiaogu1997 @alihkw_ @SachaMori without whom this work wouldn't have taken the shape it has w/ @bipashasen31 @skymanaditya1 @duckietown_coo @florian_shkurti and many others

ConceptGraphs significantly improves upon ConceptFusion, making it possible to scale to larger scenes, run inference on onboard CPUs, and handle more complex queries. Find Ali's short explainer video at:
