3D-LLM: Injecting the 3D World into Large Language Models paper page: Large language models (LLMs) and Vision-Language Models (VLMs) have been proven to excel at multiple tasks, such as commonsense reasoning. Powerful as these models can be, they are not grounded in the 3D physical world, which involves richer...
249,572 views • 2 years ago • via X (Twitter)
7 Comments

Yining Hong • 2 years ago
Thanks for featuring our work!

DevHunterAI • 2 years ago
Wow

AssistedEvolution • 2 years ago
Looks like nice work, but it is surprising that folks have not been doing this already, as transformer -> hippocampal complex, so this is theoretically exactly the way you might expect to train it, i.e. with spatio-temporal context.

JP • 2 years ago
Could this be leveraged to understand n-dimensional spaces, such as the weights and biases of a NN?

Ori ~ᗜˬᗜ〜♡ — e/acc • 2 years ago
🔥

Reverie • 2 years ago
I guess MAXAR Tech will start looking into this: more precise LLMs and VLMs for their large-scale 3D maps. Such great work!

Ippi • 2 years ago
It's Skynet Alpha version noooooo
