Introducing The Matrix, a foundation world model for generating infinite-length, hyper-realistic videos with real-time, frame-level control:
- Infinite-length video generation
- 720p high-quality rendering
- Real-time, frame-level control at 16 FPS
- Generalization to real-world video control
🔗 Blog:
📄 Paper:
💻 Code & Playable Demo: Coming soon!
Key Innovation: A...

178,232 views • 1 year ago • via X (Twitter)

10 Comments

Hongyang Zhang • 1 year ago

We compare The Matrix with many state-of-the-art game simulators.

Hongyang Zhang • 1 year ago

Interestingly, pre-trained on a vast collection of internet videos combined with AAA game footage, The Matrix demonstrates impressive domain generalization. For instance, it enables scenarios like driving a BMW X3 through an office area.

Hongyang Zhang • 1 year ago

Here’s an example showcasing The Matrix generating an ultra-long video, over 14 minutes (>13,440 frames), with precise real-time control throughout. For more examples, visit our project page:

Vaibhav (VB) Srivastav • 1 year ago

Amazing! Looking forward to the release! 🔥 If you do an open model checkpoint release as well, I’d be happy to help with that from the @huggingface side 🤗 My DMs are open! Let’s make this huge!

Hongyang Zhang • 1 year ago

@huggingface Thank you for the offer. 🤗

xiao sun • 1 year ago

In some sense it's 1D generation (or fake 2D): the same view won't show up twice when you look back at it again.

Jason Kneen • 1 year ago

Everything about gaming, mobile gaming, and everything else is about to change. Can't wait for the code drop; people are going to go crazy with this :)

Bobby • 1 year ago

Nice! However, "consistency models in real-time" also need to apply to the backgrounds. Mountains that completely change within 5 seconds of the camera turning aren't going to work. Every asset needs to remain consistent, unless the goal is an acid-trip experience for the end user. 🫠

Hongyang Zhang • 1 year ago

is also at X, the first author of this project.

mcquin mcdonalds westwood skyhigh underground • 1 year ago

@Yuchenj_UW host on hyperbolic
