Wow, diffusion models (used in AI image generation) are also game engines - a type of world simulation. By predicting the next frame of the classic shooter DOOM, you get a playable game at 20 fps without any underlying real game engine. This video is from the diffusion model.

1,768,653 views • 1 year ago • via X (Twitter)

Comments: 9

Ethan Mollick • 1 year ago

Paper and details:

Elon Musk • 1 year ago

Tesla can do something similar with real world video

Kristoph • 1 year ago

I honestly don’t find this compelling. They obviously trained on a large corpus of game screenshots, and it’s just generating the screen that probabilistically follows the current one. The issue is that this can only be achieved by having an initial game from which a corpus can be derived. If there is no game, there is no corpus, so where is the value add?

Kristoph • 1 year ago

How does the game maintain state? When you turn around how does it know what came before?

Gurwinder • 1 year ago

Gives a whole new meaning to P(doom)!

George Saoulidis ⚡ • 1 year ago

So you made DOOM run inside an LLM. Just say it

Rammy • 1 year ago

So you’re saying that it shows you images based on where you’re looking? As in it only renders when observed? 👀 *Tinfoil hat intensifies*

Bart Trzynadlowski • 1 year ago

Amazing but also supremely irritating that the source code isn’t available given that it’s based on an open source model.

Lucidyn • 1 year ago

OK, a few questions that I have: what's the difference in power consumption between the original DOS release, the DOOM 64 bit release, and the AI-generated version here? It's cool, but I am guessing it will come at a massive cost, showing that traditional rendering will be more efficient.

Related videos

At Avalon we are building "Real-time creating" - the ability to generate gameplay ready persistent worlds prompted from text. While others are building real-time video world models, Avalon is building real-time world generation inside a fully playable, persistent multiplayer engine. Internally running at 3840×2180 at 60 FPS. Built on Unreal Engine. Multiplayer by default. Persistent by default. Gameplay-ready by default. This is not a video latent replay. Not a simulation of interaction. It is a real 3D world with physics, logic, and authoritative multiplayer state. Avalon is trained on proprietary Avalon interaction data and powered by a hybrid system that combines language understanding, 3D model generation, procedural systems, and structured gameplay logic synthesis. Players can walk through a live world and generate environments, assets, mechanics, and entirely new gameplay modes using natural language. We accomplish this through a combination of 3D model generation, game logic generation based on our proprietary systems, and AI driven world creation. While other players are inside it. Changes persist instantly. State is synchronized in real time. Creation happens inside the world, not outside of it. Describe a biome. Spawn a civilization. Create a survival mode. Build a dungeon crawler. Launch a new game inside the world. Avalon interprets intent and integrates it directly into the live multiplayer environment. This is not a world model predicting video. This is a gameplay engine that understands language. If you can describe it, you can build it. And others can walk into it instantly.

AVALON

59,410 views • 4 months ago

Tencent presents GameGen-O: Open-world Video Game Generation. We introduce GameGen-O, the first diffusion transformer model tailored for the generation of open-world video games. This model facilitates high-quality, open-domain generation by simulating a wide array of game engine features, such as innovative characters, dynamic environments, complex actions, and diverse events. Additionally, it provides interactive controllability, thus allowing for gameplay simulation. The development of GameGen-O involves a comprehensive data collection and processing effort from scratch. We collect and build the first Open-World Video Game Dataset (OGameData), amassing extensive data from over a hundred next-generation open-world games and employing a proprietary data pipeline for efficient sorting, scoring, filtering, and decoupled captioning. This robust and extensive OGameData forms the foundation of our model's training process. GameGen-O undergoes a two-stage training process, consisting of foundation model pretraining and instruction tuning. In the first phase, the model is pre-trained on OGameData via text-to-video and video continuation, endowing GameGen-O with the capability for open-domain video game generation. In the second phase, the pre-trained model is frozen, and we fine-tune using a trainable InstructNet, which enables the production of subsequent frames based on multimodal structural instructions. This whole training process gives the model the ability to generate and interactively control content. In summary, GameGen-O represents a notable initial step forward in the realm of open-world video game generation via generative models. It underscores the potential of generative models to serve as an alternative to rendering techniques, one that can efficiently combine creative generation with interactive capabilities.

AK

366,948 views • 1 year ago