Introducing Emu3.5, a large-scale multimodal world model that natively predicts the next vision-language state. Trained on over 10T interleaved vision-language tokens and enhanced with reinforcement learning, Emu3.5 achieves powerful multimodal reasoning and generation. Powered by our new Discrete Diffusion Adaptation (DiDA) for 20× faster inference...
51,683 views • 7 months ago • via X (Twitter)

