Zhiting Hu

@ZhitingHu • 4,428 subscribers

Assistant Professor at UC San Diego; Artificial Intelligence, Machine Learning, Natural Language Processing

Videos

Enjoy breakfast made by robot roommate😉

64,803 views • 1 year ago

🔥Really excited to see the release of the PAN world model, a project I have been working on over the past years. PAN is a general world model capable of simulating physical, agentic, and nested worlds, synthesizing infinite interactive experiences for training AI agents. Building on top of pretrained LLMs and video diffusion models, PAN connects language, perception, action, and latent thoughts, for long-horizon simulation and reasoning. PAN shows overwhelming performance gains over JEPA-2, Cosmos-2, and other prior models. More in the thread👇 ... 1/

31,213 views • 8 months ago

Super excited to introduce Pandora, a generative video World Model interactively controllable by language. #Sora and #GPT4 are both powerful. How about fusing them in a single model? 💥 Pandora gives a preview:🔭

> Build a General World Model (GWM) super efficiently by integrating pretrained autoregressive LLM and diffusion Video Model, aligning them in the representation space.
> Let the LLM control the VM on-the-fly. Instruction tuning maximizes the controllability.
> Autoregressive LLM empowers VM to generate indefinitely long videos: Starting with a VM for 2-second videos, Pandora extends it for 8-second videos. Would #Sora+#GPT4 under Pandora produce hours-long videos? 📽️
> World Model is beyond just video generation. It’s sensory-level information processing + concept-level reasoning and reflection. Pandora bridges both, with the concept- / language-level backbone (LLM) managing & steering the sensory-level VM functionalities. 👁️🧠

Check out for a bunch of interesting results:

63,456 views • 2 years ago

A humanoid robot dancing with agility and flair💃 ... in a world _interactively_ simulated by world model

Here’s the choreography we told the model to simulate, step by step:

💃Wave both arms and start jumping 👋
💃Dance dance dance‼️
💃Stand still and put left arm behind back
💃Grasp a rose🌹behind and show the rose to the audience; raise arm high in the air
💃Bend body slightly and raise arm high in the air🦿
💃Stand straight and raise both arms above head
💃Bend body together with hands
💃Stand up straight again; wave right hand
💃Turn right and walk away from the camera; wave right hand🚶
💃Stop walking; look around
💃Make a heart shape with hands💕

14,063 views • 1 year ago

🚨Do frontier VLMs (o3, Gemini 2.5, Claude 3.5, Qwen…) actually learn an internal world model🌍? Surprisingly, the answer appears to be a hard NO, as revealed by our WM Atomic Benchmark⚛️. Even o3 struggles with the most basic, atomic-level questions:

❌Confuse triangles📐 with circles⭕️
❌Believe 🟦blue objects move faster than 🟩green ones
❌Fail at compositional and transitive reasoning

While humans perform nearly perfectly, these frontier models often score at chance level‼️ 🔎 More details in the thread below 👇

12,925 views • 1 year ago
