X Square Robot

@XSquareRobot • 1,092 subscribers

Making robots an integral part of everyday human life. WALL-OSS-0.5 and WALL-WM paper, code, and videos here: https://t.co/OUM4OGeR9r

Videos

Meet the world at home, where life happens and bots become family

35 days ago, at our “Born to Bot, Bot to Family” launch event, we shared our vision of bringing robots into real homes. Today, we’re very happy to share that our robots are now gradually entering real families.

For embodied AI, the real world is everyday life: different routines, different kitchens, and different ways of doing even the simplest tasks. This is where robots meet the world at home, where life happens and bots become family.

They are still learning. They may move slowly, hesitate, and sometimes look a little clumsy. But every home they enter helps them understand the world a little better.

1,203,760 views • 28 days ago

Introducing WALL-WM, our open-source World Model for embodied AI and the next piece of our open-source robotics stack.

Carving World Action Modeling at the Event Joints

Read the blog:

Why it matters
WALL-WM shifts robot world modeling from fixed-length action chunks to event-grounded video-action pretraining. It learns around events like reaching, contact, grasping, lifting, moving, and placing, so language, vision, and action align more naturally.

Why you should care
WALL-WM brings together:
• Event-grounded VLA pretraining
• Prior-aligned video-action architecture
• Wan-based video tower + randomly initialized action DiT
• Multi-view perception with sight-cone masking, tube patch masking, and Camera RoPE
• Event Mode for variable-length execution
• Unified Mode with Staircase Decoding
• DMuon for large-scale training

The goal: help robots learn what physically matters, not just what happens in the next fixed slice of time.

Code (coming soon):

#opensource #EmbodiedAI

37,962 views • 24 days ago

We are open-sourcing Wall-OSS-0.5. Pretrain Once, Act Anywhere.

Wall-OSS-0.5 is a VLA model for real-world robotic manipulation, exploring whether pretraining alone can produce robot capabilities directly testable on physical hardware before task-specific fine-tuning.

Key technical highlights:
• Gradient-bridged co-training
• Vision-Aligned RVQ Action Tokenizer
• Action-Space Supervision
• DMuon distributed optimizer

In zero-shot real-robot evaluation, the pretrained checkpoint achieved task-progress scores above 80 on multiple tasks, including Block Sorting, Fruit Sorting, Ring Stacking, and Rope Tightening.

Paper, code, blog, and uncut videos:

23,789 views • 25 days ago
