
Stephen James

@stepjamUK • 7,400 subscribers

CEO @Neuracore_AI | Assistant Professor @imperialcollege | ex-Director of Dyson Robot Learning Lab | Postdoc @UCBerkeley w/ @pabbeel | PhD ICL w/ @ajdDavison

Shorts

Furniture assembly is the task everyone name-drops and nobody actually attempts at real scale. Every demo I have seen is a scaled-down IKEA leg or a single arm on a toy chair. This paper does it properly: real scale, bimanual, up to 7 subtasks and 1,550 control steps per episode, validated on a real Kinova Gen3, not just in sim.

That real-robot number is the one that matters: only a 16 percent drop on the hardest task going from simulation to hardware. That is a small enough gap to take seriously, and it did not happen by accident. They built a VR teleoperation rig specifically for coordinated dual-arm collection, because generic single-arm teleop setups do not capture the coordination real assembly needs. The model also predicts a continuous progress signal alongside the action chunk rather than a discrete subtask label, which lets it auto-transition and catch drift before it compounds into total failure.

The simulation ablation is what got them there, 48 to 80 percent over baselines, with another 21 points from their perception and control design study alone, but that is groundwork, not the headline.

Watch the video: there is a clip of the robot misgrasping the seat panel, reopening the gripper, and regrasping on its own. That is not scripted recovery behaviour; it emerged from training, and it emerged on hardware.

Excellent work from the team at Mitsubishi Electric Research Laboratories, with Oxford and UNC Chapel Hill Clinical Laboratory Science. Video and project page in comments. #Robotics #Manipulation #VLA
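To make the progress-signal idea from the post above concrete, here is a minimal sketch of how a controller might consume a per-step progress estimate. This is not the paper's implementation. The class name `ProgressMonitor`, the thresholds, and the "drift" rule (progress falling well below its best value so far) are all assumptions made for illustration, and the progress values are invented.

```python
class ProgressMonitor:
    """Hypothetical consumer of a predicted progress value in [0, 1]."""

    def __init__(self, num_subtasks, done_threshold=0.95, drift_margin=0.2):
        self.num_subtasks = num_subtasks
        self.done_threshold = done_threshold  # assumed: progress that counts as "subtask done"
        self.drift_margin = drift_margin      # assumed: allowed drop below the best progress seen
        self.subtask = 0
        self.best = 0.0

    def update(self, progress):
        """Classify one progress prediction as continue, drift, advance or finished."""
        if self.subtask >= self.num_subtasks:
            return "finished"
        if progress >= self.done_threshold:
            # Auto-transition: no discrete subtask label is needed.
            self.subtask += 1
            self.best = 0.0
            return "finished" if self.subtask >= self.num_subtasks else "advance"
        if progress < self.best - self.drift_margin:
            # Progress went backwards by more than the margin: flag it early.
            return "drift"
        self.best = max(self.best, progress)
        return "continue"


# Invented progress trace for a two-subtask episode.
monitor = ProgressMonitor(num_subtasks=2)
events = [monitor.update(p) for p in [0.2, 0.6, 0.3, 0.7, 0.97, 0.5, 0.99]]
print(events)
```

In a real system the "drift" event would presumably trigger a retry or a regrasp rather than just a label. The sketch only shows why a continuous signal gives the controller something to react to between subtask boundaries.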


14,952 views

I've heard this a lot recently: "We trained our robot on one object and it generalised to a novel object - these new VLA models are crazy!"

Let's talk about what's actually happening in that "A" (Action) part of your VLA model.

The Vision and Language components? They're incredible. Pre-trained on internet-scale data, they understand objects, spatial relationships, and task instructions better than ever. But the Action component? That's still learned from scratch on your specific robot demonstrations.

Here's the reality: Your VLA model has internet-scale understanding of what a screwdriver looks like and what "tighten the screw" means. But the actual motor pattern for "rotating wrist while applying downward pressure"? That comes from your 500 robot demos.

What this means for "generalisation":
• Vision generalisation: Recognises novel objects instantly (thanks to pre-training)
• Language generalisation: Understands new task instructions (thanks to pre-training)
• Action generalisation: Still limited to motor patterns seen during robot training

Ask that same robot to "unscrew the bottle cap" and it fails because:
• Vision: Recognises bottle and cap
• Language: Understands "unscrew"
• Action: Never learned the "twist while pulling" motor pattern

The hard truth about VLA models: The "VL" gives you incredible zero-shot understanding. The "A" still requires task-specific demonstrations. We've cracked the perception and reasoning problem. We haven't cracked the motor generalisation problem.


51,356 views

Popular opinion: "Just generate more simulation data."

After working with many robot manipulation teams who've fallen into the simulation trap, here's what I've learned: Simulation teaches your robot to be really, really good at simulation.

Unlike blind locomotion policies that can get away with sim-to-real transfer because they rely mainly on proprioception and contact forces, vision-guided manipulation is extremely sensitive to visual domain gap. The subtle differences accumulate:
- Simulated friction vs real surface textures
- Perfect lighting vs shadows, reflections, glare
- Ideal object geometries vs manufacturing tolerances
- Instantaneous sensor readings vs real-world noise and latency
- Clean backgrounds vs cluttered, dynamic environments

The classic progression:
Week 1: "Our model works perfectly in sim!"
Week 2: "Let's collect some real data to fine-tune."
Week 3: "The real data completely contradicts what the sim taught..."
Week 4: "Okay, let's collect way more real data."
Month 2: "We basically need to retrain from scratch."

The painful truth: There's no shortcut to real-world data collection for vision-based manipulation. Simulation is amazing for debugging, prototyping, safety testing, and of course to supplement your real data. But it's not a substitute for understanding how your robot actually behaves in the actual environment.

What works: Use simulation strategically - for exploring edge cases, testing safety boundaries, and rapid iteration. But build your production models on real data from real environments. The teams that succeed treat simulation as a powerful tool, not a magic solution.

This is why Neuracore focuses on making real-world data collection so much easier and faster. Because the physics of your actual environment can't be simulated away.

World models, you say? Well, perhaps more on that in another post!

What's been your experience with sim-to-real transfer? Has it worked as well as expected?


31,009 views

Everyone’s talking about “Physical AI” - the idea that we can simulate real-world environments so well that robots trained in simulation will work perfectly in reality.

The promise: Train in virtual worlds → deploy anywhere.

The reality: I’ve seen too many teams fall into this trap. After working with manipulation teams at Berkeley, Imperial, and Dyson, here’s the pattern:
• Week 1: “Our policy works perfectly in simulation!”
• Week 4: “Why doesn’t this work on real objects?”
• Month 2: “We basically need to retrain from scratch with real data.”

The gap simulations can’t bridge: Unlike blind locomotion policies that can get away with sim-to-real transfer because they rely mainly on proprioception and contact forces, vision-guided manipulation is extremely sensitive to visual domain gaps.
• Real friction vs simulated surface textures
• Manufacturing tolerances vs perfect CAD models
• Dynamic lighting vs controlled virtual environments
• Sensor noise vs instantaneous virtual readings

Here’s what people don’t talk about: Building these detailed simulated environments takes forever. If it takes 7 days to build a kitchen in simulation, wouldn’t it be better to just collect real-world data in a real kitchen instead?

Don’t get me wrong - simulation is incredible for debugging, safety testing, and exploring edge cases. But it’s not a magic solution to real-world deployment.

What actually works: Use simulation strategically while making real-world data collection as efficient and flexible as possible.

This is why Neuracore focuses on streamlined real-world data infrastructure. Because no amount of virtual training can replace understanding how your robot actually behaves in actual environments.

The physics of your deployment environment can’t be simulated away.

What’s been your experience with sim-to-real transfer?


25,300 views

As a newly appointed Assistant Professor at Imperial College London, I'm thrilled to announce the Safe Whole-body Intelligent Robotics Lab (SWIRL) at Imperial College London.

The Safe Whole-body Intelligent Robotics Lab (SWIRL) is a new research lab focused on the intersection of safety and intelligence in next-generation robotics. We're hiring exceptional PhD students who are passionate about pushing the boundaries of robot learning.

What makes SWIRL unique? We operate at the exciting convergence of:
• Online & offline reinforcement learning
• Imitation learning & human demonstrations
• Sample-efficient learning methods
• Whole-body and soft robotics systems

We're looking for prospective PhD students interested in:
• Developing safe exploration algorithms for robotic systems
• Creating sample-efficient learning methods that minimize real-world trials
• Building foundation models for robotics with safety guarantees
• Advancing soft robotics and compliant human-robot interaction
• Bridging theory and practice in embodied AI

Why now? As robots become more capable and work more closely with humans, we need systems that are both intelligent enough to handle complex tasks AND safe enough for real-world deployment. Traditional approaches treat safety and intelligence as competing priorities; we believe they're synergistic.

If you're a motivated researcher who wants to develop the theoretical foundations and practical algorithms for tomorrow's safe, intelligent robots, I'd love to hear from you.

Want to join? Apply via


16,598 views

Videos


DLR researchers gave a robotic arm full-body touch sensitivity with no artificial skin needed.

They used internal force-torque sensors at 8 kHz + deep learning. The robot can feel where you touch it, recognize letters drawn on its surface, and respond to virtual buttons placed anywhere on its body.

What's interesting is the infrastructure behind it. To train these models, you need high-frequency sensor streams, manifold learning to unfold trajectories, and the ability to iterate fast. They collected 2,300 samples from 20 people and hit 95.5% accuracy on digit recognition.

This is what's possible when you have the right data infrastructure.

📄 Video credit: DLR - English


173,663 views • 9 months ago

Popular opinion: "We need more data."
Actual reality: 100 great demos > 10,000 mediocre ones.

The myth: More demonstrations always mean better models.

The reality: I've seen models trained on 200 high-quality demonstrations outperform models trained on 20,000 sloppy ones. Consistently.

What makes a demonstration "high-quality"?
- Smooth, deliberate motions (not rushed or hesitant)
- Consistent task execution strategy across demos
- Clear success/failure outcomes with proper labeling
- Representative of actual deployment conditions
- Performed by skilled demonstrators who understand the task

What kills demonstration quality:
- Human fatigue after 50+ consecutive demos
- Time pressure ("let's just get this data collected")
- Inconsistent demonstrators with different techniques
- Collecting in unrealistic lab conditions
- No real-time feedback on demonstration quality

Unlike computer vision where you can throw away bad images easily, robot demonstrations are expensive and hard to replace. Every single one needs to count.

Demonstration quality is a data strategy, not a collection afterthought.

What's your ratio of "data collected" vs "data actually used for training"?
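As a rough illustration of the "collected vs used" ratio in the post above, the sketch below filters demonstrations on two of the listed criteria: a success label and motion smoothness. The roughness metric (mean absolute second difference of a 1-D position trace), the threshold and the three demos are all assumptions chosen for illustration. A real pipeline would likely score full joint trajectories and more of the criteria.

```python
def roughness(positions):
    """Mean absolute second difference of a position trace; larger means jerkier motion."""
    if len(positions) < 3:
        return 0.0
    second_diffs = [
        positions[i + 1] - 2 * positions[i] + positions[i - 1]
        for i in range(1, len(positions) - 1)
    ]
    return sum(abs(d) for d in second_diffs) / len(second_diffs)


def select_demos(demos, max_roughness):
    """Keep demos that are labeled successful and smooth enough (assumed criteria)."""
    return [
        d for d in demos
        if d["success"] and roughness(d["positions"]) <= max_roughness
    ]


# Invented demos: one smooth success, one hesitant success, one smooth failure.
demos = [
    {"id": "smooth", "success": True, "positions": [0.0, 0.1, 0.2, 0.3, 0.4]},
    {"id": "hesitant", "success": True, "positions": [0.0, 0.2, 0.1, 0.3, 0.2]},
    {"id": "failed", "success": False, "positions": [0.0, 0.1, 0.2, 0.3, 0.4]},
]

kept = select_demos(demos, max_roughness=0.05)
used_ratio = len(kept) / len(demos)
print([d["id"] for d in kept], round(used_ratio, 2))
```

Running a score like this during collection, rather than afterwards, is one way to give demonstrators the real-time feedback the post says is usually missing.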


35,147 views • 11 months ago

Here’s a mistake I see all the time. Teams collect robot data at 30 Hz because “that’s what our robot runs at” and then wonder why their models underperform.

The truth is that different tasks need different temporal resolutions. Pick-and-place often works best around 10 Hz for smooth motions. Dynamic catching requires 30 Hz or more. Assembly tasks move slower, so 5 Hz can suffice. Precision insertion demands 50 Hz or higher for micro-adjustments.

Here’s the catch. The optimal frequency isn’t something you can guess. The traditional approach wastes time and demos. You pick a frequency, collect thousands of demos, train a model, get mediocre results, and realize you have to start over.

A better way is to collect data at high frequency, experiment with sync rates, and find what actually works for your task. We’ve seen grasping tasks perform best at 12 Hz - not 10 Hz, not 15 Hz - discovered only through systematic testing.

Synchronization frequency is a hyperparameter. Treat it like learning rate or batch size. Don’t decide it once and regret it forever.

How do you choose your data collection frequency, fixed upfront or experimental?
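The "collect high, then sweep the sync rate" workflow from the post above can be sketched as follows. This is a minimal illustration, not any particular tool's API. The `resample` helper, the 100 Hz log and the candidate rates are assumptions, and it uses a simple zero-order hold (latest sample at or before each tick) where a real pipeline might interpolate or align several sensor streams.

```python
from bisect import bisect_right


def resample(timestamps, values, rate_hz):
    """Zero-order-hold resampling of one stream onto a uniform clock at rate_hz."""
    if not timestamps:
        return []
    period = 1.0 / rate_hz
    eps = 1e-9  # tolerance for floating-point tick times
    out = []
    k = 0
    while True:
        t = timestamps[0] + k * period
        if t > timestamps[-1] + eps:
            break
        # Index of the latest raw sample recorded at or before tick t.
        idx = bisect_right(timestamps, t + eps) - 1
        out.append((t, values[idx]))
        k += 1
    return out


# Hypothetical raw log: 1 second of a joint position recorded at 100 Hz.
raw_t = [i / 100 for i in range(101)]
raw_q = [0.5 * t for t in raw_t]

# Candidate sync rates to treat as a hyperparameter sweep.
candidates_hz = [5, 10, 12, 15, 30]
datasets = {hz: resample(raw_t, raw_q, hz) for hz in candidates_hz}

for hz, data in datasets.items():
    print(f"{hz} Hz -> {len(data)} samples")
```

Each entry in `datasets` would then feed one training run, so the rate is compared on task performance instead of being fixed before collection. Note that 12 Hz does not divide 100 Hz evenly, which is why the sketch looks up samples by timestamp rather than by striding through the list.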


22,574 views • 8 months ago

🚨Important update from our Robot Learning Lab in London. Following recent news, we’re moving on after a wonderful 2 years… Today, we unveil 4 big pieces of research from our incredible team. Check out the compilation video and thread below to see our final work! 📽️👇


31,095 views • 2 years ago

Hierarchical diffusion policy is another step along the journey of making hierarchical next-best pose agents more capable, through introduction of a kinematically-aware low-level diffusion planner.🤖 New work from the Dyson Robot Learning Lab. CVPR 2024


33,928 views • 2 years ago
