Introducing Dynamism v1 (DYNA-1) by Dyna Robotics, the first robot foundation model built for round-the-clock, high-throughput dexterous autonomy. Here is a time-lapse video of our model autonomously folding 850+ napkins in a span of 24 hours with • 99.4% success rate and zero human intervention • 60% human...
518,131 views • 1 year ago • via X (Twitter)

When we founded Dyna, we began with a first-principles question that shaped our focus: What single technical hurdle must we clear to unlock unlimited demand for robots? After talking to hundreds of customers, the answer is so clear yet so underemphasized in current robotics discourse: PERFORMANCE, meaning high throughput and high-quality output, consistently. Delivering robot performance has been our singular north star ever since; no one wants a robot that only kind of works, and slowly. We have tested DYNA-1 in several distinct 24-hour trials under different natural environment variations and found that DYNA-1 robustly completes 700+ napkins in every trial. My favorite part of these time-lapse videos is the 6-7am period, when the sun rises and the model just continues performing the task as usual!

In contrast, we find that standard recipes for training robot foundation models are insufficient for real-world PERFORMANCE. On a demanding production task like restaurant-grade napkin folding, state-of-the-art models saturate at ≈ 80% single-episode success even after hundreds of hours of domain-specific data. At that level, the chance of 30 flawless consecutive executions is (0.8)³⁰ ≈ 0.1%, which is functionally zero for 24/7 operations. We observe the same failure mode in-house: after one to two hours the policy drifts into unfamiliar states and cannot self-recover. The time-lapse below shows our strongest base model collapsing despite an initially perfect run. Flashy demos hide this brittleness; sustained autonomy demands ≥ 99% step-level reliability and robust fault recovery, not just high single-episode accuracy. So what is our approach?
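The arithmetic behind the (0.8)³⁰ figure can be checked with a few lines of Python. This is a minimal sketch that assumes every execution succeeds independently with the same probability, which is a simplification (real failures tend to compound); the function name `prob_all_succeed` is made up for illustration.

```python
def prob_all_succeed(p_single: float, n_runs: int) -> float:
    """Probability that n_runs attempts in a row all succeed.

    Assumption (not from the thread): attempts are independent and
    share the same per-attempt success probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability in [0, 1]")
    if n_runs < 0:
        raise ValueError("n_runs must be non-negative")
    return p_single ** n_runs


if __name__ == "__main__":
    # Compare the ~80% baseline with the reliability levels quoted in the thread.
    for p in (0.80, 0.99, 0.994):
        print(f"p = {p:.3f}: P(30 flawless in a row) = {prob_all_succeed(p, 30):.2%}")
```

At 80% per-episode success the result is about 0.12%, matching the ≈ 0.1% quoted above. Under the same independence assumption, the gap between 80% and 99%+ per-episode success is what turns a near-certain failure into a likely clean streak.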

“Don’t practice until you get it right. Practice until you can’t get it wrong.” We have developed a general recipe for robust and autonomous robot foundation models for real-world applications. The linchpin of our recipe is an accurate reward model (RM) that scores every robot interaction with precision. Building on our prior research, we have delivered the first scalable foundation reward model for robotics. This model outperforms previous approaches and can reliably estimate task progress on challenging dexterity tasks, like napkin folding. This unlocks a host of production-critical capabilities, such as (1) autonomous exploration, (2) intentional error recovery, (3) high-quality dataset creation and curation, and much more.

A unique challenge we run into at Dyna is how to make the best use of the large amount of data that DYNA-1 collects autonomously during deployment. In continuous deployment settings, robot data does not naturally come with episodic boundaries. We have also developed an approach that can automatically segment the streaming data and provide accurate progress estimation and language labeling to enhance the model's task understanding.
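To make the idea of segmenting a boundary-free stream concrete, here is a toy sketch. It is not Dyna's method, which the thread does not describe. It assumes a hypothetical per-timestep progress estimate in [0, 1], such as a reward model might produce, and cuts the stream wherever progress falls back near zero after having been near completion. The function name and both thresholds are invented for illustration.

```python
from typing import List, Tuple


def segment_by_progress(
    progress: List[float],
    done_threshold: float = 0.9,   # hypothetical: "task nearly complete"
    reset_threshold: float = 0.2,  # hypothetical: "a new attempt has started"
) -> List[Tuple[int, int]]:
    """Split a continuous stream into (start, end) index pairs, end exclusive.

    A segment closes at the first timestep where progress is at or below
    reset_threshold after having reached done_threshold at least once.
    The final segment may be an unfinished attempt.
    """
    segments: List[Tuple[int, int]] = []
    start = 0
    reached_done = False
    for t, p in enumerate(progress):
        if p >= done_threshold:
            reached_done = True
        elif reached_done and p <= reset_threshold:
            segments.append((start, t))
            start = t
            reached_done = False
    if start < len(progress):
        # Trailing data that has not (yet) been closed by a reset.
        segments.append((start, len(progress)))
    return segments


if __name__ == "__main__":
    stream = [0.0, 0.3, 0.7, 0.95, 1.0, 0.05, 0.4, 0.92, 0.1, 0.5]
    print(segment_by_progress(stream))
```

A real system would need to cope with noisy estimates and failed attempts that never reach the completion threshold; this sketch only illustrates why an accurate progress signal makes episode boundaries recoverable from a continuous stream.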

By scaling our RM-in-the-loop training, DYNA-1 has leapt forward in just a few weeks:
• Week 1: Base model can complete a single successful fold, but falls apart after 5 minutes
• Week 2: Ran 1 hour unaided, but compounding errors made recovery impossible
• Week 3: Ran 8 hours, but executed only 6-7 napkins per hour (~10 mins per fold)
• Week 4: Completed our first 24-hour run, but with only ~200 folds at low quality and speed
• Week 5: Completed 24+ hours with ~350 folds at decent production-grade quality
• Week 6: Sustained 24+ hours with ~850 folds and high production-grade quality
From stop-and-go to round-the-clock excellence, our continual learning recipe drives rapid, tangible gains.
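The weekly fold counts are easier to compare as rates. The sketch below converts them to folds per hour and minutes per fold, assuming each run lasted exactly 24 hours (the thread says "24+") and taking the approximate counts at face value. The helper name `throughput` is illustrative.

```python
from typing import Tuple


def throughput(folds: int, hours: float) -> Tuple[float, float]:
    """Return (folds per hour, minutes per fold) for a run."""
    if folds <= 0 or hours <= 0:
        raise ValueError("folds and hours must be positive")
    per_hour = folds / hours
    return per_hour, 60.0 / per_hour


if __name__ == "__main__":
    # Approximate fold counts from the thread; 24 h duration is an assumption.
    milestones = {"Week 4": 200, "Week 5": 350, "Week 6": 850}
    for week, folds in milestones.items():
        per_hour, minutes = throughput(folds, 24.0)
        print(f"{week}: {per_hour:.1f} folds/hour, {minutes:.1f} min per fold")
```

Under these assumptions, Week 6 works out to roughly 35 folds per hour, or a bit under two minutes per fold, compared with about ten minutes per fold in Week 3.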

Over this learning process, DYNA-1 iteratively becomes much better at handling extremely difficult and out-of-distribution situations. Napkin folding is particularly challenging for three reasons:
• Single-pull precision: Extracting exactly one napkin from a tall stack demands fine control and rapid feedback; otherwise the gripper drags out multiple napkins, causing misfolds and chaos (as you can see in the attached videos).
• Flattening: When a multi-pull leaves napkins crumpled, the policy must (1) detect that multiple sheets were removed, (2) locate corners folded inward, and (3) separate and flatten overlapped layers before refolding, all of which are nontrivial dexterous endeavors.
• Rapid self-recovery: Once in an out-of-distribution state, the robot must untangle the mess and resume folding fast enough to keep throughput intact. Every extra second spent on edge cases erodes throughput, so the policy needs to find the quickest remedy.
DYNA-1’s ability to handle chaotic scenarios surprised even us, and it is the fundamental reason why the model can run for 24 hours with a 99+% completion rate. There are too many robustness snippets to list, but here are a few of our favorites:

DYNA-1 achieved an unprecedented level of robustness for robot foundation models. But at Dyna, we hold ourselves to an even higher standard: production-grade quality (grade 4 or 5 on a 5-point scale). While 98% of folds reach near-perfect quality (grade ≥3), only 75% hit our rigorous quality bar. What’s the difference? Less than ⅓ inch of precision on the initial fold separates perfection (grade 5) from near-perfection (grade 3). Our customers demand perfection, not near-perfection, and we deliver. Tiny differences define commercial-grade quality at Dyna. This level of precision also raises the bar for our research, as every research idea is rigorously vetted to ensure a measurable and significant real-world performance improvement.

We found that DYNA-1 can achieve zero-shot environment generalization for long-horizon dexterity. While we have seen foundation models that can generalize to new environments for simple pick-and-place skills by training on diverse environments and objects, such results remain elusive for bi-manual fine-grained dexterity. The video below shows DYNA-1 folding napkins at a customer site with no additional training, but it’s worth noting that we did observe noticeable performance loss (a topic we will be conducting more research on). With additional on-site training, DYNA-1 quickly improves and becomes adept at continuous folding at the customer site. This milestone represents a significant step towards our vision of delivering performance, out of the box.

…and what about task generalization? By focusing on dexterity and real-world robustness, we’re seeing strong positive transfer to other tough commercial tasks, such as laundry folding and, at a client’s request, cup-filling. DYNA-1 can autonomously fold many shirts of different sizes and materials in a row and also fill ingredient cups with utmost precision. Cup-filling is perhaps the hardest “no-reset” task we’ve encountered: delicate pickup, precise placement, handover, tool use—one slip ends the run. Though not perfect yet, DYNA-1 can clear every step while our internal baselines fail to move beyond the first step consistently.

DYNA-1 now folds napkins for paying customers, and we’re unlocking more skills to ship into more commercial environments in the coming weeks and months. Mastering napkin folding won’t transform daily life, but it’s a pivotal step toward making embodied AI commercially viable. This is a dream come true. After years of PhD research aimed at making robots genuinely useful in the real world, nothing has felt as close to that goal as what we accomplished in the last few months at Dyna. As we embark on this journey, we are excited to start sharing some of our results and research more widely with the robotics community! Alongside real-world robustness, we are also pushing the boundaries of cutting-edge large-scale robot learning. Join us in building robots for the real world. We look forward to hearing from you! Check out our blog post with our results:
