Robotics keeps hitting the same wall. Single-task RL works, but it does not scale to hundreds of tasks or new embodiments. This new paper looks like a real step toward fixing that.

The team introduces MMBench, a benchmark with 200 tasks across many domains and robots, and Newt, a language-conditioned world model trained online across all 200 tasks at once.

The simple idea behind Newt:
- The model learns from demos to get the right priors
- It trains across many tasks through online interaction
- It uses language to ground the goal
- It adapts fast when a new task shows up

What stood out to me:
✅ One model trained on 200 tasks at the same time
✅ Language-conditioned control for both states and RGB
✅ Better data efficiency than strong baselines
✅ Strong open-loop control
✅ Fast adaptation to new tasks and embodiments
✅ Full release of 200 checkpoints, 4000 demos, code, and benchmark

This is a good push toward general control instead of one model per task.

If you want the full paper:
Project page:

Weekly robotics and AI insights. Subscribe free:

Ilir Aliu

50,507 subscribers

70,090 views • 7 months ago • via X (Twitter)

Similar Videos

Placing objects sounds simple… until robots have to do it. This method makes it simple, fast & reliable. [Github ⬇️] Robotic object placement is tough, especially with stacking, hanging, or insertion. AnyPlace is a new two-stage method that uses only synthetic data and a vision-language model to teach robots where and how to place objects; even in the real world. Why this works ✅ Finds the right spot with help from vision-language models ✅ Handles stacking, insertion, and hanging with no real-world training ✅ Trained on synthetic data using Blender and IsaacSim ✅ Works in the real world without fine-tuning It shows that smart use of simulation and language models can make robotic placement tasks easier, faster, and more reliable. Github: Paper: Thank you for sharing Animesh Garg !

Ilir Aliu - eu/acc

22,843 views • 1 year ago

🛠️ What if a robot could invent its own tools. And teach itself how to use them? That’s exactly what VLMgineer does: a new framework that lets Vision Language Models (VLMs) design physical tools and the actions to use them, entirely on their own. No templates. No human demonstrations. Just raw, AI-driven creativity. Why it matters ✅ Co-designs tools and actions together using VLMs, ensuring tight coupling between form and function ✅ Uses VLM-guided evolution (not random search) to refine designs intelligently ✅ Outperforms human-designed tools by +64.7% in task success across 12 RoboToolBench challenges ✅ Produces better-than-everyday tools for real manipulation tasks—measured in success rate and elegance It builds on the emerging trend of large-model-guided evolutionary design (like Eureka and AlphaEvolve) and brings it into physical robotics. It opens the door to general-purpose, automated hardware design, no strong priors needed. Code & paper: —- Weekly robotics and AI insights. Subscribe free:

Ilir Aliu

13,984 views • 6 months ago

Robots struggle with strict action rules…memory and symbols help them learn fast. [Project + Full video link ⬇️] Robots struggle when tasks require specific steps in a fixed order. What if memory helped them think symbolically and learn faster? Solving tasks like unlocking a door then opening it is hard for deep RL. But by learning constraint relationships and storing them in memory, robots can solve these tasks much faster; with fewer trials and less training. Why it works ✅ Learns symbolic rules about action constraints ✅ Uses memory to transfer what it learned across tasks ✅ Handles real-world exploration with just 30 minutes of data ✅ Needs 10x fewer episodes than deep RL approaches This memory-based method shows a promising path forward for robots learning structured, real-world tasks. Full video: Paper: Thank you, Mrinal Verghese for sharing this amazing work! 🙏

Ilir Aliu - eu/acc

10,241 views • 1 year ago

🤖 Another zero-shot reward model is now in LeRobot: ROBOMETER. A general-purpose, zero-shot video-language reward model from University of South Carolina, UT Dallas, Massachusetts Institute of Technology (MIT), University of Washington, Ai2, and NVIDIA that predicts frame-level task progress. Trained on 1M+ trajectories from 21 robot embodiments, generalizes zero-shot to unseen tasks, scenes, and robots. 2.4–4.5x better downstream success rates across online RL, offline RL, data filtering, failure detection, and data retrieval for IL. Project: Paper:

LeRobot

32,625 views • 1 month ago

New Generation Model! 🚨 We're introducing the Mistral model to our expanding lineup of generation models. Mistral brings efficient performance and strong language understanding capabilities to our platform. Initial testing shows promising results in code comprehension and generation tasks, making it a valuable addition to development workflows. While we continue to optimize its implementation, early benchmarks demonstrate consistent and reliable outputs across various programming tasks.

ALCHEMIST AI 🔮

16,418 views • 1 year ago

Most of what I actually need help with, I never think to tell a model. But why is it on me to remember? Our new paper asks: what if AI could proactively specialize to individuals and the tasks they’re carrying out at this very moment? 🧵

Michelle Lam

48,774 views • 2 months ago

🚀Thrilled to share what we’ve been building at TRI over the past several months: our first Large Behavior Models (LBMs) are here! I’m proud to have been a core contributor to the multi-task policy learning and post-training efforts. At TRI, we’ve been researching how LBMs can help robots learn faster, better, and more efficiently. The key takeaways: ✅ We built an evaluation pipeline to benchmark LBM performance with real 𝐬𝐭𝐚𝐭𝐢𝐬𝐭𝐢𝐜𝐚𝐥 𝐜𝐨𝐧𝐟𝐢𝐝𝐞𝐧𝐜𝐞 ✅ Pre-training on hundreds of tasks makes models more robust—plus, we can teach new, complex tasks with 80% 𝐥𝐞𝐬𝐬 𝐝𝐚𝐭𝐚 ✅ The bigger and more diverse the pre-training, the better the results Check out our overview video, webpage and paper for more details: ✨ 🌎 📄 We hope this work helps move the field of robotics forward!

Zubair Irshad

20,314 views • 1 year ago

D4RL is a great benchmark, but is saturated. Introducing OGBench, a new benchmark for offline goal-conditioned RL and offline RL! Tasks include HumanoidMaze, Puzzle, Drawing, and more 🙂 Project page: GitHub: 🧵↓

Seohong Park

36,410 views • 1 year ago

BREAKING 🚨: OpenAI is actively polishing its Tasks feature and there is a big chance we will see it announced today 👀 - Tasks Beta will allow users to schedule tasks like "send me AI news from TestingCatalog at 9 am" - These automations will be handled by a new model tool "jawbone" - There will be a new Notifications tab in settings, presumably to control the way you will receive notifications about scheduled tasks Interestingly, the same feature is in development for Gemini. What is the chance of seeing both of them released on the same day?

🚨 AI News | TestingCatalog

204,165 views • 1 year ago

🤖 NVIDIA’s Gr00t N1.5 is now available in LeRobot! This is the result of a great collaboration between the Hugging Face LeRobot team and NVIDIA Robotics ! Gr00t N1.5 highlights: 🦾 Cross-embodiment foundation model for robots 🧠 Multimodal inputs: vision, language, and proprioception 🪛Tested on the Libero benchmark and real-world hardware tasks 🌍Trained on real robot, synthetic, and internet-scale video data ⚙️ Flow matching action transformer for action prediction

LeRobot

115,194 views • 8 months ago

Most imitation learning policies break when the camera moves or the robot changes. NOT THIS ONE 👇 [📍 Bookmark for later] A new 3D scene representation encoder tackles this by enabling zero-shot generalization to unseen embodiments and viewpoints… And it works with any IL algorithm. The trick? • Use a 2D foundation model to extract semantic features • Lift them into 3D space for localization (not semantics) • Condition the IL policy on this spatially grounded vector Across 93 simulated and 6 real tasks, Adapt3R: ✅ Maintains IL performance on LIBERO & MimicGen benchmarks ✅ Outperforms DP3 and 3D Diffuser Actor in most settings ✅ Holds >80% success on LIBERO even with large camera rotations Thanks for sharing this, Animesh Garg & Albert Wilcox! 📍Paper: Website: Code:

Ilir Aliu

12,178 views • 10 months ago

Can robots learn without training❓ [𝗜𝘁'𝘀 𝗼𝗽𝗲𝗻 𝘀𝗼𝘂𝗿𝗰𝗲𝗱 ⬇ ] Teaching robots to do complex tasks WITHOUT spending hours training them. Sounds cool, right? That's exactly what DIAL-MPC does! The first training-free method for whole-body torque control using full-order dynamics: ✅ Instantly checks if a robot's moves are right or wrong ✅ Adapts quickly to new tasks without needing extra training ✅ Could work hand-in-hand with other robot learning methods Robots are getting smarter AND faster without the need for long training sessions. Website: Paper: Code: Saw this first Haoru Xue ✈️ CVPR 🙏

Ilir Aliu

71,502 views • 1 year ago

You don’t have to use one model (or one provider!) for everything. With Workshop, you can combine frontier and local models in the same workflow. For example: Opus can be the main agent, and delegate specific tasks to Gemma 4 via subagents. Better quality where it matters. Better privacy, speed, and cost where it counts. One workflow, best model for each task.

Workshop AI

883,582 views • 2 months ago

1/ Introducing Glider - the smallest model to beat GPT-4o-mini on eval tasks ⚡🚀 - Open source, open weights, open code - Explainable evaluations by nature - Trained on 183 criteria and 685 domains Try it out for free at 🔥

PatronusAI

14,856 views • 1 year ago

AI in robotics gets all the attention right now, but sometimes the most interesting work is very practical. Viet built a small vision system that counts potatoes on a conveyor belt. No giant dataset. No huge model. Just a clear problem and a smart setup. He used Ultralytics’ ObjectCounter, trained a tiny YOLO11 nano model, and because there was no potato dataset, he annotated a single frame with SAM 2 and trained from that. One frame. Still works across the whole video. It is a good reminder that useful AI in industry often looks like this. Focused. Lightweight. Solves a real task. If you work in manufacturing or robotics, these small systems are usually the fastest wins. They save time, reduce errors, and do not need massive infrastructure. Nice work, Viet. His projects: —- Weekly robotics and AI insights. Subscribe free:

Ilir Aliu

1,674,926 views • 7 months ago

Don't train the model, evolve the harness. I read a brilliant blog post from Hugging Face where they took a frozen open model scoring 0% on a hard legal agent benchmark, left its weights alone, and let an automated loop rewrite only the code around it. That code layer is the harness, the runtime wrapper that feeds the model context, runs its tool calls, and decides when a run ends. By the time the loop finished, the system had essentially matched Sonnet 4.6 on the benchmark's headline metric, at roughly 7x lower cost per task. Zero weights changed. The gain existed because of where the model was failing. The judge only grades files saved in the right place under the exact requested filename, and the model kept doing the legal analysis correctly, then saving it under the wrong name, dropping it in a scratch folder, or never writing it at all. So the 0% was never measuring legal reasoning. It was measuring the harness. Hand-tuning that layer is slow and model-specific, so they automated it. A Claude proposer adds exactly one mechanism per iteration, and an outer loop keeps it only if it clearly beats the current best, so accepted mechanisms compound. What the loop discovered says a lot about where agents actually fail. → The biggest single gain was file handling, not intelligence. An automatic step that lands the deliverable exactly where the judge expects it beat every prompt change, with zero extra model tokens. → Code fixes transferred across models, prompt playbooks did not. The same harness lifted a smaller model from the same family by 14 points, but the tuned prompts hurt a different model family on tasks it could already finish. → The harness mattered more than anything else. Same model, same judge, same tasks, and five different harnesses scored anywhere between 3.5% and 80.1%. The gains do eventually flatten, and the remaining misses look like real capability gaps. At some point the wrapper runs out of tricks and the model has to carry the work. 
But the lesson holds. A benchmark score measures the model and its harness together, and until the harness is fixed, it's impossible to know which one failed. I highly recommend reading this: I also wrote a deep dive on agent harness engineering a while back, covering the orchestration loop, tools, memory, context management, and everything that turns a stateless LLM into a capable agent. The article is quoted below.
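
The accept-only-on-a-clear-win loop described in this post can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the code from the Hugging Face post: the harness is modelled as a plain set of mechanism names, the mechanism names and their gains are made up, `toy_score` stands in for an actual benchmark run, and "clearly beats" is modelled as a fixed margin (`min_gain`).

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical model of a harness: just the set of mechanisms it has enabled.
Harness = FrozenSet[str]


def evolve_harness(
    proposals: List[str],
    score: Callable[[Harness], float],
    min_gain: float = 1.0,
) -> Harness:
    """Try one proposed mechanism per iteration; keep it only on a clear win."""
    best: Harness = frozenset()
    best_score = score(best)
    for mechanism in proposals:  # stands in for the proposer's one-per-iteration output
        trial = best | {mechanism}
        trial_score = score(trial)
        # "Clearly beats" is modelled as a fixed margin (an assumption).
        if trial_score >= best_score + min_gain:
            best, best_score = trial, trial_score
    return best


# Made-up per-mechanism gains, for illustration only (not results from the post).
TOY_GAINS: Dict[str, float] = {
    "save_to_expected_path": 60.0,
    "retry_failed_tool_calls": 12.0,
    "longer_system_prompt": 0.3,
    "verbose_logging": -2.0,
}


def toy_score(harness: Harness) -> float:
    """Stand-in for running the benchmark: sums the made-up gains."""
    return sum(TOY_GAINS[m] for m in harness)


accepted = evolve_harness(list(TOY_GAINS), toy_score)
print(sorted(accepted))  # ['retry_failed_tool_calls', 'save_to_expected_path']
```

Because a rejected mechanism never enters `best`, accepted mechanisms compound while marginal or harmful ones are discarded. In a real setup the scoring call would be the expensive part, since each trial means re-running the benchmark.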

Akshay 🚀

227,260 views • 2 days ago

Multi-robot learning is getting a serious boost! 📚 Researchers have extended Isaac Lab to train heterogeneous multi-agent robotic policies at scale. The new framework supports high-resolution physics, GPU-accelerated simulation, and both homogeneous and heterogeneous agents working together on coordination tasks. They benchmarked different approaches (MAPPO: Multi-Agent Proximal Policy Optimization and HAPPO: Heterogeneous Agent PPO) across six challenging scenarios and showed that large-scale multi-robot training is not only feasible, but efficient. It’s an important step for real-world robotic collaboration, where teams of robots need to coordinate, split tasks, adapt roles, and interact dynamically, not just operate as identical clones. The code is open-source, and it pushes Isaac Lab closer to what robotics actually needs: scalable, physics-driven environments where many different robots can learn to work together. Here's the project page: ~~ ♻️ Join the weekly robotics newsletter, and never miss any news →

Lukas Ziegler

38,997 views • 7 months ago

DAO Labs Sneak Peek Preview: 1) Instant Sign Up/Sign across all HUBs via X or Wallet ✅ 2) Profile: summarized data on your activities and what they are worth, for a better view of your earnings. 💰 3) Task Navigator to oversee, in real time, which tasks are available for you to work on across all our HUBs. 🧭 4) A Timer Function to optimize post relevance and expiration. ⏰ Half of all the features are complete, on the way to granting you a seamless #SocialMining experience connecting all HUBs

DAO Labs

107,674 views • 2 years ago

ManiFeel is a benchmark for contact-rich manipulation tasks. Force- and tactile-based tasks with occlusions and partially-observable state are going to be crucial for true general purpose robotics. Website:

Chris Paxton

12,063 views • 4 months ago

Grok Computer is expected to roll out soon. It is the new computer-use agent from xAI, designed to control a PC like a human: Viewing the screen, moving the mouse, typing, navigating and handling tasks autonomously.

Testlabor

94,206 views • 2 months ago