🚀 Introducing Magentic-UI, an experimental human-centered web agent from Microsoft Research. It automates your web tasks while keeping you in control 🧠🤝 through co-planning, co-tasking, action guards, and plan learning. 🔓 Fully open-source. We can't wait for you to try it. 🔗
38,409 views • 1 year ago • via X (Twitter)
Comments: 8

Computer Use Agents (CUAs) aim to automate complex tasks, but current systems are often brittle and risky. Magentic-UI introduces human-agent interaction paradigms to boost reliability, transparency, and control. Here are 4 core features powering that:
🧑‍🤝‍🧑 Co-Planning – Collaboratively create and approve step-by-step plans
🤝 Co-Tasking – Work together to execute tasks with real-time feedback
🛡️ Action Guards – Protect sensitive actions with user approvals
🧠 Plan Learning – Improve future automation by learning from past runs
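
To make the Action Guards idea concrete, here is a minimal sketch of an approval gate in plain Python. It is not Magentic-UI's actual API: `Action`, `SENSITIVE_KINDS`, and `run_with_guard` are hypothetical names invented for illustration, and the "human" is simulated by a callback.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical names throughout: this is an illustration, not Magentic-UI's real API.
# Assumed set of action kinds that should never run without user approval.
SENSITIVE_KINDS = {"purchase", "submit_form", "delete"}


@dataclass
class Action:
    kind: str          # e.g. "navigate", "purchase"
    description: str   # human-readable summary shown to the user


def run_with_guard(actions: list[Action], approve: Callable[[Action], bool]) -> list[str]:
    """Run actions in order, asking `approve` before any sensitive one."""
    log = []
    for action in actions:
        if action.kind in SENSITIVE_KINDS and not approve(action):
            # The user declined, so the action is skipped rather than executed.
            log.append(f"skipped: {action.description}")
            continue
        log.append(f"done: {action.description}")
    return log


plan = [
    Action("navigate", "open the store page"),
    Action("purchase", "buy the item"),
]

# Stand-in for the human in the loop: deny every action that needs approval.
denied = run_with_guard(plan, approve=lambda a: False)
# Same plan, but the user approves everything.
approved = run_with_guard(plan, approve=lambda a: True)
```

In a real interface the `approve` callback would prompt the user and block until they answer. The gate sits between the plan and its execution, so non-sensitive steps run without interruption.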

📊 Initial evals are promising. With simulated users in the loop, Magentic-UI shows meaningful accuracy gains on the GAIA benchmark compared to fully automated baselines. [Human + agent] > [Agent alone]

Built on AutoGen’s Magentic-One, Magentic-UI adds an interactive layer for human–agent collaboration. Here’s the multi-agent setup:
🧭 Orchestrator – co-plans with the user & delegates
🌐 WebSurfer – LLM agent with browser control
💻 Coder – writes & runs code in Docker
📁 FileSurfer – handles file conversion with MarkItDown tools
Magentic-UI is modular & extensible: plug in your own agents, and give it a try!
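
The orchestrator-and-delegates layout described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the Magentic-UI or AutoGen API: the agent functions, the `AGENTS` registry, and `orchestrate` are hypothetical, and each "agent" just returns a string where a real one would call an LLM, a browser, or Docker.

```python
from typing import Callable

# Hypothetical stand-ins for the agents named in the thread.
def web_surfer(task: str) -> str:
    return f"[WebSurfer] browsed: {task}"


def coder(task: str) -> str:
    return f"[Coder] ran code for: {task}"


def file_surfer(task: str) -> str:
    return f"[FileSurfer] converted: {task}"


# Assumed registry mapping agent names to callables; new agents plug in here.
AGENTS: dict[str, Callable[[str], str]] = {
    "WebSurfer": web_surfer,
    "Coder": coder,
    "FileSurfer": file_surfer,
}

Plan = list[tuple[str, str]]  # (agent name, task) pairs


def orchestrate(plan: Plan, user_approves: Callable[[Plan], bool]) -> list[str]:
    """Co-plan with the user, then delegate each step to the named agent."""
    if not user_approves(plan):
        return []  # nothing runs until the user accepts the plan
    return [AGENTS[agent_name](task) for agent_name, task in plan]


# "Plug in your own agent": register one more callable under a new name.
AGENTS["Summarizer"] = lambda task: f"[Summarizer] summarized: {task}"

plan: Plan = [
    ("WebSurfer", "find the report"),
    ("FileSurfer", "report.pdf"),
    ("Summarizer", "report text"),
]
results = orchestrate(plan, user_approves=lambda p: True)
rejected = orchestrate(plan, user_approves=lambda p: False)
```

Keeping the agents behind a name-to-callable registry is one simple way to get the modularity the post mentions: the orchestrator only needs a name and a task, so adding an agent does not change the delegation loop.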

We’re building Magentic-UI in the open to push the frontier of human–agent collaboration. We invite the community to extend and reuse it for their own scientific explorations. We'd love your feedback, thoughts and ideas. Want to learn more? Read our blog post

@MSFTResearch Is it related to Magentic-One? Is it a UI for it?

@MSFTResearch Make it a production-ready product. Why keep it behind a research preview?

@MSFTResearch AutoGen has a lot of boilerplate. Any workarounds?

@MSFTResearch What default model do you use?

