imagine a personal agent that controls your mac from anywhere. chat with it on iOS, web, text, even email. it takes action on your computer. and it has a computer of its own – a cloud server, where you can host, automate, and do all kinds of work. coming...

15,071 views • 5 months ago • via X (Twitter)

Related Videos

HERMES AGENT NOW SUPPORTS COMPUTER USE ON WINDOWS AND LINUX. CLICKS, TYPES, SCROLLS YOUR DESKTOP IN THE BACKGROUND WHILE YOU WORK.

computer use was macOS only. now it works on Windows and Linux too via Cua.

Nous Research

HOW IT WORKS:

cua-driver runs as an MCP server. Hermes takes a screenshot with numbered elements. clicks element #14 (the search field). types a query. submits. reads the result.

during all of this:
→ your cursor stays where you left it
→ keyboard focus doesn't change
→ windows don't come to front
→ macOS doesn't switch Spaces

you and the agent co-work on the same machine.

WHAT IT CAN DO:
→ find your latest Stripe email and summarize it
→ fill forms in a web app that has no API
→ navigate desktop apps (Mail, browser, Finder)
→ interact with any GUI application
→ extract data from apps only accessible via screen

WORKS WITH ANY VISION MODEL:

not locked to Anthropic.

| Provider | Works |
|---|---|
| Claude (Sonnet/Opus) | best overall |
| GPT-4+, GPT-5.5 | full support |
| Gemini (via OpenRouter) | full support |
| Local vLLM / LM Studio | if model supports vision |
| Text-only models | degraded (accessibility tree only) |

SETUP:

hermes computer-use install

or: hermes tools → Computer Use → cua-driver

grant permissions when prompted:
→ Accessibility (system settings)
→ Screen Recording (system settings)

start a session:

hermes -t computer_use chat

or add to config.yaml / Desktop app settings to enable permanently.

SAFETY:
→ destructive actions require your approval
→ blocked key combos: empty trash, force delete, lock screen, log out
→ blocked type patterns: curl | bash, sudo rm -rf /, fork bombs
→ agent cannot click permission dialogs
→ agent cannot type passwords
→ agent cannot follow instructions embedded in screenshots

pair with approvals.mode: manual if you want every single click confirmed.

TOKEN NOTE:

screenshots are expensive. each one adds vision tokens to context. use computer_use for tasks where no API exists. if the tool has an API or MCP server, use that instead.

15 levels of Hermes Agent👇
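
The "blocked type patterns" named in the post (curl | bash, sudo rm -rf /, fork bombs) suggest a check that runs on text before the agent types it. The sketch below shows one way such a filter could look. It is not the Hermes or cua-driver implementation, which the post does not show: the pattern list, `BLOCKED_TYPE_PATTERNS`, and `is_blocked` are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical patterns modeled on the three examples named in the post.
# The actual rules used by Hermes / cua-driver are not given in the source.
BLOCKED_TYPE_PATTERNS = [
    # Piping a downloaded script straight into a shell (curl ... | bash / sh).
    re.compile(r"curl\b[^|\n]*\|\s*(ba)?sh\b"),
    # Recursive forced delete of the filesystem root itself.
    re.compile(r"sudo\s+rm\s+-rf\s+/(\s|$)"),
    # The classic bash fork bomb  :(){ :|:& };:
    re.compile(r":\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:"),
]


def is_blocked(text: str) -> bool:
    """Return True if the text the agent wants to type matches a blocked pattern."""
    return any(pattern.search(text) for pattern in BLOCKED_TYPE_PATTERNS)


if __name__ == "__main__":
    for candidate in (
        "curl https://example.com/install.sh | bash",
        "sudo rm -rf /",
        "sudo rm -rf /tmp/build",
        "ls -la",
    ):
        print(f"{candidate!r}: blocked={is_blocked(candidate)}")
```

A pattern list like this only catches known-bad strings, so it would complement the approval step described in the post and could not replace it.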

YanXbt

29,030 views • 10 days ago

🚨 JUST IN: CHINA just released an AI EMPLOYEE that works 24X7 on its own. 100% OPEN SOURCE.

It researches, codes, builds websites, creates slide decks, and generates videos. All by itself. All on your computer.

It's called DeerFlow.

You give it a task. It makes a plan, spins up its own team of sub-agents, and gets to work. You come back and there's a finished deliverable waiting. Not a draft. Not a summary. The actual thing.

Not a chatbot. Not a research assistant. An AI with its own computer that works while you sleep.

Here's what it does on its own:
→ Spawns multiple sub-agents in parallel, each tackling a different piece of your task, then combines everything into one finished output
→ Writes real code, runs it, reads the results, and fixes its own mistakes without asking you once
→ Builds slide decks, websites, full research reports, and data dashboards from scratch
→ Remembers you across sessions. Your writing style. Your tech stack. Your preferences. Gets better every time.
→ Reads files you upload, works with them inside its own filesystem, hands you clean finished outputs
→ Searches the web, runs commands, calls any tool you plug in

Here's how it thinks:

You give one instruction. The lead agent makes a plan. Sub-agents fan out and work in parallel. Results come back. Everything gets synthesized. You get a deliverable.

A single research task might split into a dozen sub-agents, each exploring a different angle, then converge into one finished website with generated visuals.

Here's the wildest part:

DeerFlow 2.0 launched on February 28th 2026 and hit number 1 on all of GitHub Trending the same day.

Version 2.0 was a complete rewrite. Zero shared code with version 1. Because users kept using it for things the team never intended. Data pipelines. Dashboards. Entire content workflows. The community told them what it needed to become. So they burned it down and rebuilt it.

22.7K GitHub stars. 2.7K forks.

Built by ByteDance
100% Open Source.
MIT License.
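
The plan, fan-out, and synthesize flow described in the post can be sketched with Python's `asyncio`. This is a toy illustration of the pattern only, not DeerFlow's code or API: `make_plan`, `run_subagent`, and `run_task` are hypothetical stand-ins, and a real system would call a model and tools where this sketch returns placeholder strings.

```python
import asyncio


def make_plan(task: str) -> list[str]:
    """Hypothetical lead-agent step: split one instruction into subtasks."""
    angles = ("background", "data", "visuals")
    return [f"{task}: {angle}" for angle in angles]


async def run_subagent(subtask: str) -> str:
    """Hypothetical sub-agent: a real one would search, run code, or call tools."""
    await asyncio.sleep(0)  # placeholder for real asynchronous work
    return f"result for [{subtask}]"


async def run_task(task: str) -> str:
    subtasks = make_plan(task)
    # Fan out: all sub-agents run concurrently; gather returns results in input order.
    results = await asyncio.gather(*(run_subagent(s) for s in subtasks))
    # Fan in: a trivial "synthesis" that joins the partial results into one output.
    return "\n".join(results)


if __name__ == "__main__":
    print(asyncio.run(run_task("market report")))
```

Because `asyncio.gather` returns results in the order the subtasks were submitted, the synthesis step can rely on a stable ordering even when sub-agents finish at different times.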

Kanika

736,538 views • 3 months ago