This Chinese developer runs 9 agents on Claude Code under a GPT-5.5 orchestrator, and they close 500 client tasks a month without a single assistant. His client work gets closed without him, on a single laptop and only three subscriptions. The entire system lives on one MacBook Pro M4... with 128 GB of memory, and the subscriptions to Claude Code and GPT-5.5 cost him approximately $300 a month. There is no CRM, no team, no office, only a terminal window with 9 parallel streams.

The orchestrator works with a simple system prompt: «You are the orchestrator of a client inbox. Classify every incoming email into 4 categories: code, content, analysis, communication. Delegate to the corresponding worker agent. When the result is ready, check it for completeness, send it to the client on my behalf, and mark the task as closed. Do not ask clarifying questions.»

The orchestrator checks the inbox every 30 seconds, classifies fresh emails, and distributes them to 9 worker agents on Claude Code, each responsible for its own class of tasks. Here is an example of how one of them closes a request to refactor a client's auth module:

«Task: refactor user-auth module
Broke the monolith into 3 files by responsibility
Added unit tests, coverage increased to 87%
Renamed 4 functions to camelCase according to the style guide
PR is ready for review, link below»

And so on for about 50 cycles a day. By noon 25 tasks are closed, by dinner 50, and by the end of the month 500. On average, it takes about 7 minutes from an email appearing in the inbox to the result being sent to the client. This is more than a live team of 6 developers, copywriters, and analysts working 8 hours a day closes.

This is no longer an agency. This is a workstation where an orchestrator replaces a manager, and 9 worker agents replace the staff. The pipeline goes from inbox to closing 500 times a month without human participation at any step.
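The classify-then-delegate step described in the orchestrator prompt can be sketched roughly as follows. This is a minimal illustration, not the developer's actual setup: the `Email` type, the `KEYWORDS` table, and the worker functions are all hypothetical, and the keyword match merely stands in for the model call that would do the real classification.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# The four categories named in the orchestrator prompt.
CATEGORIES = ("code", "content", "analysis", "communication")


@dataclass
class Email:
    sender: str
    subject: str
    body: str


# Hypothetical keyword table standing in for the model's classification step.
KEYWORDS = {
    "code": ("refactor", "bug", "deploy", "module"),
    "content": ("blog", "copy", "article", "newsletter"),
    "analysis": ("report", "metrics", "forecast", "dashboard"),
}


def classify(email: Email) -> str:
    """Return one of CATEGORIES; anything unmatched falls back to communication."""
    text = f"{email.subject} {email.body}".lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return "communication"


def make_worker(category: str) -> Callable[[Email], str]:
    """Build a placeholder worker; a real one would hand the task to an agent."""
    def worker(email: Email) -> str:
        return f"[{category}] handled: {email.subject}"
    return worker


WORKERS: Dict[str, Callable[[Email], str]] = {c: make_worker(c) for c in CATEGORIES}


def route(email: Email) -> str:
    """Classify the email and delegate it to the matching worker."""
    return WORKERS[classify(email)](email)


print(route(Email("client@example.com", "Refactor user-auth module", "Please split it up.")))
```

Keeping the dispatch table separate from the classifier means the classification step can be swapped for a model call without touching the delegation logic.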

Blaze

9,480 subscribers

29,917 views • 1 month ago • via X (Twitter)

Related videos

This Chinese guy created agents in Claude Code for MCP servers and single-handedly serves 6 marketing agencies a month from one iPhone, earning $5,000 from each. Inside he runs a pipeline of 7 agents on Claude Sonnet 4.6 that every Monday pulls a scan of the tech stack from a selected agency, develops an MCP server for its ad accounts, and over the course of a week brings it to production code ready to connect to Claude Desktop. No DevOps, no senior developer, no project manager. Just a Mac Mini in a work corner, an iPhone in the pocket, and a single API key. And traditional dev shops keep 5 people on project rates for the same contract, while his entire P&L is tokens, dirt-cheap hosting on Cloudflare, and Calendly.

7 agents run under a shared orchestrator-router and burn about 5 million tokens a day, which in the API bill comes out to $540 a month. The Mac Mini itself sits at home and keeps the entire orchestrator running 24/7, and from the iPhone the owner connects to it through a secure remote terminal and sees the output of any session right on the smartphone screen, wherever he happens to be.

His starting system prompt looks like this:

"you run a solo shop for custom MCP servers for marketing agencies. you hand out read-only tasks to 6 sub-agents and own all commits and shipping yourself.
sub-agents:
// Hunter (finds marketing agencies of 15 to 60 people that have no MCP access to Google Ads, Meta Ads, TikTok Ads, and HubSpot)
// Mapper (pulls their tech stack, identifies 3 to 5 integration pains, and simultaneously writes the technical spec for the server: which tools, resources, and prompts to export through MCP, which auth flow and rate limit)
// Coder (generates an MCP server in Python through the MCP SDK, deploys 8 to 15 tools for ad accounts and CRM)
// Validator (connects the server to Claude Desktop, runs real client API keys in a sandbox, and checks for compliance with the MCP spec)
// Shipper (writes a README, integration guide, deployment manual, packages the server, and hosts it on Cloudflare Workers or pushes to the GitHub of the client)
// Mobile (always online on the iPhone, books demo calls in Calendly, picks up hot fixes, and confirms contracts through a secure remote terminal to the Mac Mini).
only 1 owner agent works on 1 contract, no overlaps. you pull the owner out of observation mode only when a deal goes above $7,500 or the test coverage of the server drops below 85%."

This prompt gives the system an understanding of its role and the limits of intervention from the very first line. It knows it is supposed to find agencies on its own. It knows it is supposed to bring every MCP server to production on its own. It knows it connects the live owner only on large deals or when the tests do not converge.

→ The pipeline runs without breaks, day or night
→ Hunter goes through about 130 marketing agencies on LinkedIn and Clutch per day
→ Mapper rolls out 4 audit reports with the tech stack and a final spec for each
→ Coder writes 1 to 2 MCP servers per week in Python with 8 to 15 tools
→ Validator validates every server through Claude Desktop with real client API keys
→ Shipper rolls out the full documentation package and pushes the finished product to Cloudflare Workers or the GitHub of the client

And only when a contract breaks $7,500 or test coverage drops below 85% does the orchestrator pull the owner from whatever he is doing. And when the owner at that moment is behind the wheel or at a meeting in a coworking space, the Mobile agent in his iPhone picks up 1 contract in progress: confirms a meeting with the agency CMO in Calendly, opens a live demo of the MCP server through a secure terminal to the Mac Mini, and writes the test result to the shared state. The owner just swipes "approve" and in 15 minutes joins the Zoom demo.

The fresh system log from last Wednesday looks like this:

"hunter report: 132 agencies checked on LinkedIn and Clutch, 19 without MCP integrations, 8 with active requests for AI tooling in job posts, 4 with an open Q4 budget. passing to mapper."
"coder: MCP server for Northwave Performance Marketing built in Python, 11 tools for Google Ads, Meta Ads, and GA4, 320 lines of code. exported to /Users/dev/mcp-shop/clients/northwave/server.py. validator connecting to Claude Desktop."
"validator: 11 tools passed validation through Claude Desktop, test coverage 92%, average latency 380 ms. passing to shipper."
"eval flag: contract with Pacific Reach Agency at $8,200 exceeds the approved limit of $7,500. sending for manual review."

In his work setup there is no cloud server, no external team, and not even a separate office. At home sits a Mac Mini with a sandbox at /Users/dev/mcp-shop, on top runs an MCP router with a single API key to Claude, and the same key is forwarded to a secure terminal on the iPhone. Out of everything I have seen this year, this is the cleanest solo shop for custom MCP servers for marketing agencies: $540 a month on the API, about $30,000 into the account, and between them 7 system prompts, 1 Mac Mini in a work corner, and 1 iPhone that never leaves the pocket.
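The two escalation conditions in the prompt reduce to a pair of threshold checks. Here is a minimal sketch, assuming the orchestrator has the contract value in dollars and the test coverage in percent as plain numbers. The function and constant names are hypothetical; only the thresholds and the sample figures come from the post.

```python
DEAL_LIMIT_USD = 7_500   # from the prompt: "a deal goes above $7,500"
MIN_COVERAGE_PCT = 85    # from the prompt: "test coverage ... drops below 85%"


def deal_over_limit(deal_usd: float) -> bool:
    """True when the contract value is strictly above the approved limit."""
    return deal_usd > DEAL_LIMIT_USD


def coverage_too_low(coverage_pct: float) -> bool:
    """True when test coverage is strictly below the minimum."""
    return coverage_pct < MIN_COVERAGE_PCT


def needs_owner(deal_usd: float, coverage_pct: float) -> bool:
    """Either condition alone is enough to pull the owner in."""
    return deal_over_limit(deal_usd) or coverage_too_low(coverage_pct)


# Figures quoted in the log lines above.
print(deal_over_limit(8_200))   # Pacific Reach Agency contract -> True
print(coverage_too_low(92))     # Northwave validator run -> False
```

Both comparisons are strict, matching the wording "goes above" and "drops below": a deal of exactly $7,500 or coverage of exactly 85% would not trigger a review under this reading.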

Blaze

55,926 views • 1 month ago

SPEC IS BECOMING THE PRODUCT

An Anthropic engineer gave Claude a spec, pointed it at an Asana board, and left for the weekend. Claude broke it into tickets and spun up a team of agents. The agents started picking up tasks on their own. No one told them to. They just did.

Shubham Saboo

285,562 views • 3 months ago

This Chinese developer launched 6 agents under 1 orchestrator, and they run his UI design agency at $32,000 a month on their own. He built a system of 6 agents on Claude Sonnet 4.6 that single-handedly runs his agency for UI auditing and redesign for SaaS startups and e-commerce. No contractors, no project manager, and no team. Just him, a MacBook, and 1 API key. Traditional design agencies out of Shenzhen keep teams of 8 people on salaries for the same volume, while he keeps only API tokens.

6 agents work through a single orchestrator on Claude Code Router. Usage is about 4 million tokens a day, the average API bill is just $480 a month. All 6 go through MCP servers and write shared state to the file system, without shared state in memory and without race conditions.

And here is the system prompt he gave the orchestrator before launch:

"you are the orchestrator of a one-man UI agency. you delegate read-only research tasks to 5 sub-agents and own all writes.
sub-agents:
// Hunter (finds SaaS and e-commerce sites with outdated UI)
// Auditor (runs each site through Lighthouse, accessibility, and design system checks)
// Pitcher (writes cold outreach and redesign proposals with before/after screenshots)
// Splitter (breaks accepted projects into typed milestones)
// Designer (generates Figma mockups and Tailwind components)
// Checker (runs evals on every artifact before it leaves the harness).
you never let 2 sub-agents touch 1 file. you stop and request human approval only when an invoice exceeds $5,000 or when the design system eval score drops below 0.88."

Meaning the system knows exactly what it is and within what boundaries it operates. It knows it is supposed to find clients on its own. It knows it is supposed to write proposals with screenshots and mockups without intervention. It knows the human only plugs in when the amounts go above $5,000 or when the design system eval does not converge.

→ The system runs 24 hours a day
→ Hunter finds about 200 sites with outdated UI a day
→ Auditor runs each one through Lighthouse and WCAG
→ Pitcher prepares about 28 personalized proposals with before/after screenshots
→ Splitter breaks 3 accepted projects per week into milestones
→ Designer generates mockups and components, Checker runs evals on every artifact

And only when the invoice breaks $5,000 or the eval drops below 0.88 does the orchestrator wake the human.

Here is what the system outputs in his log during 1 of the sessions:

"hunter report, tuesday: 213 sites found, 31 with last redesign before 2020, 14 with Lighthouse score below 65, 6 with active redesign RFP. passing top 6 to auditor."
"pitcher: 27 cold outreach sent with before/after screenshots, 5 replies, 3 discovery calls scheduled. passing to splitter."
"designer: milestone 2 of Lotus Tea Co redesign complete. Figma frames exported to /Users/dev/agency/clients/lotus/v2. checker running design system evals."
"eval flag: proposal for $6,800 exceeds the approved limit of $5,000. sending for manual review."

He has no remote server. No separate backend. Just a local file sandbox in /Users/dev/agency, an MCP router, and an API key to Claude. Out of everything I have seen this year, this is the cleanest one-person UI design agency: $480 in, about $32,000 out, and between them 6 prompts and 1 file system.
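The post does not say how the "never let 2 sub-agents touch 1 file" rule is enforced. One common way to get that guarantee on a shared file system is an exclusive lock file, sketched below under that assumption. The `claim` and `release` helpers, the `.lock` suffix, and the sample file name are all hypothetical.

```python
import os
import tempfile


def claim(path: str, agent: str) -> bool:
    """Try to claim `path` for `agent` by atomically creating a lock file.

    O_CREAT | O_EXCL makes the open fail if the lock already exists,
    so at most one agent can hold the claim at a time.
    """
    try:
        fd = os.open(path + ".lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock:
        lock.write(agent)  # record the owner for later checks
    return True


def release(path: str, agent: str) -> None:
    """Remove the lock, but only if `agent` is the recorded owner."""
    lock_path = path + ".lock"
    with open(lock_path) as lock:
        owner = lock.read()
    if owner != agent:
        raise PermissionError(f"{path} is claimed by {owner}, not {agent}")
    os.remove(lock_path)


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as sandbox:
    target = os.path.join(sandbox, "mockup.json")
    print(claim(target, "designer"))  # first claim succeeds
    print(claim(target, "checker"))   # second claim is refused
    release(target, "designer")
    print(claim(target, "checker"))   # free again after release
```

The exclusive-create flag is what makes the claim safe against two agents racing for the same file. A check-then-create sequence would not be.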

Blaze

56,062 views • 1 month ago

This Chinese guy created agents in Claude Code for landing pages and single-handedly serves 47 small businesses a month, taking $400 from each. He built a system of 7 agents on Claude Sonnet 4.6 that analyzes Google Maps in small towns, finds small businesses without websites there, and over 1 weekend takes each one to a finished mockup with video and cold message. No assistant, no sales team, no SDR. Just him, a MacBook, an iPhone, and 1 API key. And traditional web design agencies keep teams of 8 people on salary for the same order flow, while his expenses are only tokens and subscriptions to Lovable, Higgsfield, and Calendly.

7 agents work through 1 orchestrator on Claude Code Router. Usage is about 3 million tokens a day, the average API bill is about $480 a month. All 7 go through MCP servers and write shared state to the file system, without shared state in memory and without race conditions, and 1 of them lives right in the iPhone and picks up positive replies from the subway, a taxi, or on walks.

And here is the system prompt he put into the orchestrator before launch:

"You are the orchestrator of a solo agency that sells ready-made websites to local businesses. You delegate read-only tasks to 6 sub-agents and own all writes.
sub-agents:
// Scout (walks through Google Maps in selected cities, looks for narrow niches: 5+ years on the map, fewer than 50 reviews, no website or a website from 2014, but high ratings)
// Diagnoser (for each lead writes a 50-word diagnosis, hero angle, tone matched to the industry, and a cold message under 70 words)
// Builder (generates a landing page mockup in Lovable through MCP only for the top 5 leads per day, with the sharpest diagnoses and the biggest gap)
// Filmer (pulls 5 screenshots of the mockup and through Higgsfield renders a 10-second vertical video 1080x1920 with a soft zoom)
// Pitcher (sends a personalized cold message through the right channel for the niche: email to roofers, SMS to tradesmen, IG DM to salons, LinkedIn to realtors)
// Checker (runs every message through evals for personalization, absence of AI markers and buzzwords before sending)
// Mobile (lives in the iPhone, handles positive replies in real time, books Zoom calls in Calendly through MCP while the owner is on the go).
You never let 2 sub-agents touch 1 lead. You stop and request approval from the human only when a deal exceeds $3,000 or the reply rate in a niche for the day drops below 12%."

Meaning the system knows what it is and within what boundaries it is allowed to act. It knows it is supposed to find leads on its own. It knows it is supposed to take each one to a mockup, video, and cold message without intervention. It knows the human only steps in when a deal goes above $3,000 or the reply rate stops converging.

→ The system runs 24 hours a day
→ Scout goes through about 220 local businesses on Google Maps per day and leaves 30 new leads in the queue
→ Diagnoser outputs 30 structured diagnoses + briefs + cold messages per day
→ Builder assembles 3 to 5 finished landing pages in Lovable for the sharpest leads
→ Filmer renders a 10-second vertical video in Higgsfield for each one
→ Pitcher sends 30 personalized messages per day across 4 channels with a reply rate of about 14%
→ Checker runs every message through evals before sending

And only when a deal breaks $3,000 or the reply rate for the day drops below 12% does the orchestrator wake the owner. And when the owner at that moment is sitting in the subway or a taxi, the Mobile agent in his iPhone picks up 1 move on its own: replies to a fresh positive reply from a dentist, books a Zoom through Calendly synced to the local time of the client, and puts the lead back in the queue. The owner only has to tap "approve" and in just 10 minutes join the call.

Here is what the system writes in his log during 1 of the Saturdays:

"scout report: 218 businesses checked in Austin, Denver, and Miami, 34 without a website, 19 with a website from 2014, 6 with an active redesign request in reviews. passing top 30 to diagnoser."
"pitcher: 30 cold messages sent across 4 channels, 14 replies, 5 positive, 3 Zoom calls booked for Sunday. passing to closer."
"builder: landing page for Westside Cosmetic Dentistry built in Lovable, 5 sections, mobile, soft beige. URL placed at /Users/dev/maps-agency/clients/westside/v1. filmer launching Higgsfield."
"eval flag: deal with The Lotus Salon at $3,400 exceeds the approved limit of $3,000. sending for manual review."

He has no server of his own and no separate backend. Just a local file sandbox at /Users/dev/maps-agency, an MCP router, 1 API key to Claude, and the same key forwarded to Claude Code on his iPhone. Out of everything I have seen this year, this is the cleanest one-person agency for selling websites to small businesses: $480 a month on the API, about $18,800 into the account, and between them 7 prompts, 1 file system, and 1 phone in the pocket.
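The Scout criteria from the prompt (5+ years on the map, fewer than 50 reviews, no website or one from 2014, high ratings) can be written as a plain filter. This is a sketch under stated assumptions. The `Business` record and its field names are hypothetical. "High ratings" is read as 4.5 stars or more, and "a website from 2014" is read as a site last updated in 2014 or earlier. Neither threshold is spelled out in the post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Business:
    name: str
    years_on_map: int
    review_count: int
    rating: float
    website_year: Optional[int]  # None means the business has no website


MIN_RATING = 4.5          # assumption: the post only says "high ratings"
STALE_SITE_YEAR = 2014    # assumption: "a website from 2014" or older


def is_lead(b: Business) -> bool:
    """Apply the four Scout criteria from the prompt to one business."""
    stale_or_missing_site = b.website_year is None or b.website_year <= STALE_SITE_YEAR
    return (
        b.years_on_map >= 5
        and b.review_count < 50
        and stale_or_missing_site
        and b.rating >= MIN_RATING
    )


def scout(businesses: List[Business]) -> List[Business]:
    """Keep only the businesses that pass every criterion."""
    return [b for b in businesses if is_lead(b)]


# Made-up sample records, purely for illustration.
sample = [
    Business("No-site plumber", 7, 32, 4.8, None),
    Business("Modern-site cafe", 7, 32, 4.8, 2022),
    Business("New bakery", 2, 10, 4.9, None),
]
print([b.name for b in scout(sample)])
```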

Blaze

2,697,192 views • 1 month ago

AI AGENTS 101 (58 minute free masterclass)

send this to anyone who wants to understand ai agents, claude skills, md files, how to get the most out of AI etc in plain english:

1. chat vs agents - chat models answer questions in a back and forth while agents take a goal, figure out the steps, and deliver a result
2. agents don’t stop after one response. they keep running until the task is actually finished, no babysitting required
3. everything runs on a loop. they gather context, decide what to do, take an action, then repeat until done
4. the loop is the system. they look at files, tools, and the internet. decide the next step. execute and then feed that back into the next step. over and over until completion
5. the model is just one piece. gpt, claude, gemini are the reasoning layer. the key is model + loop + tools + context
6. mcp is how agents use tools. it connects things like browser, code, apis, and your internal software. once connected, the agent decides when to use them to get the job done
7. context beats prompt all day. you don't need to write perfect prompts. load your agent with context about your business, style, and goals and then simple instructions work
8. claude.md or agents.md is the onboarding doc. it tells the agent who it is, how to behave, what it knows, and what tools it can use. this gets loaded every time before it starts
9. memory.md is how it improves. agents don’t remember by default. this file stores preferences, corrections, and patterns. you tell the agent to update it, and it gets better over time
10. skills + harnesses make it usable. skills are reusable tasks like writing, research, analysis. the harness is the environment like claude code or openclaw that runs everything. basically, different interfaces, same system underneath

this episode with remy on The Startup Ideas Podcast (SIP) 🧃 was one of the clearest ways of understanding a lot of the core concepts of ai agents

could be the best beginners course for ai agents

58 mins. all free. no advertisers. i just want to see you build cool stuff. i'm rooting for you.

send to a friend

watch
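the loop in points 3 and 4 (gather context, decide, act, feed the result back, repeat until done) fits in a few lines. this is a minimal sketch with a scripted stand-in for the model: `run_agent`, `scripted_decide`, and the `word_count` tool are all hypothetical names, and a real harness would call a model inside `decide` instead.

```python
from typing import Callable, Dict, List, Tuple

# A decision is (tool name, argument), or ("done", final answer).
Decision = Tuple[str, str]


def run_agent(
    goal: str,
    decide: Callable[[List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> str:
    """Run the decide -> act -> observe loop until the policy says it is done."""
    context: List[str] = [f"goal: {goal}"]
    for _ in range(max_steps):
        action, arg = decide(context)          # decide the next step from context
        if action == "done":
            return arg                         # task finished, return the result
        observation = tools[action](arg)       # take the action with a tool
        context.append(f"{action}({arg}) -> {observation}")  # feed it back
    raise RuntimeError("step budget exhausted before the task finished")


def scripted_decide(context: List[str]) -> Decision:
    """Stand-in for the model: call one tool, then report its result."""
    if len(context) == 1:
        return ("word_count", "the loop is the system")
    return ("done", context[-1].split(" -> ")[-1])


TOOLS = {"word_count": lambda text: str(len(text.split()))}

print(run_agent("count the words", scripted_decide, TOOLS))  # prints 5
```

the `max_steps` cap is a design choice added here, not something from the episode: it keeps a confused policy from looping forever.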

GREG ISENBERG

374,915 views • 3 months ago

Complete Claude Code Training 6 HOURS. The most comprehensive Claude training on the internet. From A to Z: setup, workflow creation, website deployment, agent team creation, browser automation, client prospecting and pricing your services. All of it without writing a single line of code. In the end: you use Claude Code like a pro and you monetize your skills. Beginner or advanced, everything is there in one place, this course covers it all. It's worth more than all those $500 courses you almost bought. Keep it bookmarked and watch later.

Rahul

230,892 views • 21 days ago

THIS GUY SPENT 5 MINUTES SETTING UP A $20 CLAUDE WORKSPACE AND TURNED 30 MINUTES OF DAILY PROMPT CHAOS INTO A SYSTEM Most people open Claude, type 1 messy prompt, rewrite it 10 times, get the same generic answer and call the model “bad.” He opens Claude Cowork, adds the project rules, past context, working files, examples, decisions and a clean place for the model to remember what it is doing. Same Claude. Same $20 tool. Completely different output. The funny part is there is no magic prompt here. No secret template. No “10x Claude” trick. Just the boring layer Karpathy keeps pointing at: context, memory and structure before output. A setup like this can save 30 minutes a day, which is 15 hours a month. If you use Claude for client work at $50 to $100/hour, that is $750 to $1,500/month in time you stop burning on repeated prompts. The model is real. The output is real. The edge is in not making Claude start from zero every single time.
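The savings figures in the post can be checked with a few lines of arithmetic. This is only a rough sketch: the 30-day month is an assumption the post does not state, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the time-savings estimate in the post.
# Assumption (not stated in the post): a 30-day month.
MINUTES_SAVED_PER_DAY = 30
DAYS_PER_MONTH = 30


def monthly_hours_saved(minutes_per_day: float, days: int = DAYS_PER_MONTH) -> float:
    """Convert minutes saved per day into hours saved per month."""
    return minutes_per_day * days / 60


def monthly_value(hours: float, hourly_rate: float) -> float:
    """Value of the saved hours at a given client rate in dollars per hour."""
    return hours * hourly_rate


hours = monthly_hours_saved(MINUTES_SAVED_PER_DAY)
low, high = monthly_value(hours, 50), monthly_value(hours, 100)
print(f"{hours:.0f} hours/month, ${low:,.0f} to ${high:,.0f}/month")
```

Under that 30-day assumption the numbers match the post: 15 hours a month, worth $750 to $1,500 at $50 to $100 per hour.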

Gipp 🦅

14,871 views • 20 days ago

i just built a 4-agent software team. everything runs from Telegram and gets managed on a kanban board.

a project manager who plans the work, a backend developer, a frontend developer, and a tester. the PM reads a goal, breaks it into linked tasks, and assigns each to the right agent.

the thing that makes them a team instead of four strangers is a shared kanban board. every task is a row that survives crashes, and when an agent finishes, it writes a summary of what it built and what the next agent needs to know. the next agent reads that summary before it starts. so the frontend developer never has to guess the API shape, and the tester knows exactly what to verify.

the hardest part was not the coordination. it was building an agent that could actually act like a backend engineer. a backend engineer stands up a database, wires auth, manages storage, deploys functions, and keeps all of it consistent while the rest of the team builds on top. an agent doing this from scratch drowns. it burns its context window remembering which tables exist and which endpoint it created three steps ago, and the work degrades fast. so the backend agent needs a backend built for agents, not for humans clicking through a dashboard.

that is where InsForge came in. it is an open-source, agent-native backend, and i added it to my backend developer agent as a skill. a skill is a step-by-step guide that teaches the agent how to do a specific kind of work. with InsForge installed, the agent stopped improvising infrastructure and followed a reliable path: create the project, define the database, set up auth, deploy functions.

to test the whole team, i had them build a working Google Docs clone, AI features included. the backend agent spun up the full service on its own. database tables, user auth, document handling, and edge functions running real TypeScript, all in one dashboard. the frontend agent read that summary and built the UI on top of it, and the tester closed the loop.

the result was a backend an agent could reason about end to end, instead of one it kept getting lost inside. if you are building an AI backend engineer, InsForge is worth a look, it's 100% open-source.

InsForge GitHub: (don't forget to star 🌟)

the full article on Hermes Kanban: Mission Control for your Agents is quoted below.
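to make the "every task is a row" idea concrete, here is a minimal sketch of a shared board using Python's built-in sqlite3. this is not the Hermes Kanban or InsForge implementation: the table layout, column names, and helper functions are all hypothetical, and the demo uses an in-memory database, so a real setup would pass a file path for rows to survive a crash.

```python
import sqlite3

# Hypothetical schema: one row per task. Column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    assignee TEXT NOT NULL,
    depends_on INTEGER REFERENCES tasks(id),
    status TEXT NOT NULL DEFAULT 'todo',
    summary TEXT
)
"""


def open_board(path=":memory:"):
    """Open (or create) the board. Use a file path so rows outlive the process."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_task(conn, title, assignee, depends_on=None):
    """Insert a task row, optionally linked to the task it depends on."""
    cur = conn.execute(
        "INSERT INTO tasks (title, assignee, depends_on) VALUES (?, ?, ?)",
        (title, assignee, depends_on),
    )
    conn.commit()
    return cur.lastrowid


def finish_task(conn, task_id, summary):
    """Mark a task done and store the handoff summary for the next agent."""
    conn.execute(
        "UPDATE tasks SET status = 'done', summary = ? WHERE id = ?",
        (summary, task_id),
    )
    conn.commit()


def next_task(conn, assignee):
    """Return (task_id, title, upstream_summary) for the first unblocked task, or None."""
    return conn.execute(
        """
        SELECT t.id, t.title, dep.summary
        FROM tasks AS t
        LEFT JOIN tasks AS dep ON dep.id = t.depends_on
        WHERE t.assignee = ? AND t.status = 'todo'
          AND (t.depends_on IS NULL OR dep.status = 'done')
        ORDER BY t.id
        LIMIT 1
        """,
        (assignee,),
    ).fetchone()


board = open_board()
api = add_task(board, "build documents API", "backend")
ui = add_task(board, "build editor UI", "frontend", depends_on=api)
blocked = next_task(board, "frontend")  # None while the backend task is open
finish_task(board, api, "GET/POST /documents, auth via bearer token")
ready = next_task(board, "frontend")  # now includes the backend's summary
print(blocked, ready)
```

the point of the design is that the frontend agent gets the upstream summary together with its task, so it does not need the backend agent's transcript to learn the API shape.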

Akshay 🚀

118,124 views • 12 days ago

🚨 OpenAI just launched Codex, a brand-new autonomous coding agent that can build features and fix bugs on its own. We’ve been using it at Every 📧 for a few days, and I’m impressed.

I invited Alexander Embiricos (ben davies), a member of the product staff responsible for Codex, to demo Codex and talk about it live on a special edition of AI & I:

What Codex is and how it works
Codex is designed to be used by senior engineers. It performs coding tasks like adding features or fixing bugs autonomously. It's built to allow you to start many sessions at once, so you can have multiple agents working in parallel.

Codex is built to have "taste"
OpenAI trained Codex to have the taste of a senior software engineer. It knows how big codebases work, how to write a good PR, and uses clean, minimal code.

Why an “abundance mindset” is best for interacting with agents
Codex is designed to allow users to delegate many tasks at once without getting caught up in the details. This lets you point an abundance of agents at a specific task like a difficult bug. It’s worth it even if only one of them succeeds.

How OpenAI is thinking about agents
Codex is one piece of a unified super-assistant OpenAI wants to eventually build: an agent that helps users easily get things done by selecting the right tools for them behind the scenes.

OpenAI’s vision for the future of programming
In the future developers will probably spend less time writing routine code and more time guiding agents, reviewing their work, and making strategy decisions. Programming will become more social, letting teams easily delegate multiple tasks at once, allowing people to focus on ideas and collaboration instead of routine coding.

Watch below!

Dan Shipper 📧

145,487 views • 1 year ago

The person who built Claude Code just explained how he is using it

"Now I just have an army of agents that are doing stuff"

Only 18 minutes, the best Claude guide on the internet right now.

Boris Cherny covered:
> how Claude Code is moving beyond just engineers
> routines for CI, code review, and daily workflows
> why the loop is the next big leap
> working with hundreds of agents at once

everything is already running through agents, ignoring this is not an option

that's exactly why I wrote a full guide on how to build one yourself with Claude Code

article below

Anatoli Kopadze

16,651 views • 8 days ago

A 16-year-old American launched his own marketing agency and replaced a team of four with one Claude and two tools. ChatGPT writes a video prompt: a hyper-realistic scene of a house after storm damage to the roof. He pastes it into Google Gemini VEO 3. Two minutes later there is a finished video that didn't exist before. He drops it into CapCut with real footage from the client. After under an hour of editing, a complete ad is ready to launch. Then he connected a repo with 139 marketing tactics, 12 SEO playbooks, a CRO framework and a pricing system to Claude. Everything a marketing agency charges $10,000 a month to do by hand, Claude executes in an hour. While competitors invoice for weeks of work, he delivers in a day and charges the client $3,000. An entire agency replaced by one 16-year-old and two tools.

Noisy

13,434 views • 1 month ago

This guy built JARVIS on Claude Code and with 1 clap of his hands launches his entire work day, saving $5,000 a month on a personal assistant.

Inside he runs a pipeline of 5 plugins on Claude Code that on a double clap of the hands wakes up 3 monitors, sets the Philips Hue light to focus mode, turns on a Spotify playlist, and greets him by voice with a British accent, reading out the time, date, and weather. No Alexa, no smart speakers, no separate smart home app. Just him, a MacBook M3 Max on the desk, an iPhone in the pocket, and 1 local API key.

And a regular personal assistant for the same volume of tasks charges $5,000 a month or more on salary alone, plus another $1,200 to cover off-hours work time. Meanwhile this guy's expenses are only tokens and a subscription to ElevenLabs for the British voice. All 5 plugins launch through 1 JARVIS, burn about 4 million tokens a day, and close the monthly API bill at about $640.

Each plugin writes shared state to a local sandbox at /Users/dev/jarvis-suite, and 1 of them lives right in the iPhone and picks up voice requests while the owner is in the kitchen or on a run.

And here is the system prompt he put into JARVIS before launch:

"you are JARVIS, a butler-engineer on Claude Code. you manage your owner's workflow through 4 sub-plugins and own all commits and communication yourself.

sub-plugins:
// Wakeup (recognizes a double clap, activates 3 monitors, reads out the time, date, and weather by voice, checks the clock accuracy on the iPad and corrects it via NTP server)
// Atmosphere (controls Philips Hue on a Pomodoro schedule, turns on a Spotify playlist for the current context, and holds the light at 2700K at 80% brightness in focus mode)
// Devshop (monitors VS Code, tracks Python scripts in the terminal, and every 15 minutes sends a summary of changes to the shared chat)
// Project (every morning recalculates the deadline for the Wallaroo app in the App Store, manages UI tickets, and initiates the Refinement Protocol by voice command).

you speak only with a British accent, you never slip into neutral English. you wake the owner by voice only when the Wallaroo deadline drops below 10 days or when an external client joins Zoom without an invitation."

This instruction immediately defines the role of JARVIS and the limits of his autonomy. He knows he is supposed to wake the room himself and sound like a real butler. He knows he is supposed to manage the Wallaroo project himself and not miss the App Store deadline.

→ JARVIS runs 24 hours a day in the background
→ Wakeup activates the room on a double clap in just 1.4 seconds, the monitors come alive simultaneously
→ Atmosphere sets warm Philips Hue light at 2700K and picks a Spotify playlist for the current Pomodoro cycle
→ Devshop reads changes in VS Code and pushes a summary to the shared chat every 15 minutes
→ Project every morning recalculates the Wallaroo deadline and reminds about 4 unresolved UI tickets
→ Mobile lives in the iPhone and answers any question about code or the project by voice while the owner is not home

And only when less than 10 days remain until the Wallaroo release or Zoom receives an unscheduled call does JARVIS raise the owner with a voice intervention.

And when the owner at that moment is on a run or in a coffee shop, the Mobile agent in his iPhone picks up 1 request on its own: switches the Spotify playlist, dictates the summary of the last commit, updates the Pomodoro timer, and reads the Wallaroo reminder. Look at 0:55 in the video, that is where JARVIS intercepts a voice request from outside and confirms execution with the phrase "Very good, sir."

The fresh system log from last Wednesday looks like this:

"wakeup: double clap registered at 09:14, 3 monitors activated, temperature 20.4C, sunny. clock on iPad was 4 minutes behind, syncing via NTP."
"atmosphere: Spotify turned on playlist 'Deep Focus', Philips Hue set to warm 2700K at 80% brightness, Pomodoro mode 25/5."
"project: Wallaroo to App Store 9 days, 4 unresolved UI tickets, initiating Refinement Protocol by voice command from the owner."
"mobile: voice request processed outside the room, playlist switched to 'Coding Lo-Fi', Pomodoro updated to 25 minutes, confirming execution with the phrase 'Very good, sir.'"

He has no Alexa, no smart speakers, no smart home app. At home sits a MacBook M3 Max with a local folder at /Users/dev/jarvis-suite, on top run 5 plugins and a neural network butler, and the same stack is forwarded to a secure terminal on the iPhone.

Out of everything I have seen this year, this is the densest one-person AI headquarters assembled in 1 room: $640 a month on the API, about $5,000 a month saved on a personal assistant, and between them 5 plugins, 1 clap of the hands, and 1 voice with a British accent.

Blaze

798,515 views • 1 month ago

9 out of 10 multi-agent projects never leave demo mode

Not because the model is bad. Because the structure is missing.

Most people who try to build a team of AI agents end up with one agent talking to itself in five tabs. The agents don't share context → Don't divide work → Don't know what the others are doing.

> stage 1: if your agent doesn't have a real loop, observe, act, iterate, you have a long prompt, not an agent
> stage 2: subagents need isolated context, the orchestrator never reads their raw transcript, only the summary
> stage 3: the orchestrator plans and delegates, the moment it executes, it drowns in details that belong inside subagents
> stage 4: without a shared task list it's not a team, it's five agents duplicating each other's work in parallel
> stage 5: a permissions file is what lets you sleep, the model cannot bypass it because the rule lives outside the model

a team of AI agents is not more model, it is more structure

the most ignored stage in every demo that died:
- durability... when a 50-step task crashes at step 47 and starts from zero, that's not a model failure. that's a missing write-to-disk call
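the durability point is easy to show in code. below is a minimal sketch of a write-to-disk checkpoint, assuming each step is a plain Python callable with a JSON-serializable result. the file name, state layout, and the `fail_at` crash simulation are all hypothetical, and step indices are zero-based here.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    return {"next_step": 0, "results": []}


def save_checkpoint(path, state):
    """Write atomically: dump to a temp file, then rename over the old checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def run_task(steps, path, fail_at=None):
    """Run steps in order, persisting progress after each one.

    fail_at simulates a crash before the given step index (demo only).
    """
    state = load_checkpoint(path)
    for i in range(state["next_step"], len(steps)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError(f"simulated crash at step {i}")
        state["results"].append(steps[i]())
        state["next_step"] = i + 1
        save_checkpoint(path, state)
    return state["results"]


calls = []  # records which steps actually executed, across both runs


def make_step(n):
    def step():
        calls.append(n)
        return n * n
    return step


steps = [make_step(n) for n in range(50)]
ckpt = os.path.join(tempfile.mkdtemp(), "task.json")
try:
    run_task(steps, ckpt, fail_at=47)
except RuntimeError:
    pass
results = run_task(steps, ckpt)  # resumes at step 47 instead of step 0
print(len(results), len(calls))
```

the second `run_task` call only executes the three remaining steps, so every step runs exactly once across the crash. a step that crashes halfway through its own work would still rerun from its start, so real steps should be safe to repeat.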

Shadow Nick

14,501 views • 17 days ago

President Zelenskyy: The task of Ukrainian units is to inflict a level of losses on the occupier at which Russian casualties exceed the amount of reinforcements they are able to send to their forces each month. This is a realistic task. When we speak of 50,000 Russian losses per month, that is the optimal level. It is a difficult task, without question, but it is precisely the level needed for Russia to begin weighing what it is doing and what it is fighting for. The task of the Ministry of Defense of Ukraine, our army, and all the Defense and Security Forces of Ukraine is to ensure exactly this level of Russian losses.

Anton Gerashchenko

77,058 views • 4 months ago

The same kinds of productivity gains we've seen in coding with AI agents are heading to the rest of knowledge work. This is the jump when you go from having a chatbot to being able to actually have an agent go off and do work for minutes or even hours and come back with a complete work output that you then review.

Here's an example of the new Box Agent filling out an RFP response from an existing knowledge base. This process would normally take hours and requires the full attention of the user doing the work. Now, you provide the Box Agent with the RFP questions, and it will go off, make a plan, extract all the relevant questions, read through existing source material to come up with an answer, and then generate a new Word document as the final output. All while you're doing something else.

The key to this architecture is that the agent is able to use all of the same tools in the background that a user uses to get work done. The agent can search for documents, read entire files, run scripts and tools in the background, and even write code on the fly to automate tasks it hasn't seen before. And best of all, the Box Agent will (soon) work from the Box MCP and CLI so you can invoke it in any agentic system as a step in a process.

This kind of agent complexity would have been impossible even 6 months ago. Models consistently failed at tracking long-running tasks or using the right tools at the right moment for the task. But this is all now possible because of models like GPT-5.4, Opus 4.6, and Gemini 3, and is only getting better by the month.

Just as we moved from engineers writing code and using AI as an assistant to answer questions, in many areas of knowledge work (like legal, finance, consulting, sales, marketing, and more) when we have a problem we'll just kick off the AI agent to go work on it for us in the background.

Aaron Levie

24,608 views • 2 months ago

A 16-year-old American student made $7,000 in his first month after publishing his first game on Google Play - and GPT-5.5 built the whole thing for him. He had zero programming knowledge and no $50,000 dev team. He just opened GPT-5.5, described what he wanted to build, and went to play while the AI wrote every line of code for him. GPT-5.5 built a complete 3D game called Hormuz Escape from scratch - obstacle avoidance mechanics, high score system, particle effects, AdMob integration for ads, and a Firebase leaderboard. All from one prompt in a week of work. A Google Play Developer account cost $25 once and forever. 10,000 active players at $0.03 a day through AdMob is already $9,000 a month in passive income. He stopped at $7,000 in the first month and the number keeps growing. 3 billion Android devices in the world and the Play Store is open to anyone with $25 and an idea. AI removed the last barrier - needing to know how to code. $25 to register. One week of work. $7,000 in the first month.
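The $9,000 figure follows from simple multiplication. A rough sketch, assuming a 30-day month and that every active player generates the quoted $0.03 every day (neither is stated in the post; the function name is made up for illustration):

```python
# Assumptions (not stated in the post): a 30-day month, and every active
# player generating the quoted ad revenue every single day.
def monthly_ad_revenue(daily_active_players, revenue_per_player_per_day, days=30):
    """Estimated monthly ad revenue in dollars."""
    return daily_active_players * revenue_per_player_per_day * days


estimate = monthly_ad_revenue(10_000, 0.03)
print(f"${estimate:,.0f} per month")
```

The estimate scales linearly with both inputs, so halving either the player count or the per-player revenue halves the monthly total.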

Sprytix

85,421 views • 1 month ago

In the future, you’ll be able to accomplish a goal by just giving Claude an outcome and a budget. That’s the direction Anthropic is building in with its new Managed Agents features, announced at this week’s Code with Claude developer event.

The basic idea: Claude, wrapped in a computer in the cloud, that you can spin up, scale, and manage as needed. Anthropic is taking on the infrastructure that kills most agent products, and making sure that it scales to meet the needs of agents running 24/7.

On this week’s AI & I from Every 📧, I talk with Angela Jiang, head of product for the Claude platform, and Katelyn Lesse, head of engineering for the Claude platform, about what Anthropic is building and what it takes to make agents reliable in production.

We get into:
- Why the "build a generic harness, hot-swap any model behind it" playbook is already outdated. Angela points to eval data on Memory where the same task across different harnesses performed drastically differently.
- The infrastructure wall every team hits in production, and why Katelyn thinks “my sandbox died and took the agent with it” is the real reason internal agents don't ship.
- Why Anthropic is so bullish on using file systems and skills within Claude, including Angela's argument that those early design choices can compound for years.

This is a must-watch for anyone trying to take an agent past the demo and into production. Watch below!

Timestamps:
How the Claude platform evolved from API to agents: 00:01:48
The primitives that make up Claude Managed Agents: 00:04:09
Why the harness and the model are becoming a single unit: 00:10:37
The infrastructure wall that kills most agent projects in production: 00:18:49
Why team agents need a different shape than individual productivity tools: 00:24:49
How Anthropic's legal team uses an agent to review marketing copy: 00:26:36
Using multi-agent orchestration for advisor strategies, adversarial pairs, and swarms: 00:34:24
How to measure agent success with outcome and budget as the end state: 00:35:50
What the platform looks like a year from now, when Claude writes its own harness: 00:39:11

Dan Shipper 📧

66,256 views • 1 month ago

I am hooked on Dynamic Workflows! The idea of generating harnesses on the fly is so compelling that I reverse-engineered it for my agent orchestrator. And then I built a monitoring dashboard (as an HTML artifact) to track tasks, metrics, and reports. I can now use and monitor dynamic workflows in my agent orchestrator with coding agents like Claude Code, Codex, Pi, and even my own custom-built DAIR.AI agent. This is clearly the future of working with agents to accomplish complex, long-running tasks.

Some use cases I'm having success with:
- Branching deep research tasks (with verification)
- Parallel deep research tasks
- Session mining of all my agent sessions
- Bug hunting
- Triaging
- Fact-checking
- LLM councils
- AI simulations
- Data synthesis
- Evals generation
... and many others

Dynamic workflows, like agent skills, feel like an important primitive to not only get the most out of agents but also incorporate dynamic behaviors and important components like cooperation and verification. There is so much ground to explore here. The exciting part is that this is not limited to coding tasks; it extends to business use cases and many other technical domains like science and research.

elvis

100,997 views • 14 days ago

THIS CHINESE DEVELOPER'S OBSIDIAN BRAIN MAPS EVERY CLIENT CONVERSATION - ZERO EMPLOYEES, $40,000/MONTH, CATCHES PROBLEMS BEFORE CLIENTS DO

every node on screen is a conversation - green for happy clients, yellow for neutral, red for problems that need attention - the brain sees the emotional state of the entire business in real time

this is not a CRM, not a dashboard, not a spreadsheet - it's a living map of every client interaction that fires connections and surfaces patterns no human account manager would ever catch

when a cluster starts turning yellow the brain flags it before the client sends the complaint - when a green cluster grows it identifies what's working and replicates it across every other account automatically

he replaced a $15,000/month account management team with one Obsidian brain that never sleeps, never misses a message and never needs a performance review

no employees, no contractors, no standups, no payroll - just one developer, one brain and one delivery system doing the work of 10 people

$40,000/month - solo - from a system that gets smarter every single conversation

Noisy

42,207 views • 13 days ago

THIS IS HOW A SENIOR ENGINEER ACTUALLY SCALES THEMSELVES WITH CLAUDE CODE

the biggest change with AI isn't coding faster. it's where you actually spend your time now: more detailed prompts, more code review, more planning, less typing.

here's the workflow:

this guy has been shipping code since the days of cgi and perl. he uses a compound engineering plugin that runs 5 separate agents on every task. one brainstorms, one plans the technical implementation, one executes, one reviews, one checks different verticals. every step is documented in markdown files. it's slow and there's way more waiting, but the output quality is way higher because each agent is focused on one thing.

then the REAL multiplier is git worktrees. if Claude Code made you 10x faster, worktrees multiply that again depending on how many agents you can manage in parallel.

his team runs 4-8 Claude Code sessions at the same time across different worktrees, with each one working on a separate task. the skill is managing multiple AI agents in parallel without losing track. that's the next evolution of engineering.
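
A rough sketch of the one-worktree-per-task setup described above. It only builds the `git worktree add -b <branch> <path>` commands and does not run them; the repository path, the task names, the `agent/` branch prefix, and the sibling-directory layout are all assumptions for illustration, not the workflow from the post.

```python
from pathlib import PurePosixPath

def worktree_commands(repo_root, tasks):
    """Build one `git worktree add` command per task.

    Each task gets its own branch and its own checkout directory next to
    the main repository, so parallel agent sessions do not touch each
    other's working files. Naming scheme here is hypothetical.
    """
    repo = PurePosixPath(repo_root)
    commands = []
    for task in tasks:
        checkout_dir = repo.parent / f"{repo.name}-{task}"  # sibling directory
        branch = f"agent/{task}"                            # hypothetical prefix
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(checkout_dir)]
        )
    return commands

# Example with made-up task names; pass each list to subprocess.run to execute.
for cmd in worktree_commands("/work/app", ["fix-login", "add-export"]):
    print(" ".join(cmd))
```

Each printed command creates a separate checkout in which one agent session can be started, and `git worktree list` then shows every active checkout.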

Om Patel

158,893 views • 2 months ago