Sharing a pretty important primitive in my agentic engineering setup. I call it "gnhf": good night, have fun. Basically, every night before I go to bed, I put my agents to work so I never wake up "empty-handed". It's done through a similar setup as geoff's famous...

59,814 views • 3 months ago • via X (Twitter)

Related Videos

Three skills I use every day in Claude Code and Codex to solve my hardest problems:

1️⃣ /agent-watchdog
When I have one agent like Codex working on a task and I don't fully trust it's going to do everything right, I'll open up another one like Claude Code and tell it to watchdog the Codex thread. You can copy the Codex deep link into Claude Code and it'll look at the prompt you sent, watch the Codex thread until it's done, then compare the Codex solution to how it was planning to solve it and automatically fix anything that Codex missed. It can also test the work of the other agent end-to-end. Similar to the idea of OpenRouter's new Fusion feature, I've definitely found that two models thinking through a problem and checking each other's work can be wildly more impactful than just one.

2️⃣ /plan-arbiter
Similar ideas as /agent-watchdog - but with this one you have both make plans, compare plans, negotiate the differences, and make a final plan to execute. I find Claude Code is better at writing plans, but Codex is faster and cheaper to execute on them. Then I usually have Claude Code watchdog the Codex work and fix anything that was missed.

3️⃣ /read-the-damn-docs
One thing that drives me crazy with coding agents is they're so reluctant to look up docs. They'll just guess and guess and guess at the right API surface for things, or the right solution to an integration of two things. Once I explicitly tell it to look up the docs, it says "Oh, I see the answer," and it fixes the problem. So I made the /read-the-damn-docs skill. Add it and your agents will know when and how to do efficient web searches to look up docs for the types of problems you really should look up docs for.

All of these are totally open source over on my GitHub. If you try them, let me know your feedback. Will link to them below:

Steve (Builder.io)

35,213 views • 10 days ago
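
To make the plan-then-watchdog flow from the post above more concrete, here is a rough Python sketch. It is not the author's skill implementation: `Agent`, `arbitrate_plans`, and `run_with_watchdog` are hypothetical names, and an "agent" is assumed to be any function that takes a prompt string and returns text. How a real Claude Code or Codex session would be wired in is left out.

```python
from typing import Callable

# Hypothetical stand-in: an "agent" is any function from a prompt to a text reply.
Agent = Callable[[str], str]


def arbitrate_plans(task: str, planner_a: Agent, planner_b: Agent,
                    max_rounds: int = 2) -> str:
    """Both agents plan independently, negotiate, then planner_a writes the final plan."""
    plan_a = planner_a(f"Write a plan for: {task}")
    plan_b = planner_b(f"Write a plan for: {task}")

    for _ in range(max_rounds):
        if plan_a == plan_b:
            break  # nothing left to negotiate
        # Each agent revises its own plan after seeing the other's.
        # Both prompts are built from the previous round's plans.
        plan_a, plan_b = (
            planner_a(f"Task: {task}\nYour plan:\n{plan_a}\nOther plan:\n{plan_b}\nRevise."),
            planner_b(f"Task: {task}\nYour plan:\n{plan_b}\nOther plan:\n{plan_a}\nRevise."),
        )

    # Assumption: planner_a is the stronger plan writer, so it merges the result.
    return planner_a(f"Task: {task}\nMerge into one final plan:\n{plan_a}\n---\n{plan_b}")


def run_with_watchdog(plan: str, executor: Agent, watchdog: Agent) -> str:
    """One agent executes the plan; a second reviews the result against the plan."""
    result = executor(f"Execute this plan:\n{plan}")
    return watchdog(
        f"Plan:\n{plan}\nResult:\n{result}\n"
        "Fix anything that was missed and return the final result."
    )


if __name__ == "__main__":
    # Toy agents that just echo a label, to show the call shape only.
    planner = lambda prompt: "1. read docs 2. write code 3. test"
    executor = lambda prompt: "done: code written"
    final_plan = arbitrate_plans("add a login page", planner, planner)
    print(run_with_watchdog(final_plan, executor, planner))
```

In this sketch the negotiation is capped by `max_rounds` so two agents that never converge cannot loop forever; the cap value is an arbitrary choice for illustration.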

New Andrej Karpathy interview

Says AI agent failures stem from user skill, not model capability. Poor instructions cause errors. He suggests delegating 20-minute macro actions like coding and research to parallel agents and reviewing their work.

---

"I think everything, like so many things, even if they don't work, I think to a large extent you feel like it's a skill issue. It's not that the capability is not there; it's that you just haven't found a way to string together what's available. Like, I didn't give good enough instructions to the agents in the file, or whatever it may be. I don't have a nice enough memory tool that I put in there, or something like that. So, it all kind of feels like a skill issue when it doesn't work to some extent. You want to see how you can parallelize them, and you want to be a 'Pierce tender,' basically. Pierce famously has a funny photo where he's in front of lots of these Codex agents behind the monitor. They all take about 20 minutes if you run them correctly and use high effort. You have multiple, you know, 10 or 20, pull requests checked out. It's just like you can do much larger macro actions. It's not just, 'Here's a line of code, here's a new function.' It's like, 'Here's a new functionality, delegate it to agent one. Here's a new functionality that's not going to interfere with the other one, give it to agent two.' Then, you try to review their work as best as you can, depending on how much you care about that code. You look for these macro actions that you can manipulate your software repository by. Another agent is doing some research, another agent is writing code, another one is coming up with a plan for some new implementation. Everything just happens in these macro actions over your repository. You're just trying to become really good at it and develop a muscle memory for it. It's very rewarding when it actually works, but it's also a new thing to learn. Hence, the psychosis."

---

From No Priors YT channel (link in comment)

Rohan Paul

23,090 views • 3 months ago