Since I joined Cognition I've been obsessed with learning how our eng team uses Devin themselves.

If we are building the best coding agent + we have the most cracked engineers + we've been fully AI-pilled from day one... it stands to reason that there is a lot to... learn by just watching our technical staff work.

And yes, there are a lot of tips & tricks. I recorded a video talking about my favorite...

Agent Fan Out - asking your agent to break down the problem, spin up 10 more agents in parallel, and combine their results.

This is something I've seen everyone do - from our model research team spinning up 100 Devins to examine eval logs - or our product team using 5 child Devins to try out 5 different alternative implementations of the same thing.

If engineering is cheap and easy, why not build the product 10 times and choose the best one?

Think of it in a master/slave context: Master Devin -> 10 Slave Devins -> Master Devin pulls their results

There are two reasons this is useful:

1. Agents are smartest when their context is small and their task is small & precise. Context windows are finite and too much becomes distracting
2. Agents are good at helping you break a large problem into independent & parallelizable chunks of work

Every Devin is its own VM/computer so this also is just a great way to move faster. I've done a migration from React Native to Swift by having Devin break it up into 6 pieces then spin up new Devins to work in parallel.

In the video I build a greenfield project and try my best to show off this agent fan out concept. I also threw in a few other tricks that I've seen my coworkers do:

- Let Devin write its own prompts (especially for creating child Devins). It's way better than us humans
- Do tons of things at once. You should be absolutely frying your attention span. Your job should just be babysitting 38 different Devins
- Don't be a blocker. Before letting the agent work I make sure to tell it to ask me any questions that would fill in ambiguities. Give your agent all the information it needs (and then some more) so that it can just cook without stopping to ask you questions every few minutes
- Let Devin test itself. Integration sanity tests are pretty much solved

Hope this is useful!!
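For anyone who wants to see the shape of the fan out idea in code, here is a minimal Python sketch. It is only an illustration: `split_task` and `run_child_agent` are hypothetical stand-ins (the post does not show any Devin API, and none is assumed here), and threads stand in for the separate VMs each child would really get.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class SubtaskResult:
    subtask: str
    output: str


def split_task(task: str, pieces: int) -> list[str]:
    """Hypothetical planner: in practice the parent agent writes these prompts."""
    return [f"{task} (part {i + 1} of {pieces})" for i in range(pieces)]


def run_child_agent(subtask: str) -> SubtaskResult:
    """Hypothetical stand-in for one child agent working on a small, precise task."""
    return SubtaskResult(subtask=subtask, output=f"done: {subtask}")


def fan_out(task: str, pieces: int) -> list[SubtaskResult]:
    subtasks = split_task(task, pieces)
    # Fan out: every subtask runs independently, each with its own small context.
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(run_child_agent, subtasks))


def combine(results: list[SubtaskResult]) -> str:
    # Fan in: the parent pulls the child results back into one report.
    return "\n".join(r.output for r in results)


if __name__ == "__main__":
    results = fan_out("Migrate the app from React Native to Swift", pieces=6)
    print(combine(results))
```

The point of the structure is that the parent only ever holds the task list and the final outputs, so its own context stays small while the children do the heavy lifting.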

Jared Zoneraich

5,717 subscribers

113,212 views • 11 days ago • via X (Twitter)

Education, Science & Technology

Related videos

Devin can now manage a team of Devins. Devin will break down large tasks and delegate them to parallel Devins that each run in their own VM. Over time, Devin gets better at breaking down and managing tasks for your codebase. Available now for all users.

Cognition

109,556 views • 3 months ago

Devin can manage a team of Devins. Managed Devins run in parallel and help break down complex tasks. Each managed session is a full Devin, with its own VM, terminal, browser, and testing infrastructure. The main session coordinates, monitors, and compiles results.

Cognition

31,502 views • 2 months ago

We asked Walden (Co-Founder of Cognition) about building a truly AI-native company. “At a minimum, we can’t be hiring people whose whole aspiration is to write code that Devin will do in 1–2 years.” "A lot of companies at our stage, they have like an internal tools team to maintain all the different services that engineers internally use." "And we can just staff that team with Devins, and then basically have engineers just sending requests to those Devins for how to do that work."

TBPN

15,304 views • 1 year ago

How I get shit done, Episode 001 I've set up a playbook called ‘land’, which is triggered automatically when I drag an issue into the merging column in Linear. That reliably runs CI and merges any green PRs. This has allowed me to ship way faster than before. I think the key takeaway here is you can try to build your own code factory and your own agent orchestration layer, but it is a huge amount of work. The truth is there are entire companies with massive funding that are already tackling this and it's just easier to use their platform. I think this is a lot like if you were a carpenter: you could build your own generator, fuel it, wire it up, and then build a plug and then you could plug your saw into it. Or you could just plug your saw into the wall. Because the electricity company has already done all the work in the infrastructure and investment to make that plug work. I think more of us who are building companies should just be plugging into the wall instead of trying to build all this tooling ourselves. As a dev it's so tempting to build your own dev tools but I think a lot of times, even though you can build fast with agents now, it's a complete waste of time. It probably sounds like I'm being paid by Devin or something but I have zero financial interest here. They don't give me credits. I'm not an investor. I'm not being paid. I just think the tooling is really damn good. If you used Devin a long time ago and wrote it off, you really should have another look - for $500/month it's pretty obscene what you can get done.
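The core rule of the 'land' playbook (merge whatever is green, leave the rest) can be sketched in a few lines of Python. This is a toy illustration only: the `PullRequest` type and the `"green"`/`"red"`/`"pending"` status labels are assumptions made for the example, and no real Linear, CI, or Devin API is called.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    number: int
    ci_status: str  # hypothetical labels: "green", "red", or "pending"


def land(pull_requests: list[PullRequest]) -> tuple[list[int], list[int]]:
    """Return (merged, skipped) PR numbers; only PRs with green CI are merged."""
    merged: list[int] = []
    skipped: list[int] = []
    for pr in pull_requests:
        if pr.ci_status == "green":
            merged.append(pr.number)  # a real playbook would trigger the merge here
        else:
            skipped.append(pr.number)  # red or pending PRs are left untouched
    return merged, skipped


if __name__ == "__main__":
    queue = [PullRequest(101, "green"), PullRequest(102, "red"), PullRequest(103, "green")]
    print(land(queue))
```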

Ryan Carson

14,038 views • 3 months ago

Airtable's Howie Liu says that basically everyone will need to graduate from being ICs to ICs that manage teams of 20-30 agents: "The best developers today don't just sit there in front of their IDEs and synchronously talk to their agent." "[Instead], you have like 30 separate branches that are each being worked on by a different agent. And you can have the agents continue to update the branches based on human and other agent feedback." "And I think this whole idea of it taking hours for that entire loop to complete — agent pushes some changes, the changes get feedback from other agents or humans, the agent responds to that — that whole loop could be hours, not just minutes. So you're not going to just sit there and watch it one at a time." "But the powerful thing about this is, each one is still actually operating faster than a human engineer. One agent on one branch can do the work of maybe three humans, operating 3x as fast. So it's like a 10x leverage factor just for one agent." "But the best engineers are now able to multitask and say, 'I'm going to oversee my own little team of 20-30 agents working concurrently.'" "Everyone needs to graduate from being an IC to an IC manager of agents. Meaning, if you're a VC analyst, your job should no longer be to go synchronously research one company. You need to go and research like 30 companies, and do them all faster, better, and higher quality than you could before." "That's the greatest leap that is going to be challenging for a lot of people in a lot of roles. Because it's a totally different mentality in how you operate, and what your role is."

TBPN

35,595 views • 2 months ago

Learn Devin in 5 minutes

Cloud agent. Terminal agent. Linear assignee. Slack teammate. On schedule. Devin is everywhere your engineering team already lives.

00:00 - Introduction
00:19 - Sending sessions to Slack
00:24 - Creating your first session
00:51 - Devin Review
01:20 - Computer Use testing
01:46 - Desktop and IDE in Devin
02:33 - Scheduled Devins
03:01 - Devin for Terminal
03:40 - Terminal to cloud handoff
04:09 - DeepWiki
04:21 - Assigning Devins from Linear

If you got this far, I'm giving away $1,000 in Devin credits, $200 to 5 people for a free month of Devin Max. Comment what you'd like to build and retweet this post to be eligible to win!

nader dabit

34,719 views • 1 month ago

Cognition has signed a definitive agreement to acquire Windsurf. The acquisition includes Windsurf’s IP, product, trademark and brand, and strong business. Above all, it includes Windsurf’s world-class people, whom we’re privileged to welcome to our team. We are also honoring their talent and hard work in building Windsurf into the great business it is today. This transaction is structured so that 100% of Windsurf employees will participate financially. They will also have all vesting cliffs waived and will receive fully accelerated vesting for their work to date. At Cognition we have focused on developing robust and secure autonomous agents, while Windsurf has pioneered the agentic IDE. Devin + Windsurf are a powerful combination for the developers we serve. Working side by side, we’ll soon enable you to plan tasks in an IDE powered by Devin’s codebase understanding, delegate chunks of work to multiple Devins in parallel, complete the highest-leverage parts yourself with the help of autocomplete, and stitch it all back together in the same IDE. Cognition and Windsurf are united behind a shared vision for the future of software engineering, and there’s never been a better time to build. Welcome to our new colleagues from Windsurf!

Cognition

3,047,761 views • 11 months ago

this video is the CLEAREST explanation of how claude skills + AI agents work and how to use them

most people set up an AI agent and wonder why it keeps disappointing them.

the context window is everything

context is what the model assembles before it takes any action. think of it like everything the agent needs to read before it does anything. the quality of what goes in determines the quality of what comes out.

the models are genuinely really good right now. claude and gpt are exceptional. the variable is almost always the context you give them.

1. agent.md files are mostly unnecessary

every single line you put in an agent.md file gets added to every single conversation you have with your agent. a 1000 line file is around 7000 tokens burning on every run. the model already knows to use react. it can read your codebase. save the agent.md for proprietary information specific to your company that the model genuinely cannot know on its own.

2. skills are the actual unlock

a skill.md file works differently. what loads into context is only the name and description, around 50 tokens. the full instructions only appear when the agent recognizes it needs that skill. so instead of 7000 tokens on every run you have 50. and the agent stays sharp because the context window stays lean. the closer you get to filling the context window the worse the agent performs, same way you perform worse when someone dumps 10 things on you at once.

3. here is how to actually build a skill the right way

most people identify a workflow and immediately try to write the skill. what you want to do instead is run the workflow by hand with the agent first. walk it through every single step. tell it what to check, what good looks like, what bad looks like. correct it in real time. once you have had a full successful run from start to finish, tell the agent to review everything it just did and write the skill itself. it writes a better skill than you will because it has the full context of what actually worked in practice not in theory.

4. recursively building skills is how you go from frustrated to reliable

when the skill breaks, and it will break, ask the agent exactly why it failed. it will tell you specifically what went wrong. fix it together in that same conversation. then tell it to update the skill file so that failure mode never happens again. ross mike did this five times with his youtube report generator. it now pulls from eight different data sources and runs flawlessly every single time without him touching it.

5. sub agents are something you earn not something you set up on day one

start with one agent. build one workflow. turn it into one skill. once that works add another. ross mike has five sub agents now covering marketing, business, personal and more. it took months to get there and every single one exists because a workflow proved it deserved to exist. the people who set up 15 sub agents on day one and wonder why nothing works skipped all the steps that make the thing actually run.

6. your workflow is the thing the model cannot get anywhere else

the model has been trained on everything. it knows more than you about most things. what it does not have is your specific process, your taste, your way of doing things. that is what skills capture. that is what makes your agent actually useful versus a generic one. downloading someone else's skill means downloading their context onto your setup and it will not work the way you want it to because it was never built around how you work.

this is the clearest explanation of how agents actually work i have heard. Micky runs this stuff every single day and the results show it.

full episode is now live on The Startup Ideas Podcast (SIP) 🧃 where you get your pods

people charge for this sorta stuff i give away the sauce for free i just want you to win

watch
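a rough python sketch of the difference between an always-loaded file and an on-demand skill, using the post's own figures (about 7000 tokens for the big file, about 50 for a skill summary). everything else is an assumption made for illustration: the `Skill` type, the name-matching trigger in `build_context` (in practice the agent decides when a skill applies), and the run counts in the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    description: str
    instructions: str  # full body, only added to context when triggered


def build_context(request: str, skills: list[Skill]) -> str:
    """Always include a one-line summary per skill; include the full
    instructions only for skills the request mentions by name
    (a hypothetical trigger rule, used here to keep the sketch simple)."""
    parts = [f"{s.name}: {s.description}" for s in skills]
    for s in skills:
        if s.name.lower() in request.lower():
            parts.append(s.instructions)
    parts.append(request)
    return "\n".join(parts)


def total_tokens(per_run: int, runs: int, on_demand: int = 0, triggered_runs: int = 0) -> int:
    """Tokens spent over many runs: a fixed cost every run plus an on-demand cost."""
    return per_run * runs + on_demand * triggered_runs


if __name__ == "__main__":
    skills = [Skill("report", "builds the weekly report", "STEP 1 ...\nSTEP 2 ...")]
    print(build_context("fix the login bug", skills))
    # 100 runs with a 7000-token file loaded every time
    print(total_tokens(7000, 100))
    # 100 runs with a 50-token summary, full 7000-token skill loaded on 10 of them
    print(total_tokens(50, 100, on_demand=7000, triggered_runs=10))
```

under those assumed run counts the always-loaded file costs 700,000 tokens and the skill costs 75,000, which is the gap the post is pointing at.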

GREG ISENBERG

192,408 views • 2 months ago

I shot this video last week to onboard a new set of alpha customers to Notion Custom Agents. Figured I should just put it out here so you can follow along the latest and greatest. It's so cool to see all the things people are building with it.

You can see some of my own use cases here:

(0:00) Intro to custom agents

Starting with personal use cases
(1:00) Building a mail triage agent
(5:00) Creating a prep doc for all my daily meetings
(6:45) Generating a daily report that helps me catch up with multiple slack channels

Switching to company use cases
(8:30) Agent that automatically answers questions in a slack channel and files tasks when necessary
(10:30) Agent that auto-routes product feedback to the right team channel and task db. And it does this with routing rules, that it wrote itself.
(12:40) Build your own agent. Let your imaginations run wild!

Send us feedback! The product is rolling out quickly, and everyone should have access in the next couple weeks.

Akshay Kothari

74,684 views • 5 months ago

The subject of 'owning a slave' is dense. It is something we hear a lot when we are in the FemDom Realm. Is it just fantasy? Can it actually be a lifestyle? How do we navigate this type of dynamic? How do we even get to that level of D/s? In this short clip [Excerpt from SLAVE TRAINING Part 2] I want to already bring to your attention one thing that will define if your desire for a slave (or desire as a slave) is touching more on a fantasy or... how can you actually navigate this in a realistic way. No one person 'can do it all' or should be expected to. If you want your slave to be 'the best', assign them a specific role in which they can excel... and then build upon that. Once they 'master' your housekeeping (which takes quite a bit of real training), they can move to other levels. And an important note I want to leave here... make them EARN access to certain things in your life that sometimes you just want to delegate because you don't want to manage or don't know how to manage. Entrusting them with serious tasks that can affect your life, your business, your reputation, are on top of the ladder. Are they even qualified for the thing you want them to take off your shoulders? Start small and allow them to grow in their submission, to develop their skills and to learn how to best satisfy you without setting them up for failure by expecting too much, too quick. In the end, if you want this to truly work, you have to approach it from a place that transcends the roles. As this is consensual power exchange. And you both want to be fulfilled in that relationship.

Ms. Malissia

12,410 views • 4 months ago

The #1 problem with coding agents right now: Ask them to solve one problem, and they will make 10 other changes you didn't want.

This happens to me every day. It happens to everyone I talk to as well.

We have a solution for this now. The team at Augment Code released a "Task List" feature for their coding assistant that solves this problem.

Augment Code is partnering with me on this post. In case you haven't used them before:

• Augment Code is a fully-fledged coding assistant
• Their specialty is large projects
• Fastest coding indexing I've seen
• Has a free forever community edition

Now, you can ask their coding agent to generate a Task List before doing anything. This will give you a plan you can review, edit, and augment if you need to. You can export this plan, load it on a different session, or even share it across projects.

It makes a huge difference: The task list constrains the agent so you won't get any "unintended" changes anymore. It also puts you in control of everything the agent does.

Check the video to see the agent working through a task list.

You can also try this 100% free:

(By the way, they also have support for remote agents. You can basically have those agents write your code while you are sleeping.)

Santiago

41,738 views • 11 months ago

"Every team wants to win a championship, but not every team wants to do the things required for a championship. And here's the thing: it's easy to be an average team. It doesn't require a lot. It's less adversity to be average in the world. The consequences of being average aren't easy. We end up wearing them. There's strain and struggle that comes with that too. The standard is just lower to be an average team. To be a championship team, to be champion, to be a championship team member here . . . I'm not gonna lie to you . . . I'm going to tell you the truth. It is harder. It is. The question is: Is it worth it? Some people say, "Oh it's not harder work." Yes it is. It's harder work. You can pursue comfort or you can pursue excellence. If we pursue comfort, we gotta give up some excellence. But if we pursue excellence, then we're just going to face more adversity. Everyone who's ever accomplished something excellence has had to overcome it. We are here today for a reason. Two reasons actually. Reason #1 is let's make sure that we identify and realize the opportunities that are in front of us. Reason #2 is let's make sure that we are preparing for the adversity that those opportunities require. And just understand: every single time you lever up your opportunities and you identify, "Oh there's something more I can do, more I can achieve. I can get better. I can earn more. I can do this." It's going to be matched with the adversity that comes with it. I want to make sure we are prepared for both of those, so that we're not chasing big opportunities and then getting mad when things start getting harder along the way. Is that fair? Does that make sense?"

Brian Kight

125,726 views • 2 years ago

Bash is all you need! Which is why I'm introducing my holiday project: just-bash
just-bash is a pretty complete implementation of bash in TypeScript designed to be used as a bash tool by AI agents. Because it turns out agents love exploring data via shell scripts, even beyond coding. It comes with grep, sed, awk and the 99th percentile features that an agent like Claude Code or Cursor would use. In fact, Claude Code can use it for secure bash execution.
In the package
- A bash-tool for AI SDK
- A binary for use by yourself or your coding agents
- An overlay filesystem to feed files to your agent securely
- A Vercel Sandbox compatible API, so you can quickly upgrade to a real VM if you need to run binaries
- An example AI agent that explores the just-bash code base using just-bash
- I imported the Oils shell bash compatibility suite and just-bash passes a very good chunk
What is interesting about this codebase: It was essentially entirely written by Opus 4.5. Coding agents love bash and they are good at reproducing it. They are also great at text-book recursive descent parsers and AST tree-walk interpreters. That said, it is, like, a lot of code and I didn't read it all 😅.
This is very much a hack, but it also seems to be _really_ useful. I haven't really found anything agents want to use that it doesn't support and it's fast and secure (caveats apply). It doesn't have write access to your computer and the filesystem is given a root that the agent cannot escape from.
Find it at
Related: Our recent blog post on how we migrated our data analysis agent to bash tools and achieved incredible quality improvements
The video shows the example agent investigating the just-bash code base

Malte Ubl

124,713 views • 6 months ago
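
The just-bash post above mentions a filesystem "given a root that the agent cannot escape from". The sketch below illustrates that general idea only. It is not just-bash's code or API (just-bash is written in TypeScript); `resolve_in_root` is a hypothetical helper, and the check is purely lexical, so it assumes a virtual filesystem without symlinks.

```python
import posixpath


def resolve_in_root(root: str, user_path: str) -> str:
    """Map an agent-supplied path to a path under `root`, or raise ValueError.

    Hypothetical helper for illustration; it works on path strings only and
    does not account for symlinks on a real disk.
    """
    root = posixpath.normpath(root)
    # Treat every agent path as relative to the virtual root, then
    # collapse "." and ".." segments.
    candidate = posixpath.normpath(posixpath.join(root, user_path.lstrip("/")))
    # After normalization the result must be the root itself or sit below it.
    if candidate != root and not candidate.startswith(root.rstrip("/") + "/"):
        raise ValueError(f"path escapes sandbox root: {user_path!r}")
    return candidate
```

With this helper, an absolute path such as `/etc/passwd` is reinterpreted inside the root, while a path that climbs out with `..` is rejected.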

My first move after joining Cognition - using Devin to update my website! My repos are indexed with Devin. I ask for a change, the Devin coding agent knows which repo to update, creates the PR, and sends me a link to review. This is especially cool with teams.

nader dabit

18,914 views • 4 months ago

How many AI agents work at your company? We now have over 3,258 agents working alongside 1,300 humans. The crazy part is these agents were created by EVERY EMPLOYEE at our company... sales reps, marketers, customer support, product, eng. Literally EVERYONE. BUT I'm most surprised by the adoption and value that MANAGERS are getting from agents. I used to think that every IC would become a manager of agents. Now I think that managers will very likely manage WAY more agents than their ICs combined. And managers' agents will manage their ICs' agents - overseeing them for human-in-the-loop interactions. When creating agents, we use 100% context from all of your activity, files edited, tasks and projects worked on, hierarchy, skills, and role information. We build a user-based context model to make agents as relatable as possible to the specific human that we're building for. This means they truly understand the nuances of the work and what "great" looks like - because great is very much in the eye of the beholder. Great is, by definition, subjective. This is also why the human ENGAGEMENT loops are SO vital to agent value. The iteration AFTER the agent is onboarded is where the MAGIC happens. This is just like a manager managing an IC in real life... you're giving feedback. In this case, though, agents learn INSTANTLY, and they retain the knowledge perfectly and indefinitely. Even though I've been pushing AI for years now to everyone in our company, this was the first time we had truly end-to-end AI adoption and retention. This kind of AI adoption is wild. But the value we're realizing is truly INSANE. Super Agents outnumber our humans nearly 3 to 1. What if you could 3X your workforce overnight? Watch this video to see how 👇

Zeb Evans

425,244 views • 5 months ago

This is the future of web design. Gamma 3.0 has just been released, and I used it to create a complete website from a URL in seconds. No prompts. No code. No input. Their new AI agent will design, review, fix, and iterate on your content. The best part: you can watch it in real-time as it builds your website! This is pretty amazing! There are many AI design agents out there, but this one is one of the most hands-off tools I've seen. This is incredible.

Santiago

39,140 views • 9 months ago

9 out of 10 multi-agent projects never leave demo mode
Not because the model is bad. Because the structure is missing.
Most people who try to build [ a team of AI agents ] end up with one agent talking to itself in five tabs.
The agents don't share context → Don't divide work → Don't know what the others are doing.
> stage 1: if your agent doesn't have a real loop, observe, act, iterate, you have a long prompt, not an agent
> stage 2: subagents need isolated context, the orchestrator never reads their raw transcript, only the summary
> stage 3: the orchestrator plans and delegates, the moment it executes, it drowns in details that belong inside subagents
> stage 4: without a shared task list it's not a team, it's five agents duplicating each other's work in parallel
> stage 5: a permissions file is what lets you sleep, the model cannot bypass it because the rule lives outside the model
a team of AI agents is not more model, it is more structure
the most ignored stage in every demo that died:
- durability... when a 50-step task crashes at step 47 and starts from zero, that's not a model failure. that's a missing write-to-disk call

Shadow Nick

14,501 views • 1 month ago
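
The durability point in the post above (a 50-step task that crashes at step 47 should not restart from zero) can be sketched as a checkpoint written to disk after every step. This is a minimal illustration under assumptions, not anything taken from the post: `run_with_checkpoints`, the JSON file layout, and the idea that each step is a zero-argument callable returning a string are all hypothetical.

```python
import json
import os
from typing import Callable, Dict, List


def run_with_checkpoints(steps: List[Callable[[], str]], path: str) -> Dict[str, str]:
    """Run steps in order, persisting each result so a rerun skips finished steps.

    Hypothetical helper: results are keyed by step index in a JSON file at `path`.
    """
    done: Dict[str, str] = {}
    if os.path.exists(path):
        # Resume from whatever an earlier (possibly crashed) run completed.
        with open(path, "r", encoding="utf-8") as f:
            done = json.load(f)

    for i, step in enumerate(steps):
        key = str(i)
        if key in done:
            continue  # already finished in an earlier run
        done[key] = step()
        # Write to a temp file and rename it, so a crash mid-write
        # cannot leave a half-written checkpoint behind.
        tmp = path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(done, f)
        os.replace(tmp, path)
    return done
```

If a step raises, the exception propagates, but every earlier result is already on disk; calling the function again with the same `path` picks up at the failed step.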

Brian still spends over two hours a day on recruiting and personally hires the top 200 people at Airbnb. I loved this idea of being in the flow of talent to find the best people: "Don't do searches. Build pipelines. I try to map out all the best people in the Valley. So let's say I need to hire really good engineers. I don't do searches. I just informationally meet the best engineers in the world. Every meeting, the job is to get the next meeting, meet someone else. The mistake people make when they hire: they go, 'I need to hire a blank.' So they hire a search firm. They give you 50 profiles, and you pick the best one. That is the wrong way to do it. The best way to do it is pipeline recruiting. You're constantly recruiting, you're constantly meeting people in advance of searches. And all of it is referral based. There are two ways to find out if people are good. The first is to start with the results and work backwards to the people. Find an ad you like and figure out who made that ad. Start with the results. Work backwards to people. Don't start with the resume. The other thing to do is just keep asking people to build your Rolodex. The moment I find somebody that's really good, I ask them who all the best people they know are. And I build these little mafias and they tell you who the other good people are. I am the co-hiring manager for the top 200 people in the company. This is very radical. A lot of CEOs think it's their job to hire their executive team, and their executive team hires their team. I think that is fatal. You always want to be marrying up, hiring people of the future. It should be like we're reaching. If you can hire them without my help, we're not reaching far enough. You want to hire the very best person you can."

Patrick OShaughnessy

316,632 views • 1 month ago

We've open sourced my favorite Devin feature: /handoff
Hand off jobs to cloud Devins from your local machine
Install it as a plugin in Claude Code or Codex or any other coding agent
Close your laptop without pausing your agents 😉

Jared Zoneraich

135,913 views • 19 days ago

The same kinds of productivity gains we've seen in coding with AI agents are heading to the rest of knowledge work. This is the jump when you go from having a chatbot to being able to actually have an agent go off and do work for minutes or even hours and come back with a complete work output that you then review. Here's an example of the new Box Agent filling out an RFP response from an existing knowledge base. This process would normally take hours, and requires the full attention of the user doing the work. Now, you provide the Box Agent with the RFP questions, and it will go off, make a plan, extract all the relevant questions, read through existing source material to come up with an answer, and then generate a new Word document as the final output. All while you're doing something else. The key to this architecture is that the agent is able to use all of the same tools in the background that a user uses to get work done. The agent can search for documents, read entire files, run scripts and tools in the background, and even write code on the fly to automate tasks it hasn't seen before. And best of all, the Box Agent will (soon) work from the Box MCP and CLI so you can invoke it in any agentic system as a step in a process. This kind of agent complexity would have been impossible even 6 months ago. Models consistently failed at tracking long-running tasks or using the right tools at the right moment for the task. But this is all now possible because of models like GPT-5.4, Opus 4.6, and Gemini 3, and is only getting better by the month. Just as we moved from engineers writing code and using AI as an assistant to answer questions, in many areas of knowledge work (like legal, finance, consulting, sales, marketing, and more) when we have a problem we'll just kick off the AI agent to go work on it for us in the background.

Aaron Levie

24,618 views • 2 months ago