GitHub has a front-row seat to how code is changing now that everyone—and their army of agents—can ship code. In March alone, agents created 17 million pull requests on the platform. That’s why I was thrilled Mike Taylor was on hand to interview GitHub COO Kyle Daigle at Microsoft...

13,886 views • 8 days ago • via X (Twitter)

Similar Videos

In the future, you’ll be able to accomplish a goal by just giving Claude an outcome and a budget. That’s the direction Anthropic is building in with its new Managed Agents features, announced at this week’s Code with Claude developer event. The basic idea: Claude, wrapped in a computer in the cloud, that you can spin up, scale, and manage as needed. Anthropic is taking on the infrastructure that kills most agent products, and making sure that it scales to meet the needs of agents running 24/7. On this week’s AI & I from Every 📧, I talk with Angela Jiang (Angela Jiang), head of product for the Claude platform, and Katelyn Lesse (Katelyn Lesse), head of engineering for the Claude platform, about what Anthropic is building and what it takes to make agents reliable in production. We get into: - Why the "build a generic harness, hot-swap any model behind it" playbook is already outdated. Angela points to eval data on Memory where the same task across different harnesses performed drastically differently. - The infrastructure wall every team hits in production—and why Katelyn thinks “my sandbox died and took the agent with it” is the real reason internal agents don't ship. - Why Anthropic is so bullish on using file systems and skills within Claude, including Angela's argument that those early design choices can compound for years. This is a must-watch for anyone trying to take an agent past the demo and into production. Watch below! 
Timestamps: How the Claude platform evolved from API to agents: 00:01:48 The primitives that make up Claude Managed Agents: 00:04:09 Why the harness and the model are becoming a single unit: 00:10:37 The infrastructure wall that kills most agent projects in production: 00:18:49 Why team agents need a different shape than individual productivity tools: 00:24:49 How Anthropic's legal team uses an agent to review marketing copy: 00:26:36 Using multi-agent orchestration for advisor strategies, adversarial pairs, and swarms: 00:34:24 How to measure agent success with outcome and budget as the end state: 00:35:50 What the platform looks like a year from now, when Claude writes its own harness: 00:39:11

Dan Shipper 📧

66,339 views • 1 month ago

SaaS isn’t dead, it just needs to become agent-native. Linear (Linear) is a great example of how: They pivoted the product to be used by both humans and agents, and that has made them one of the premier software tools in the agent-native era. I had Linear’s cofounder and CEO Karri Saarinen on Every 📧's AI & I to talk about how a product management tool for human software developers became an agent-native tool—and how Linear’s trajectory reveals a bright future for SaaS businesses: - Speed means decisions matter more, not less. AI makes it easy to have an idea and build it without considering whether its existence is justified. When ChatGPT was released, SaaS companies were launching their own chatbots left, right, and center. Instead of jumping on the bandwagon, Linear stopped to consider whether the application was useful. (It wasn’t.) - Just because the technology has changed doesn’t mean your mission should. Karri attributes Linear’s success to never losing sight of what matters: helping teams develop great software. Instead of chasing trends, Linear focused on understanding how AI was impacting its customers’ workflows—and updating its product accordingly. - Agents are now first-class users. Linear never tried to change what it was or did well; it just expanded the user base. Companies can now kick off agents inside Linear, manage them, and track what they're working on alongside the humans on the team, which explains why Codex, Coinbase, and Brex all run their agents on Linear. This is a must watch for anyone interested in how an agent-native SaaS company operates. Watch below! 
Timestamps: Introduction and how Every first discovered Linear: 00:00:39 Why Linear waited to ship AI features instead of rushing to chatbots: 00:02:00 Linear's agent platform and becoming the system that guides AI agents: 00:05:06 Why "SaaS is dead" is a simplistic narrative: 00:07:42 How Linear adopted AI coding tools internally: 00:12:18 AI's impact on product building workflows—speed versus thoughtfulness: 00:17:45 The value of conceptual work and thinking before shipping: 00:22:18 How AI is reshaping Linear's product strategy: 00:29:30 Demo: Linear's agent skills, shared context, and code review workflow: 00:37:18 The future of product development and the enduring role of human judgment: 00:47:48

Dan Shipper 📧

36,359 views • 2 months ago

Claude Code cracked something open for us at Every 📧. Now I ship to codebases I barely know, every feature we ship makes the next one easier, and non-technical members of the team use the terminal. I’m genuinely grateful. So I brought its creators, Cat Wu and Boris Cherny from Anthropic, on AI & I to say thank you, and to talk about everything they’ve learned from building Claude Code. We get into: • The workflows Anthropic’s smartest engineers use to push Claude Code to its limits. Why they pit subagents against each other to get cleaner results, how they turn past code into leverage, and the slash commands and MCPs they rely on most. • The product lessons behind one of the most loved AI agents in the world. How the team balances simplicity and power (building a tool that anyone can use, but that experts can bend to their will) and their philosophy of “unshipping,” or cutting back whenever there’s a simpler, more intuitive path to user intent. • A peek into the future of coding with AI. The new form factors they’re experimenting with to make Claude Code more autonomous, more reliable, and more accessible to non-technical users. This is a must-watch for anyone, technical or non-technical, who wants to learn how to use Claude Code like the people who built it. Watch below!
Timestamps: Introduction: 00:01:26 Claude Code’s origin story: 00:02:25 How Anthropic dogfoods Claude Code: 00:07:03 Boris and Cat’s favorite slash commands: 00:14:06 How Boris uses Claude Code to plan feature development: 00:15:49 Everything Anthropic has learned about using sub-agents well: 00:21:53 Use Claude Code to turn past code into leverage: 00:26:16 The product decisions for building an agent that’s simple and powerful: 00:33:14 Making Claude Code accessible to the non-technical user: 00:36:38 The next form factor for coding with AI: 00:45:12

Dan Shipper 📧

57,540 views • 7 months ago

Agents who can buy, sell, and trade on our behalf are becoming a major part of the economy. But what exactly are they doing? Stripe sees 2% of global GDP, so they’re the company with the best view of what’s going on in the earliest innings of the agent economy. That’s why I had Emily Glassberg Sands, who leads data and AI at Stripe, on Every 📧’s AI & I. We covered: - Most of us still don’t trust AI with larger online purchases. People are hesitant to let AI make expensive purchases like a vacation or a couch, just like the early days of online shopping. But a superhero outfit for a kid who needs one stat? Sure, let the agent handle it. - Fraud is moving up the stack. It used to mean stolen credit cards. Now attackers are stealing free-trial tokens and compute credits. Free-trial abuse has 4x-ed in the last six months. - AI is on both sides of fraud. Fraudsters are using it to scale attacks, while Stripe is using it to detect them. They’re blocking 250,000 fraudulent free trials a week for one large customer. - AI companies are growing faster than any cohort Stripe has ever tracked. Top companies hit $30M ARR in 18 months, 3x faster than the 2018 SaaS class. So far, it’s net new spend instead of cannibalized software budgets. If you want to understand how AI is reshaping online commerce, this one deserves your time.
Timestamps: Introduction: 00:00:45 New rules for an agent-driven economy: 00:01:27 Compute theft is the new payment fraud: 00:03:57 How Stripe expanded fraud detection from checkout to the full customer lifecycle: 00:10:00 Why AI companies are scaling way faster than top SaaS companies: 00:19:48 Outcome-based billing is replacing seat-based pricing: 00:23:27 Where AI spending is coming from: 00:29:57 How the developer experience changes when agents are the builders: 00:36:45 The agentic commerce spectrum, from assisted buying to autonomous purchasing: 00:41:00 Meet Link, a consumer wallet for delegated agent purchases: 00:51:06

Dan Shipper 📧

19,362 views • 1 month ago

We built an AI app that had 1,000 DAU and $2k MRR before it launched. It’s called Monologue and it’s a smart dictation app built by a single developer: Naveen Naidu. We just launched Monologue yesterday, and it’s one of the fastest-growing and stickiest AI apps that Every 📧 has ever built. Naveen and Monologue are compelling because he’s competing against companies that have raised $50m or more. Because of AI he was able to build an extremely polished, delightful app by himself in just a few months. I brought Naveen on to AI & I along with Every 📧 COO Brandon Gell to talk about his journey with Monologue. We get into: - Why shipping fast is the only thing that matters in AI: Monologue might look like an overnight success, but it wasn’t Naveen’s first, second, or even third app. Over time, he built a muscle to get quality apps out the door, iterate on them, and learn from what he was seeing. - How he got to PMF inside of Every: The mistake Naveen regrets most in his entrepreneurial journey is building in the dark. Inside of Every 📧 he has an environment where feedback is plentiful, and it let him iterate extremely quickly. - His stack for building production-grade AI apps: Naveen breaks down how he used tools like OpenAI’s Codex to do the work of a whole engineering team, including solving hard technical problems like Mac hotkey handling. This is a must-watch for anyone who wants to see how far a single developer and some AI tools can really go. Watch below!
Timestamps: Introduction: 00:01:27 A live demo of Monologue: 00:03:51 Hard lessons from Naveen’s years in the wilderness: 00:06:27 Building a muscle to ship fast: 00:12:29 The spark that became Monologue: 00:21:11 Dogfooding your way to a killer feature: 00:26:09 Why the harshest product feedback is the most valuable: 00:29:45 Every’s strategy for launching an app in a crowded space: 00:31:47 Giving Monologue the Every “smell”: 00:40:08 Naveen’s one-person AI stack to build beautiful apps: 00:45:09

Dan Shipper 📧

23,644 views • 9 months ago

OpenAI’s hottest app isn’t ChatGPT—it’s Codex. In the last few weeks alone, the Codex team shipped a desktop app, GPT-5.3 Codex (a new flagship model), and Spark, the fastest coding model I’ve ever used. Usage has grown fivefold since January and over a million people now use Codex weekly. Codex was also the app that OpenAI chose to run an ad for in the Super Bowl. I talked to Thibault (Tibo), head of Codex, and Andrew (Andrew Ambrosino), a member of technical staff who built the Codex app, for Every 📧’s AI & I about what OpenAI is building and how they’re using it internally. We get into: - Why they built a GUI instead of a terminal. Terminals work for quick tasks, they say, but feel limiting when you’re running multiple agents in parallel. The IDE, meanwhile, overwhelms users—and the Codex team wants the AI to dynamically decide which tools to show you for a given task. - How they’re teaching the model to read between the lines. Codex is great at following instructions, but optimize too hard in that direction, and it starts taking you literally—like copying a typo directly into the code. The team obsesses over this tradeoff, and is also introducing “personalities,” modes users can toggle between that control how blunt or supportive the model feels. - How OpenAI uses its own coding agent. Codex lets you schedule prompts to run on a recurring basis, and the team has dozens of automations running at all times. For example, one scans for merge conflicts every couple of hours so code is always ready to ship, and another picks a random file from the codebase multiple times a day and hunts for bugs no one would've gone looking for. - Why speed is a dimension of intelligence. OpenAI’s newest model (Spark) is so fast that they actually slow it down so you can read the output. 
They see the speed enabling three things: staying super in the flow, replacing brittle developer tools with intelligent ones that can adapt on the fly, and redirecting the model mid-task (especially with voice) so coding starts to feel more and more like a conversation. - Code review is the next bottleneck. Models can generate code faster than ever, but someone still has to verify that it works. The team is exploring a future where the model proves its own fix works: retracing the click path a user would take, screenshotting the results, and attaching the evidence to a pull request. This is a must-watch for anyone who uses AI coding agents and is curious about the future of programming. Watch below!
Timestamps: Introduction: 00:01:27 OpenAI’s evolving bet on its coding agent: 00:05:27 The choice to invest in a GUI (over a terminal): 00:09:42 The AI workflows that the Codex team relies on to ship: 00:20:38 Teaching Codex how to read between the lines: 00:26:45 Building affordances for a lightning-fast model: 00:28:45 Why speed is a dimension of intelligence: 00:33:15 Code review is the next bottleneck for coding agents: 00:36:30 How the Codex team positions against the competition: 00:41:24

Dan Shipper 📧

15,588 views • 4 months ago

Guillermo Rauch (Guillermo Rauch) is one of the most prolific coders of this generation. But he doesn’t think of himself as a coder anymore. Coding, he says, is a specific skill that AI is becoming great at. Instead, he thinks the future of coding is more holistic, full-stack engineers who can ideate, design, and execute all together. Guillermo is the founder and CEO of Vercel (Vercel), the creator of NextJS, and SocketIO. We spent an hour talking about the future of software development in an AI world—and the meta-skills that are essential for the coders of today to master—in order to use tomorrow’s tools to their fullest extent. Here are a few takeaways: - One of the most important keys to his success is taste—and developing taste is all about paying better attention to everything you experience day to day. - He’s great at recognizing bleeding-edge technologies with extremely practical applications but that have bad user experiences. If you can learn to recognize those and build with them, you might build the next NextJs or SocketIO. - Why prototype cultures are becoming common in AI—and the benefits of written cultures like Amazon vs. prototype cultures like Apple for different kinds of companies. - For developers building frameworks, always put the product first; a framework in isolation without a “customer zero” is never going to be a good tool. - The theory of “recursive founder mode”—if you want to build a scalable business, you have to scale yourself by creating an atmosphere that nurtures talent and ambition. - AI tools are shifting software toward consumption-based billing models, making us capital allocators who decide how much compute the AI consumes. - The future of AI is agents with the taste, knowledge, and tools to perform specialized tasks. Watch below! 
Timestamps: Introduction: 00:01:33 How to spot trends early: 00:03:18 Why you should be your own customer: 00:07:34 How to create an ecosystem of talent and ambition: 00:14:55 Why Guillermo doesn't identify as a coder: 00:17:29 AI is gearing us toward an allocation economy: 00:20:50 How Vercel’s copilot compares with other coding agents: 00:28:34 Guillermo’s advice on having better taste: 00:40:35 The future of AI agents is specialized: 00:42:46 How AI startups can compete with big tech: 00:47:50

Dan Shipper 📧

186,927 views • 1 year ago

Nat Eliason’s (Nat Eliason) career arc is borderline absurd—but it works. He’ll spot a new tool or trend, master it, build a business around it, and move on. Nat’s pulled it off with the note-taking wave ($600k in sales from a Roam Research course), real estate (6x return flipping property in Austin), and crypto (published his insider story with Random House). Now it’s AI: he’s running a viral course on building apps with AI—$200k in pre-sales in just a week, 800 students and counting. I’ve known Nat for a long time and I think he has a great sense for where the puck is headed. He was one of the first guests I had on the podcast and I was delighted to have him on again. Here are a few takeaways from our conversation: - Coding with AI has become orders of magnitude easier for non-technical people over the last 2 years—Nat rarely has to help students fix bugs; they troubleshoot in Cursor on their own. - AI coding assistants are creating new behaviours in programming, like using a speech-to-text model to talk to an agent and having it write code for you. - The traditional learning curve of coding is flattening because AI tools let beginners build and iterate in faster feedback loops. - AI has given Nat leverage in spades—it increases his ability to be a creator while also building a robust business with as few people to manage as possible. He demos an AI book editor he coded for his sci-fi novel. - In the age of AI, software is becoming content and the barriers to create are lower than ever—but custom software for everything isn’t the answer. Nat’s model is that personalized tools make sense for that one thing you care the most about. - Nat believes that the future of writing with AI is a Cursor-style interface with a model that’s trained on your style and voice. This episode is a must-watch for writers, creators, and anyone interested in the future of product building. Watch below! 
Timestamps: Introduction: 00:01:45 The origins of Nat’s viral course on building apps with AI: 00:11:45 How coding with AI has evolved over the last two years: 00:18:46 Nat creates an app using Composer, Cursor’s AI assistant: 00:22:22 Tactical tips for coding with Cursor: 00:26:06 How coding with AI is creating new behaviours in programming: 00:29:06 What excites Nat the most about the future of AI: 00:32:41 A demo of Hubbard, the AI editor Nat built for his science fiction writing: 00:38:58 When does it make sense to build custom software: 00:44:52 Nat’s take on the future of writing with AI: 00:49:18

Dan Shipper 📧

27,207 views • 1 year ago

Three months ago, Codex was trash for knowledge work. Now it's my daily driver. I use it for writing, recruiting, deep engineering work, and everything in between. It even keeps me at inbox 0. I chatted with Every 📧's head of growth Austin Tedesco on Every 📧's AI & I about what changed, and why he now spends 80% of his working time in the Codex desktop app too. We get into: - How Codex went from making Austin feel like an idiot to being the place he goes to get stuff done, including complex tasks like writing go-to-market plans using existing material from Slack, Notion, and meeting transcripts. - Why Codex’s desktop app, which is faster and more reliable than Claude Desktop/Cowork, is the real differentiator. - How I source candidates with Codex by having it identify career arcs, not keywords. My go-to move is identifying organizations likely to teach the skills Every needs for a role, and then finding candidates from that pool who have since gone on to work in AI. This is a must-watch for anyone who's wondering whether it’s finally time to give Codex a try. Watch below!
Timestamps: How Codex went from a tool for senior engineers to a daily driver for knowledge work: 00:00:57 How Claude Code proved that a great coding agent works for any knowledge work: 00:02:42 Austin's switch to Codex: 00:07:24 How Austin set up Codex with folders, keys, and reviewer agents: 00:13:48 Using Codex to brainstorm automations across Gmail, Slack, and Notion: 00:18:24 How Austin manages the human review step when Codex is drafting communications: 00:22:42 Using Codex to build specialized agents inspired by product executive Claire Vo: 00:28:54 Synthesizing meeting transcripts and Slack threads into a go-to-market plan: 00:31:09 Building a live KPI tracker in Notion that agents can read: 00:40:15 Using Codex for recruiting: 00:44:54

Dan Shipper 📧

55,221 views • 1 month ago

Noah Brier uses Claude Code as his second brain. It’s the coolest note-taking setup I’ve ever seen. He has Claude running on a server in his basement hooked up to a VPN. It stores, reads, and writes to thousands of notes in his Obsidian vault. He does it all from his phone. I had him on the show to tell us exactly how he’s pulling this off. We get into: - The nuts and bolts of the Claude Code-Obsidian setup: Noah set up Claude Code on top of his Obsidian root directory, and he walked me through how he uses it to prep for an upcoming speech: creating a project folder, pulling in relevant research from his notes, saving transcripts from chats with other LLMs, and generating daily progress updates. - The “thinking partner” that lives inside Noah’s second brain: Noah points out that in the hype around AI’s ability to write, the fact that it can read is overlooked. That’s why he has an agent inside Claude Code with strict guardrails to stay in “thinking mode.” It logs his questions, tracks insights, and catches him up on research if he returns to a project after a few days away. - How Noah does deep work on his phone: Noah rigged a home server in his basement, put his Obsidian vault in it, and then runs Claude Code on top. Noah says that being able to think, write, research, and ship code from his phone has fundamentally changed the way he works. This episode of Every 📧’s AI & I is a must-watch for anyone who wants to learn how to use Claude Code to build a true second brain. Watch below! 
Timestamps: Introduction: 00:01:19 How you can do deep work on your phone: 00:04:28 Why Noah thinks Grok has the best voice AI: 00:06:14 The nuts and bolts of Noah’s Claude Code-Obsidian setup: 00:11:39 Using an agent in Claude Code as a “thinking partner”: 00:23:59 Noah’s Thomas’ English Muffin theory of AI: 00:35:07 The white space still left to explore in AI: 00:44:04 How Noah is preparing his kids for AI: 00:50:41 How he brought his Claude Code setup to mobile: 01:01:54

Dan Shipper 📧

30,792 views • 9 months ago

I'm often asked for the best public example of AI evals done right for a real, production product. I finally have an answer. Teresa Torres shares how she shipped an AI interview coach, and used evals to rapidly squash bugs and improve the product. Teresa shows how she: 1. did error analysis FIRST to find real issues (instead of using generic metrics) 😍 2. used Jupyter notebooks to analyze errors 3. built custom annotation tools + custom widgets in notebooks 4. built an LLM judge and assertions to test for specific errors 5. iterated through this feedback loop until it worked 6. kept things simple the whole time. It's also probably the best commercial for Jupyter notebooks you can imagine. 🥰 Chapter summary below. Link to YT in next thread.
00:00:00 - Intro 00:01:45 - The Product: Building an AI Interview Coach 00:06:34 - The Problem: How Do I Know if My AI Coach is Any Good? 00:10:15 - Using Airtable for Traces and Annotation 00:12:15 - Discovering Jupyter Notebooks and Designing the First Evals 00:15:15 - Example Evals: LLM-as-Judge vs. Code-Based Assertions 00:21:00 - Learning Python with ChatGPT to Analyze Eval Results 00:31:00 - VS Code, Custom Tools, and an Eval Investigation Notebook 00:39:45 - Building a Custom Annotation Tool with Claude 00:41:00 - From Personal Project to Production App 00:46:02 - How Should PMs and Engineers Collaborate on AI Products? 00:55:45 - Q&A: Capturing Feedback and Annotations from End Users 00:58:11 - Q&A: Is a Technical Background Necessary to Build AI? 01:02:28 - Q&A: What's Next for Teresa? 01:03:13 - Q&A: Unpacking the Micro-Decisions of Building an AI App

Hamel Husain

51,376 views • 10 months ago
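
For readers who want a feel for the loop described in the Teresa Torres post (error analysis first, then assertions for specific errors), here is a minimal sketch in Python. It is not her code: the trace format, the failure labels, and the "at most one question per reply" rule are all hypothetical, chosen only to show the difference between tallying human annotations and running a code-based assertion.

```python
from collections import Counter

# Hypothetical annotated traces: each holds the coach's reply and a
# free-form failure label added by a human reviewer ("" means no issue).
traces = [
    {"id": 1, "reply": "Great start. What happened next?", "label": ""},
    {"id": 2, "reply": "Ask about pricing. Why? How? When?", "label": "too many questions"},
    {"id": 3, "reply": "Would you use this feature?", "label": "leading question"},
    {"id": 4, "reply": "Tell me more. What did you do? Who was there?", "label": "too many questions"},
]


def tally_failures(items):
    """Error analysis: count how often each human-assigned failure label occurs."""
    return Counter(t["label"] for t in items if t["label"])


def asks_at_most_one_question(reply):
    """Code-based assertion (assumed rule): pass when the reply has at most one '?'."""
    return reply.count("?") <= 1


failure_counts = tally_failures(traces)
failing_ids = [t["id"] for t in traces if not asks_at_most_one_question(t["reply"])]

print(failure_counts.most_common())
print(failing_ids)
```

The tally shows which failure is worth targeting first; the assertion then checks for that one failure cheaply on every new trace, leaving an LLM judge for errors that simple code cannot detect.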