GitHub has a front-row seat to how code is changing now that everyone (and their army of agents) can ship code. In March alone, agents created 17 million pull requests on the platform. That’s why I was thrilled Mike Taylor was on hand to interview GitHub COO Kyle Daigle at Microsoft... Build for a behind-the-scenes look at how the platform is helping developers manage the influx without dictating which pull requests they should trust or merge. This week on Every 📧’s AI & I, Mike gets into:

- The 14x commit explosion. GitHub hit 1 billion commits last year. Kyle says they’re on pace for 14 billion this year, and he doesn’t think that curve is plateauing.
- GitHub is committed to letting open-source maintainers set their own standards. Agent PRs are flooding communities, but GitHub’s philosophy is to leave code maintainers in control.
- The developer/non-developer distinction is collapsing. GitHub’s own legal and finance teams are using Copilot to build apps, one example of how AI has expanded the definition of who counts as a developer.
- Per-seat pricing doesn’t survive a world where agents run while you sleep. Kyle thinks automatic model routing (swapping in Haiku for simple tasks instead of always calling the expensive model) is the best way to make the economics make sense.
- Daigle runs a daily self-improvement loop with an AI he named Baxter. Every day, Baxter reads 7 days of his emails and Slack messages, flags his communication patterns, and checks whether Kyle followed last week’s advice.

This is a must-watch for anyone running agents in their dev workflow and curious how GitHub is handling the explosion of commits on its platform. Watch below!

Timestamps:
Introduction: 00:00:52
The agentic PR flood: 00:03:27
GitHub’s approach to helping open-source maintainers manage the surge: 00:04:33
What 14 billion commits means for code quality: 00:06:15
Moving from per-seat licensing to usage-based pricing: 00:08:03
Kyle's dual role as GitHub COO and Microsoft's chief marketing officer for developers: 00:09:45
Developer choice as competitive moat: 00:13:03
How to balance dogfooding your own tools with staying honest about the competition: 00:14:57
Hill climbing, frontier tuning, and solving the model-routing problem: 00:19:45
Kyle's agentic communication hack: 00:24:45

Dan Shipper 📧

117,462 subscribers

13,886 views • 8 days ago • via X (Twitter)

Similar Videos

In the future, you’ll be able to accomplish a goal by just giving Claude an outcome and a budget. That’s the direction Anthropic is building in with its new Managed Agents features, announced at this week’s Code with Claude developer event. The basic idea: Claude, wrapped in a computer in the cloud, that you can spin up, scale, and manage as needed. Anthropic is taking on the infrastructure that kills most agent products, and making sure that it scales to meet the needs of agents running 24/7. On this week’s AI & I from Every 📧, I talk with Angela Jiang, head of product for the Claude platform, and Katelyn Lesse, head of engineering for the Claude platform, about what Anthropic is building and what it takes to make agents reliable in production. We get into:

- Why the "build a generic harness, hot-swap any model behind it" playbook is already outdated. Angela points to eval data on Memory where the same task across different harnesses performed drastically differently.
- The infrastructure wall every team hits in production, and why Katelyn thinks “my sandbox died and took the agent with it” is the real reason internal agents don't ship.
- Why Anthropic is so bullish on using file systems and skills within Claude, including Angela's argument that those early design choices can compound for years.

This is a must-watch for anyone trying to take an agent past the demo and into production. Watch below!

Timestamps:
How the Claude platform evolved from API to agents: 00:01:48
The primitives that make up Claude Managed Agents: 00:04:09
Why the harness and the model are becoming a single unit: 00:10:37
The infrastructure wall that kills most agent projects in production: 00:18:49
Why team agents need a different shape than individual productivity tools: 00:24:49
How Anthropic's legal team uses an agent to review marketing copy: 00:26:36
Using multi-agent orchestration for advisor strategies, adversarial pairs, and swarms: 00:34:24
How to measure agent success with outcome and budget as the end state: 00:35:50
What the platform looks like a year from now, when Claude writes its own harness: 00:39:11

Dan Shipper 📧

66,339 views • 1 month ago

GitButler is redesigning the version control experience for you and your agents. We're very excited to lead their Series A. In this conversation, GitButler CEO Scott Chacon joins a16z GP Matt Bornstein to discuss:

- Why Scott came back to version control after GitHub
- Why Git's UI has barely changed since 2005
- How GitButler is building version control designed natively for agentic coding
- Why AI makes communication a more valuable skill for developers

00:00 Intro
01:11 Why Scott came back to version control after GitHub
06:18 How Git was actually built
11:32 Designing GitButler's CLI for agents
18:05 Parallel branches: How GitButler handles multi-agent workflows
23:33 What happens to GitHub in an agentic world
27:00 Code review needs a rethink: PRs, commit messages, and what comes next
32:19 Writing and communication as the new developer superpower

Scott Chacon, Matt Bornstein

a16z

29,008 views • 2 months ago

SaaS isn’t dead, it just needs to become agent-native. Linear is a great example of how: They pivoted the product to be used by both humans and agents, and that has made them one of the premier software tools in the agent-native era. I had Linear’s cofounder and CEO Karri Saarinen on Every 📧's AI & I to talk about how a product management tool for human software developers became an agent-native tool, and how Linear’s trajectory reveals a bright future for SaaS businesses:

- Speed means decisions matter more, not less. AI makes it easy to have an idea and build it without considering whether its existence is justified. When ChatGPT was released, SaaS companies were launching their own chatbots left, right, and center. Instead of jumping on the bandwagon, Linear stopped to consider whether the application was useful. (It wasn’t.)
- Just because the technology has changed doesn’t mean your mission should. Karri attributes Linear’s success to never losing sight of what matters: helping teams develop great software. Instead of chasing trends, Linear focused on understanding how AI was impacting its customers’ workflows and updating its product accordingly.
- Agents are now first-class users. Linear never tried to change what it was or did well; it just expanded the user base. Companies can now kick off agents inside Linear, manage them, and track what they're working on alongside the humans on the team, which explains why Codex, Coinbase, and Brex all run their agents on Linear.

This is a must-watch for anyone interested in how an agent-native SaaS company operates. Watch below!

Timestamps:
Introduction and how Every first discovered Linear: 00:00:39
Why Linear waited to ship AI features instead of rushing to chatbots: 00:02:00
Linear's agent platform and becoming the system that guides AI agents: 00:05:06
Why "SaaS is dead" is a simplistic narrative: 00:07:42
How Linear adopted AI coding tools internally: 00:12:18
AI's impact on product building workflows, speed versus thoughtfulness: 00:17:45
The value of conceptual work and thinking before shipping: 00:22:18
How AI is reshaping Linear's product strategy: 00:29:30
Demo: Linear's agent skills, shared context, and code review workflow: 00:37:18
The future of product development and the enduring role of human judgment: 00:47:48

Dan Shipper 📧

36,359 views • 2 months ago

Claude Code cracked something open for us at Every 📧. Now I ship to codebases I barely know, every feature we ship makes the next one easier, and non-technical members of the team use the terminal. I’m genuinely grateful. So I brought its creators, Cat Wu and Boris Cherny from Anthropic, on AI & I to say thank you and to talk about everything they’ve learned from building Claude Code. We get into:

• The workflows Anthropic’s smartest engineers use to push Claude Code to its limits. Why they pit subagents against each other to get cleaner results, how they turn past code into leverage, and the slash commands and MCPs they rely on most.
• The product lessons behind one of the most loved AI agents in the world. How the team balances simplicity and power (building a tool that anyone can use, but that experts can bend to their will) and their philosophy of “unshipping,” or cutting back whenever there’s a simpler, more intuitive path to user intent.
• A peek into the future of coding with AI. The new form factors they’re experimenting with to make Claude Code more autonomous, more reliable, and more accessible to non-technical users.

This is a must-watch for anyone, technical or non-technical, who wants to learn how to use Claude Code like the people who built it. Watch below!

Timestamps:
Introduction: 00:01:26
Claude Code’s origin story: 00:02:25
How Anthropic dogfoods Claude Code: 00:07:03
Boris and Cat’s favorite slash commands: 00:14:06
How Boris uses Claude Code to plan feature development: 00:15:49
Everything Anthropic has learned about using sub-agents well: 00:21:53
Use Claude Code to turn past code into leverage: 00:26:16
The product decisions for building an agent that’s simple and powerful: 00:33:14
Making Claude Code accessible to the non-technical user: 00:36:38
The next form factor for coding with AI: 00:45:12

Dan Shipper 📧

57,540 views • 7 months ago

Full interview with the founder and CEO of Propr. Lou sits down with me to discuss Hyperliquid and the future of prop trading.

Timestamps:
00:00:00 Intro
00:00:16 Meet Louis: from TradFi to crypto
00:01:52 How the crypto market has changed
00:03:49 Why build on Hyperliquid
00:06:52 How propr got started
00:08:40 Scaling a lean team (and the grind)
00:12:24 What a prop firm actually is
00:15:33 Agents, bots, and the future of trading
00:18:32 What Louis is most proud of
00:21:02 Thinking about the token and TGE
00:24:14 Where propr goes in 5 years
00:25:57 Best and worst trades
00:27:50 Life outside crypto + shoutouts

ryandcrypto

41,883 views • 24 days ago

Agents who can buy, sell, and trade on our behalf are becoming a major part of the economy. But what exactly are they doing? Stripe sees 2% of global GDP, so they’re the company with the best view of what’s going on in the earliest innings of the agent economy. That’s why I had Emily Glassberg Sands, who leads data and AI at Stripe, on Every 📧’s AI & I. We covered:

- Most of us still don’t trust AI with larger online purchases. People are hesitant to let AI make expensive purchases like a vacation or a couch, just like the early days of online shopping. But a superhero outfit for a kid who needs one stat? Sure, let the agent handle it.
- Fraud is moving up the stack. It used to mean stolen credit cards. Now attackers are stealing free-trial tokens and compute credits. Free-trial abuse has 4x-ed in the last six months.
- AI is on both sides of fraud. Fraudsters are using it to scale attacks, while Stripe is using it to detect them. They’re blocking 250,000 fraudulent free trials a week for one large customer.
- AI companies are growing faster than any cohort Stripe has ever tracked. Top companies hit $30M ARR in 18 months, 3x faster than the 2018 SaaS class. So far, it’s net new spend instead of cannibalized software budgets.

If you want to understand how AI is reshaping online commerce, this one deserves your time.

Timestamps:
Introduction: 00:00:45
New rules for an agent-driven economy: 00:01:27
Compute theft is the new payment fraud: 00:03:57
How Stripe expanded fraud detection from checkout to the full customer lifecycle: 00:10:00
Why AI companies are scaling way faster than top SaaS companies: 00:19:48
Outcome-based billing is replacing seat-based pricing: 00:23:27
Where AI spending is coming from: 00:29:57
How the developer experience changes when agents are the builders: 00:36:45
The agentic commerce spectrum, from assisted buying to autonomous purchasing: 00:41:00
Meet Link, a consumer wallet for delegated agent purchases: 00:51:06

Dan Shipper 📧

19,362 views • 1 month ago

We built an AI app that had 1,000 DAU and $2k MRR before it launched. It’s called Monologue and it’s a smart dictation app built by a single developer: Naveen Naidu. We just launched Monologue yesterday, and it’s one of the fastest-growing and stickiest AI apps that Every 📧 has ever built. Naveen and Monologue are compelling because he’s competing against companies that have raised $50m or more. Because of AI he was able to build an extremely polished, delightful app by himself in just a few months. I brought Naveen on to AI & I along with Every 📧 COO Brandon Gell to talk about his journey with Monologue. We get into:

- Why shipping fast is the only thing that matters in AI: Monologue might look like an overnight success, but it wasn’t Naveen’s first, second, or even third app. Over time, he built a muscle to get quality apps out the door, iterate on them, and learn from what he was seeing.
- How he got to PMF inside of Every: The mistake Naveen regrets most in his entrepreneurial journey is building in the dark. Inside of Every 📧 he has an environment where feedback is plentiful, and it let him iterate extremely quickly.
- His stack for building production-grade AI apps: Naveen breaks down how he used tools like OpenAI’s Codex to do the work of a whole engineering team, including solving hard technical problems like Mac hotkey handling.

This is a must-watch for anyone who wants to see how far a single developer and some AI tools can really go. Watch below!

Timestamps:
Introduction: 00:01:27
A live demo of Monologue: 00:03:51
Hard lessons from Naveen’s years in the wilderness: 00:06:27
Building a muscle to ship fast: 00:12:29
The spark that became Monologue: 00:21:11
Dogfooding your way to a killer feature: 00:26:09
Why the harshest product feedback is the most valuable: 00:29:45
Every’s strategy for launching an app in a crowded space: 00:31:47
Giving Monologue the Every “smell”: 00:40:08
Naveen’s one-person AI stack to build beautiful apps: 00:45:09

Dan Shipper 📧

23,644 views • 9 months ago

OpenAI’s hottest app isn’t ChatGPT; it’s Codex. In the last few weeks alone, the Codex team shipped a desktop app, GPT-5.3 Codex (a new flagship model), and Spark, the fastest coding model I’ve ever used. Usage has grown fivefold since January and over a million people now use Codex weekly. Codex was also the app that OpenAI chose to run an ad for in the Super Bowl. I talked to Thibault (Tibo), head of Codex, and Andrew (Andrew Ambrosino), a member of technical staff who built the Codex app, for Every 📧’s AI & I about what OpenAI is building and how they’re using it internally. We get into:

- Why they built a GUI instead of a terminal. Terminals work for quick tasks, they say, but feel limiting when you’re running multiple agents in parallel. The IDE, meanwhile, overwhelms users, and the Codex team wants the AI to dynamically decide which tools to show you for a given task.
- How they’re teaching the model to read between the lines. Codex is great at following instructions, but optimize too hard in that direction, and it starts taking you literally, like copying a typo directly into the code. The team obsesses over this tradeoff, and is also introducing “personalities,” modes users can toggle between that control how blunt or supportive the model feels.
- How OpenAI uses its own coding agent. Codex lets you schedule prompts to run on a recurring basis, and the team has dozens of automations running at all times. For example, one scans for merge conflicts every couple of hours so code is always ready to ship, and another picks a random file from the codebase multiple times a day and hunts for bugs no one would've gone looking for.
- Why speed is a dimension of intelligence. OpenAI’s newest model (Spark) is so fast that they actually slow it down so you can read the output. They see the speed enabling three things: staying super in the flow, replacing brittle developer tools with intelligent ones that can adapt on the fly, and redirecting the model mid-task (especially with voice), so coding starts to feel more and more like a conversation.
- Code review is the next bottleneck. Models can generate code faster than ever, but someone still has to verify that it works. The team is exploring a future where the model proves its own fix works: retracing the click path a user would take, screenshotting the results, and attaching the evidence to a pull request.

This is a must-watch for anyone who uses AI coding agents and is curious about the future of programming. Watch below!

Timestamps:
Introduction: 00:01:27
OpenAI’s evolving bet on its coding agent: 00:05:27
The choice to invest in a GUI (over a terminal): 00:09:42
The AI workflows that the Codex team relies on to ship: 00:20:38
Teaching Codex how to read between the lines: 00:26:45
Building affordances for a lightning-fast model: 00:28:45
Why speed is a dimension of intelligence: 00:33:15
Code review is the next bottleneck for coding agents: 00:36:30
How the Codex team positions against the competition: 00:41:24

Dan Shipper 📧

15,588 views • 4 months ago

Guillermo Rauch is one of the most prolific coders of this generation. But he doesn’t think of himself as a coder anymore. Coding, he says, is a specific skill that AI is becoming great at. Instead, he thinks the future belongs to more holistic, full-stack engineers who can ideate, design, and execute all together. Guillermo is the founder and CEO of Vercel and the creator of Next.js and Socket.IO. We spent an hour talking about the future of software development in an AI world, and the meta-skills that today’s coders need to master in order to use tomorrow’s tools to their fullest extent. Here are a few takeaways: - One of the most important keys to his success is taste, and developing taste is all about paying better attention to everything you experience day to day. - He’s great at recognizing bleeding-edge technologies that have extremely practical applications but bad user experiences. If you can learn to recognize those and build with them, you might build the next Next.js or Socket.IO. - Why prototype cultures are becoming common in AI, and the benefits of written cultures like Amazon vs. prototype cultures like Apple for different kinds of companies. - For developers building frameworks, always put the product first; a framework in isolation without a “customer zero” is never going to be a good tool. - The theory of “recursive founder mode”: if you want to build a scalable business, you have to scale yourself by creating an atmosphere that nurtures talent and ambition. - AI tools are shifting software toward consumption-based billing models, making us capital allocators who decide how much compute the AI consumes. - The future of AI is agents with the taste, knowledge, and tools to perform specialized tasks. Watch below!
Timestamps: Introduction: 00:01:33 How to spot trends early: 00:03:18 Why you should be your own customer: 00:07:34 How to create an ecosystem of talent and ambition: 00:14:55 Why Guillermo doesn't identify as a coder: 00:17:29 AI is gearing us toward an allocation economy: 00:20:50 How Vercel’s copilot compares with other coding agents: 00:28:34 Guillermo’s advice on having better taste: 00:40:35 The future of AI agents is specialized: 00:42:46 How AI startups can compete with big tech: 00:47:50


Dan Shipper 📧

186,927 views • 1 year ago

Nat Eliason’s career arc is borderline absurd, but it works. He’ll spot a new tool or trend, master it, build a business around it, and move on. Nat’s pulled it off with the note-taking wave ($600k in sales from a Roam Research course), real estate (6x return flipping property in Austin), and crypto (published his insider story with Random House). Now it’s AI: he’s running a viral course on building apps with AI, with $200k in pre-sales in just a week and 800 students and counting. I’ve known Nat for a long time and I think he has a great sense for where the puck is headed. He was one of the first guests I had on the podcast and I was delighted to have him on again. Here are a few takeaways from our conversation: - Coding with AI has become orders of magnitude easier for non-technical people over the last 2 years. Nat rarely has to help students fix bugs; they troubleshoot in Cursor on their own. - AI coding assistants are creating new behaviours in programming, like using a speech-to-text model to talk to an agent and having it write code for you. - The traditional learning curve of coding is flattening because AI tools let beginners build and iterate in faster feedback loops. - AI has given Nat leverage in spades: it increases his ability to be a creator while also building a robust business with as few people to manage as possible. He demos an AI book editor he coded for his sci-fi novel. - In the age of AI, software is becoming content and the barriers to creation are lower than ever, but custom software for everything isn’t the answer. Nat’s model is that personalized tools make sense for the one thing you care the most about. - Nat believes that the future of writing with AI is a Cursor-style interface with a model that’s trained on your style and voice. This episode is a must-watch for writers, creators, and anyone interested in the future of product building. Watch below!
Timestamps: Introduction: 00:01:45 The origins of Nat’s viral course on building apps with AI: 00:11:45 How coding with AI has evolved over the last two years: 00:18:46 Nat creates an app using Composer, Cursor’s AI assistant: 00:22:22 Tactical tips for coding with Cursor: 00:26:06 How coding with AI is creating new behaviours in programming: 00:29:06 What excites Nat the most about the future of AI: 00:32:41 A demo of Hubbard, the AI editor Nat built for his science fiction writing: 00:38:58 When it makes sense to build custom software: 00:44:52 Nat’s take on the future of writing with AI: 00:49:18


Dan Shipper 📧

27,207 views • 1 year ago

It started as a small idea to connect AI models to developer workflows. It turned into one of the fastest-growing open standards in the industry. 🚀 Now, the Model Context Protocol is officially joining the Linux Foundation. Hear from the engineers and maintainers of GitHub, Microsoft, Anthropic, and OpenAI on the journey from day zero to now. 👇


GitHub

44,927 views • 6 months ago

With Proof of Human, instead of trying to block bots, software becomes human-only by default. Then you decide what access to give agents. Tom Occhino (CPO, Vercel) joins Stateful, hosted by Mason Nystrom. In this episode, they discuss how Vercel is building the infrastructure layer for human-verified agents with World. - Vercel is now agentic infrastructure: deploy with agents, build agents, and a self-improving platform that fixes its own errors - Workflows does for backends what React did for front ends, composable and reusable in one line of code - CAPTCHAs are archaic. Human-only software by default is the new model. - World ID integration via Workflow SDK: one NPM package, one line of code - Humans as first-class citizens of the internet again 00:00 Vercel's Agentic Infrastructure 01:45 From Front-End Cloud to v0 03:15 Workflow SDK with React 05:30 World ID in One Line of Code 07:00 Beyond CAPTCHAs: Humans First


Pantera Capital

19,144 views • 1 month ago

Developer talent is global. Access to AI tools should be, too. 🌍 💻 Stephen, a React developer working in Rwanda, shares how he uses GitHub Copilot to navigate complex code, and why AI is a tool for empowerment (and not replacement). Learn more about the partnership between GitHub and Andela to bring structured AI training to more than 3,000 developers across the globe. 💡


GitHub

25,493 views • 3 months ago

I'm excited to introduce my AI Machine Learning Agent that built 32 ML models in 30 seconds. Today, I'll share with you how to automate building 100s of ML models with the AI ML Agent, which is available on GitHub. We'll create an ML Agent focusing on a Customer Churn Problem. I'll guide you through setting up the ML Agent, creating dozens of ML models, and loading the best model for production. This AI is a huge time-saver! Table of Contents: 00:00 Introduction to my AI Data Science Team 02:56 Setting Up AI Data Science Team 04:48 Running the ML Agent Code 07:12 Create (and Run) the AI Machine Learning Agent 09:53 Reviewing ML Model Summary 12:00 Saving and Loading Models 13:00 Next Steps + Project Roadmap + AI Bootcamp Github to AI Data Science Team (Data Science Agents): Get the Code and Future Updates by Joining my Python AI/ML Tips Newsletter: P.S. - Want to learn how to build AI projects companies actually want? (live Python Code) On Wednesday, May 21st, I'm sharing one of my best AI Projects: AI Customer Segmentation Agent with Python Register here (570+ registered):


Matt Dancho (Business Science)

35,452 views • 1 year ago

A lot of people are calling Hermes Agent the end of OpenClaw. BRUH! It's not... Nous Research trains actual models and they built an agent around that expertise. The local model routing is solid, but the part that matters for your business is that your conversations become fine-tuning data. You can train a model on how you actually work. 00:00 The Problem with Local AI Models 00:25 Introduction to Nous Research 01:04 Cross-Platform Agent Capabilities 01:44 Deep Local Model Integration 02:30 Routing Tasks to Different Models 03:06 Conversations as Training Data 03:50 Hermes Agent vs. OpenClaw 04:15 Future Plans and Series Overview


Ray Fernando

42,398 views • 2 months ago

Three months ago, Codex was trash for knowledge work. Now it's my daily driver. I use it for writing, recruiting, deep engineering work, and everything in between. It even keeps me at inbox 0. I chatted with Every 📧's head of growth Austin Tedesco on Every 📧's AI & I about what changed, and why he now spends 80% of his working time in the Codex desktop app too. We get into: - How Codex went from making Austin feel like an idiot to being the place he goes to get stuff done, including complex tasks like writing go-to-market plans using existing material from Slack, Notion, and meeting transcripts. - Why Codex’s desktop app, which is faster and more reliable than Claude Desktop/Cowork, is the real differentiator. - How I source candidates with Codex by having it identify career arcs, not keywords: my go-to move is identifying organizations likely to teach the skills Every needs for a role, and then finding candidates from that pool who have since gone on to work in AI. This is a must-watch for anyone who's wondering whether it’s finally time to give Codex a try. Watch below! Timestamps: How Codex went from a tool for senior engineers to a daily driver for knowledge work: 00:00:57 How Claude Code proved that a great coding agent works for any knowledge work: 00:02:42 Austin's switch to Codex: 00:07:24 How Austin set up Codex with folders, keys, and reviewer agents: 00:13:48 Using Codex to brainstorm automations across Gmail, Slack, and Notion: 00:18:24 How Austin manages the human review step when Codex is drafting communications: 00:22:42 Using Codex to build specialized agents inspired by product executive Claire Vo: 00:28:54 Synthesizing meeting transcripts and Slack threads into a go-to-market plan: 00:31:09 Building a live KPI tracker in Notion that agents can read: 00:40:15 Using Codex for recruiting: 00:44:54


Dan Shipper 📧

55,221 views • 1 month ago

How to use GitHub Actions to build a pipeline that tests your code, trains a model, and publishes a new version of it. Bonus: I let GitHub Copilot Workspace do a lot of the work. (This is the best "AI developer" I've tested so far by a mile.) Here is the full video:


Santiago

147,981 views • 1 year ago

Noah Brier uses Claude Code as his second brain. It’s the coolest note-taking setup I’ve ever seen. He has Claude running on a server in his basement hooked up to a VPN. It stores, reads, and writes to thousands of notes in his Obsidian vault. He does it all from his phone. I had him on the show to tell us exactly how he’s pulling this off. We get into: - The nuts and bolts of the Claude Code-Obsidian setup: Noah set up Claude Code on top of his Obsidian root directory, and he walked me through how he uses it to prep for an upcoming speech: creating a project folder, pulling in relevant research from his notes, saving transcripts from chats with other LLMs, and generating daily progress updates. - The “thinking partner” that lives inside Noah’s second brain: Noah points out that in the hype around AI’s ability to write, the fact that it can read is overlooked. That’s why he has an agent inside Claude Code with strict guardrails to stay in “thinking mode.” It logs his questions, tracks insights, and catches him up on research if he returns to a project after a few days away. - How Noah does deep work on his phone: Noah rigged a home server in his basement, put his Obsidian vault on it, and runs Claude Code on top. Noah says that being able to think, write, research, and ship code from his phone has fundamentally changed the way he works. This episode of Every 📧’s AI & I is a must-watch for anyone who wants to learn how to use Claude Code to build a true second brain. Watch below!
Timestamps: Introduction: 00:01:19 How you can do deep work on your phone: 00:04:28 Why Noah thinks Grok has the best voice AI: 00:06:14 The nuts and bolts of Noah’s Claude Code-Obsidian setup: 00:11:39 Using an agent in Claude Code as a “thinking partner”: 00:23:59 Noah’s Thomas’ English Muffin theory of AI: 00:35:07 The white space still left to explore in AI: 00:44:04 How Noah is preparing his kids for AI: 00:50:41 How he brought his Claude Code setup to mobile: 01:01:54


Dan Shipper 📧

30,792 views • 9 months ago

"We are clearly entering a world where a product-minded engineer is now empowered to produce software without writing a line of code for it." In this conversation, Temporal CEO Samar Abbas joins a16z GPs Sarah Wang and Raghu Raghuram to cover: - Why agents are going from short-lived and interactive to long-running and async - Why the engineer of the future manages 15 parallel AI tasks at once - How OpenAI Codex runs millions of concurrent agent executions on Temporal - How real-time context engineering is exploding on Temporal - Why SaaS isn't dead, and value is migrating to APIs 00:00 Introduction 04:03 Temporal's origin story 11:14 Why agents raise the stakes 16:00 Specialized agents need durable RPC 25:20 Deep research agents 30:58 Execution histories as a superpower 39:04 Minimal viable long-running agent architecture 45:07 Context engineering at scale 52:40 Where value accrues: The "five-layer cake" and breakout AI applications


a16z

45,496 views • 4 months ago

I'm often asked for the best public example of AI evals done right for a real, production product. I finally have an answer. Teresa Torres shares how she shipped an AI interview coach, and used evals to rapidly squash bugs and improve the product. Teresa shows how she: 1. did error analysis FIRST to find real issues (instead of using generic metrics) 😍 2. used Jupyter notebooks to analyze errors 3. built custom annotation tools + custom widgets in notebooks 4. built an LLM judge and assertions to test for specific errors 5. iterated through this feedback loop until it worked 6. kept things simple the whole time It's also probably the best commercial for Jupyter notebooks you can imagine. 🥰 Chapter summary below. 00:00:00 - Intro 00:01:45 - The Product: Building an AI Interview Coach 00:06:34 - The Problem: How Do I Know if My AI Coach is Any Good? 00:10:15 - Using Airtable for Traces and Annotation 00:12:15 - Discovering Jupyter Notebooks and Designing the First Evals 00:15:15 - Example Evals: LLM-as-Judge vs. Code-Based Assertions 00:21:00 - Learning Python with ChatGPT to Analyze Eval Results 00:31:00 - VS Code, Custom Tools, and an Eval Investigation Notebook 00:39:45 - Building a Custom Annotation Tool with Claude 00:41:00 - From Personal Project to Production App 00:46:02 - How Should PMs and Engineers Collaborate on AI Products? 00:55:45 - Q&A: Capturing Feedback and Annotations from End Users 00:58:11 - Q&A: Is a Technical Background Necessary to Build AI? 01:02:28 - Q&A: What's Next for Teresa? 01:03:13 - Q&A: Unpacking the Micro-Decisions of Building an AI App


Hamel Husain

51,376 views • 10 months ago