How OpenAI Builds for 800 Million Weekly Active Users: Model Specialization and Fine-Tuning We sat down with Sherwin Wu, Head of Engineering at OpenAI Platform, to discuss OpenAI’s developer strategy, how to manage top ML teams, why they decided to start releasing open-weight models again, how prompt engineering has evolved over time, and how developer tools will shift to accommodate better agents moving forward. 00:00 Introduction 8:36 Horizontal vs vertical OpenAI 12:18 Why you can’t “disintermediate” the model 15:11 People build relationships with models 17:30 Not one AGI model but many 20:10 Fine-tuning, RFT, and customer data choices 24:44 Prompt engineering isn’t the point anymore 28:06 What an “agent” really is 31:55 How OpenAI thinks about pricing 36:46 Why open-weights don’t kill the API 42:57 Different stacks for text, images, video 45:47 How the agent builder actually works Sherwin Wu martin_casado

a16z

1,006,539 subscribers

113,950 views • 6 months ago • via X (Twitter)

Education • News & Politics • Science & Technology

Related videos

A lot of people are calling Hermes Agent the end of OpenClaw. BRUH! It's not... Nous Research trains actual models and they built an agent around that expertise. The local model routing is solid, but the part that matters for your business is that your conversations become fine-tuning data. You can train a model on how you actually work. 00:00 The Problem with Local AI Models 00:25 Introduction to Nous Research 01:04 Cross-Platform Agent Capabilities 01:44 Deep Local Model Integration 02:30 Routing Tasks to Different Models 03:06 Conversations as Training Data 03:50 Hermes Agent vs. OpenClaw 04:15 Future Plans and Series Overview

Ray Fernando

42,398 views • 2 months ago

My conversation with OpenAI co-founder Greg Brockman This is the most detailed first-person account of the 72 hours after Sam Altman was fired. We also go deep on what comes next: the global race to AGI, why ChatGPT stopped showing reasoning, how much of OpenAI's own code is now written by AI ("it's hard to know what percent is not"), and the untold story of how OpenAI actually started in 2015. 00:00:00 Introduction 00:00:49 Meeting Sam Altman and Starting OpenAI 00:02:40 Building the Founding Team 00:04:25 DeepMind's Lead Over OpenAI 00:04:54 Changing OpenAI to a For-Profit Model 00:06:05 Breakthrough Moments at OpenAI 00:08:22 What Dota 2 Meant for OpenAI 00:10:04 Reasoning Versus Prediction 00:11:59 Tensions Grow at OpenAI 00:15:44 Sam Altman's Firing 00:17:49 Greg Quits OpenAI 00:19:56 Sam Explores Deal with Microsoft's Satya 00:20:28 Petition for Altman's Return 00:23:43 Ilya Sutskever Leaves OpenAI 00:24:59 Lessons Learned after Sam Ousting 00:28:22 The Thing Ilya Said that Greg Can't Forget 00:32:22 Is AI Going Parabolic? 00:33:24 How Much of OpenAI's Code is Written by AI? 00:36:21 Do AI Chatbots Tell Us What We Want to Hear? 00:38:06 The Global AI Race to Reach AGI 00:38:40 What Happens if US Doesn't Reach AGI First? 00:39:49 Are Countries Stealing AI Advancements? 00:40:38 Why ChatGPT No Longer Shows Reasoning 00:41:47 The Finite Constraints of Compute 00:43:38 On Investing Early in Data Centers 00:46:31 The Future of Data Center Specialization 00:47:52 How to Decide Whose Queries to Serve 00:49:08 OpenAI on Consumer vs Enterprise Models 00:53:05 Data Centers in Space? 01:00:56 What Should AI Regulation Look Like? 01:04:33 The Future of AI-Powered Entrepreneurship 01:04:44 AI and Job Loss 01:07:15 The Skills Young People Should Invest In 01:11:30 What Does Success Look Like For You? Full episode on X below. Also find it on: • YouTube: • Spotify: • Apple:

Shane Parrish

450,631 views • 2 months ago

I spoke at length with OpenAI President Greg Brockman about the company's double-down bet on text models, AI takeoff, Codex, infrastructure scaling, and plenty more. Full episode below: 0:00 Introduction 4:06 Why OpenAI Pulled Back From Sora 11:24 The OpenAI Super App Plan 22:59 OpenAI's Forthcoming “Spud” Model 28:13 OpenAI’s Automated AI Researcher Plan 31:12 AI Risk, Safety, and Takeoff 55:15 The Logic Behind OpenAI’s Compute Spending 1:03:24 Why So Many People Still Distrust AI

Alex Kantrowitz

45,546 views • 2 months ago

How OpenAI made Codex The full episode with Alexander Embiricos is now live: Timestamps 0:00:00 Intro: Meet the Codex Team 0:00:21 Origin Story of Codex 0:15:03 How Users Are Adopting Codex: Surprising Patterns 0:36:38 Is AI Eating Software Engineering? 1:00:05 Building a Startup in the Age of AI Agents 1:07:58 Why Some Students Excel with AI Tools 1:10:56 Hiring at OpenAI: What Stands Out

Anjney Midha

31,764 views • 9 months ago

Interview with Jay Parikh, EVP of CoreAI Microsoft 00:00 - Intro 00:25 - Microsoft's Core AI Team 05:08 - In-Person Work Culture 10:15 - How AI Changes Engineering Roles 15:24 - Data Centers & Power Constraints 19:00 - AI Bubble & Dark GPUs 22:34 - Model efficiency 25:17 - Enterprise model rollout 28:17 - Open v Closed source 30:43 - Omni-model vs. multi-model 32:53 - Microsoft's relationship with OpenAI 34:31 - MAI and MSFT's frontier models 35:51 - AI safety 38:45 - Contrarian Take

Matthew Berman

40,617 views • 7 months ago

BG2 Guest Interview. ChatGPT – The Super Assistant Era 📷 How ChatGPT Gets to the Next Billion Users Bg2 Pod Apoorv Agrawal -- (00:00) Intro (01:00) Nick Turley’s Journey to OpenAI (02:15) ChatGPT’s North Star: Long-Term Retention (04:15) Why ChatGPT’s Retention Curve “Smiles” (06:45) What Drove ChatGPT’s Consumer Breakout (10:15) How OpenAI Gets the Next Billion Users (14:15) When ChatGPT Starts Taking Actions (18:15) Why Coding Agents Came First (21:00) Beyond Chatbots: The Super Assistant Vision (24:00) Power Users vs. Casual Users (28:00) Why ChatGPT Pricing Has to Change (33:45) Partnerships, Distribution, and Product Tradeoffs (37:15) GPUs, Scarcity, and the Cost of Scaling AI (41:30) Shopping, ChatGPT as a Thought Partner, and Code Red (51:45) OpenAI’s Future Interface, Rapid Fire, AI Jobs, and Nick’s AGI Moments

Bg2 Pod

25,525 views • 3 months ago

Poetiq is a new startup that recently achieved a major jump on the ARC-AGI benchmark by layering a recursive self-improvement system on top of existing models. In this episode of the Lightcone Podcast, Poetiq's Founder & CEO Ian Fischer joined us to discuss how small teams can build “reasoning harnesses” that outperform base models, what that means for startups and why automating prompt engineering may be one of the most powerful levers in AI today. 00:00 – Intro 00:40 – What Is Poetiq? 01:07 – Recursive Self-Improvement Explained 02:07 – The Fine-Tuning Trap 02:59 – “Stilts” for LLMs 03:14 – Recursive Self-Improvement vs. Fine-Tuning 05:05 – Taking the Top Spot on ARC-AGI 06:37 – Beating Claude on Humanity’s Last Exam 08:40 – How the Meta-System Works 10:26 – Beyond RL: A New S-Curve 11:32 – Automating Prompt Engineering 13:37 – From 5% to 95% Performance 14:50 – Early Access & Putting Your Agent on Stilts 16:17 – From YC Founder to DeepMind Researcher 18:29 – Advice for Engineers in the AI Era

Y Combinator

155,599 views • 3 months ago

8 months ago we started explaining enterprise AI on YouTube. 2M views later, here are the 5 shifts that actually defined 2025. Timestamps 00:36 Prompt Engineering → Agent Engineering 00:51 Prompting best practices 01:19 How context engineering focuses on curating information 01:51 Evolution of assistants to agents 02:12 Agents handling complex tasks 02:58 How security became foundational 03:36 The correct security architecture for organizations 04:23 The emergence of model flexibility 05:26 The importance of governance and trust

Box

460,406 views • 6 months ago

"We are clearly entering a world where a product-minded engineer is now empowered to produce software without writing a line of code for it." In this conversation, Temporal CEO Samar Abbas joins a16z GPs Sarah Wang and Raghu Raghuram to cover: - Why agents are going from short-lived and interactive to long-running and async - Why the engineer of the future manages 15 parallel AI tasks at once - How OpenAI Codex runs millions of concurrent agent executions on Temporal - How real-time context engineering is exploding on Temporal - Why SaaS isn't dead, and value is migrating to APIs 00:00 Introduction 04:03 Temporal's origin story 11:14 Why agents raise the stakes 16:00 Specialized agents need durable RPC 25:20 Deep research agents 30:58 Execution histories as a superpower 39:04 Minimal viable long-running agent architecture 45:07 Context engineering at scale 52:40 Where value accrues: The "five-layer cake" and breakout AI applications Samar Abbas Sarah Wang Raghu Raghuram

a16z

45,496 views • 4 months ago

Revolutionizing Move Programming with OpenLedger In this demo, we showcase how Move datasets contributed by data providers to OpenLedger’s datanets are used to fine-tune specialized models with LoRA fine-tuning. As seen in the video, we showcase an example on how builders can deploy a Move-specialized model that powers Co-pilot agents using our no-code model fine-tuning platform. This is the future of AI and Web3 innovation. Watch this space to see more specialised models and data feeds being built for next generation agents on top of OpenLedger #Move

OpenLedger

61,662 views • 1 year ago

The Sam Altman Interview You know him as the CEO of OpenAI — but he's also an avid writer. We spoke not once but twice about how Sam captures ideas, clarifies his thinking, edits his writing, decides what to work on, and uses ChatGPT. Timestamps: 1:47 Will LLMs change how we write? 8:39 How does Sam use ChatGPT? 11:26 How Sam became less anxious 17:24 Sam once dreamed of being a novelist 18:37 Lessons from Peter Thiel 21:35 Lessons from Paul Graham 26:02 The book Sam Altman wants to write 28:37 Advice for startup founders 30:20 How Y Combinator shapes OpenAI 35:55 How Sam chose to work on AGI 37:35 Writing strategy memos at OpenAI 41:34 Why isn’t ChatGPT a better storyteller? 44:20 Sam's obsessive note-taking method 47:12 Will AI put writers out of work?

David Perell

239,633 views • 1 year ago

Benedict Evans on Why AI Feels Like the Internet in 1997 Benedict Evans joins Erik Torenberg for a conversation on the state of AI, including how coding agents hit product-market fit, why foundation models should be thought of as infrastructure, the value of vertical products, and more. 00:00 Intro 00:44 What's changed since last year 05:53 OpenAI vs Anthropic strategy 10:31 The pricing crunch & platform history 22:48 What comes after coding 38:18 AI & the future of enterprise software 48:43 The CapEx problem 55:07 Will models become commodities? Benedict Evans Erik Torenberg

a16z

102,407 views • 16 days ago

BG2 Guest Interview (new experiment) - Deep Dive OpenAI Enterprise. Forward Deployed Engineering, GPT-5, and More 🚀🤖 Apoorv Agrawal Sherwin Wu Olivier Godement @openai

Bg2 Pod

42,384 views • 9 months ago

What is AI inference engineering, why is it such an in-demand skill, and how do you break into the field? With author of Inference Engineering Philip Kiely and head of training at Baseten Charlie O'Neill 0:00: What is inference? 2:47: History of inference 4:59: Downstream effects of AI research on inference 13:54: What you'll learn from Inference Engineering 16:14: Advice for engineers transitioning into AI 19:00: Open source models driving inference growth 20:55: Specialization vs. frontier closed models 23:51: "Big Token" and the importance of open source AI 27:18: Where to get Inference Engineering

Madison Kanna

119,268 views • 3 months ago

Google DeepMind Developers: How Nano Banana Was Made In this episode, we sat down with Google DeepMind’s Oliver Wang and Nicole Brichtova to discuss how Nano Banana was created, why it became so viral, and what’s in store for the future of image and video editing. 00:00 Intro 02:00 The origin of Nano Banana 06:20 Seeing yourself in AI 11:00 Control & character consistency 17:10 AI in education and visual learning 24:10 2D vs 3D: Debate over world models 31:10 The Japan phenomenon 35:00 From images to video 41:00 Working with artists 47:30 The next era of image models 53:50 Closing thoughts Oliver Wang Nicole Brichtova Guido Appenzeller Yoko

a16z

97,486 views • 7 months ago

Michael Bolin is the tech lead of the Codex open source repository at OpenAI and formerly a distinguished eng (E9) at Meta. I asked him for all the details on his career story and how he uses Codex for max benefit. Timestamps: 00:00:00 - Intro 00:00:56 - Chickenfoot 00:02:45 - Working at Google 00:06:34 - Overhauling Facebook's build system 00:16:36 - Rewriting Facebook's IDE 00:26:01 - Struggles after Principal Eng (E8) promo 00:28:39 - Building a virtual filesystem for Facebook 00:35:47 - Delayed Distinguished promo (E9) and learnings 00:39:56 - Joining OpenAI 00:43:05 - Research-led vs engineering-led cultures 00:44:53 - The story behind Codex 00:51:00 - How he uses Codex 00:57:00 - Why Codex's harness is open source 00:59:50 - Top technical book recommendations 01:05:02 - Why deep technical skills are still valuable (for now) 01:11:07 - How to start projects well 01:14:27 - Advice on writing better and career planning 01:17:06 - Advice for his younger self 01:19:10 - Outro He was a dream guest of mine and I'm excited to share his story with you all! Other places to watch: • YouTube: • Spotify: • Apple Podcasts: • Transcript:

Ryan Peterman

119,407 views • 3 months ago

I'm excited to introduce my AI Machine Learning Agent that built 32 ML models in 30 seconds. Today, I'll share with you how to automate building 100s of ML models with the AI ML Agent, which is available on GitHub. We'll create an ML Agent focusing on a Customer Churn Problem. I'll guide you through setting up the ML Agent, creating dozens of ML models, and loading the best model for production. This AI is a huge time-saver! Table of Contents: 00:00 Introduction to my AI Data Science Team 02:56 Setting Up AI Data Science Team 04:48 Running the ML Agent Code 07:12 Create (and Run) the AI Machine Learning Agent 09:53 Reviewing ML Model Summary 12:00 Saving and Loading Models 13:00 Next Steps + Project Roadmap + AI Bootcamp Github to AI Data Science Team (Data Science Agents): Get the Code and Future Updates by Joining my Python AI/ML Tips Newsletter: P.S. - Want to learn how to build AI projects companies actually want? (live Python Code) On Wednesday, May 21st, I'm sharing one of my best AI Projects: AI Customer Segmentation Agent with Python Register here (570+ registered):

Matt Dancho (Business Science)

35,452 views • 1 year ago

In the future, you’ll be able to accomplish a goal by just giving Claude an outcome and a budget. That’s the direction Anthropic is building in with its new Managed Agents features, announced at this week’s Code with Claude developer event. The basic idea: Claude, wrapped in a computer in the cloud, that you can spin up, scale, and manage as needed. Anthropic is taking on the infrastructure that kills most agent products, and making sure that it scales to meet the needs of agents running 24/7. On this week’s AI & I from Every 📧, I talk with Angela Jiang, head of product for the Claude platform, and Katelyn Lesse, head of engineering for the Claude platform, about what Anthropic is building and what it takes to make agents reliable in production. We get into: - Why the "build a generic harness, hot-swap any model behind it" playbook is already outdated. Angela points to eval data on Memory where the same task across different harnesses performed drastically differently. - The infrastructure wall every team hits in production, and why Katelyn thinks “my sandbox died and took the agent with it” is the real reason internal agents don't ship. - Why Anthropic is so bullish on using file systems and skills within Claude, including Angela's argument that those early design choices can compound for years. This is a must-watch for anyone trying to take an agent past the demo and into production. Watch below!
Timestamps: How the Claude platform evolved from API to agents: 00:01:48 The primitives that make up Claude Managed Agents: 00:04:09 Why the harness and the model are becoming a single unit: 00:10:37 The infrastructure wall that kills most agent projects in production: 00:18:49 Why team agents need a different shape than individual productivity tools: 00:24:49 How Anthropic's legal team uses an agent to review marketing copy: 00:26:36 Using multi-agent orchestration for advisor strategies, adversarial pairs, and swarms: 00:34:24 How to measure agent success with outcome and budget as the end state: 00:35:50 What the platform looks like a year from now, when Claude writes its own harness: 00:39:11

Dan Shipper 📧

66,339 views • 1 month ago

Why AI Progress Suddenly Feels Real - my conversation with Yann Dubois, who co-leads the Post-Training Frontiers team at OpenAI 00:00 - Intro 01:30 - Why recent AI progress feels like a step function 04:13 - Model reliability & the emotional rollercoaster of shipping GPT-5.5 07:33 - How OpenAI structures vertical and horizontal teams 09:49 - Improving model efficiency and test-time compute 12:32 - Yann's journey from Switzerland to OpenAI 15:37 - Reasoning in 2026: Real-world utility vs verifiable rewards 18:34 - GPT-5.5 Thinking vs Pro: Scaling test-time compute 20:09 - How reasoning models become more efficient 23:23 - Pre-training scaling and overcoming the data wall 27:03 - Multimodal data, synthetic data, and embodied AI 31:05 - Demystifying mid-training and post-training 37:21 - Does RL create new capabilities in AI? 38:53 - The challenges and frontier of scaling RL 43:09 - Is building AI models a craft or a strict science 48:21 - How AI models generalize across different domains 54:18 - How reinforcement learning cures AI hallucinations 56:04 - Negative generalization and conflicting instructions 58:05 - Can RL scale to law, medicine, and the broader economy? 1:00:19 - The evaluation bottleneck and Model as a Judge 1:04:21 - Continuous AI progress & continual learning 1:08:49 - Will foundation models eat the agent harness 1:11:23 - Why startups should focus on the last mile of AI

Matt Turck

99,933 views • 1 month ago