Sonya Huang 🐥

@sonyatweetybird • 25,240 subscribers

funding big computer @sequoia

Videos

Two of the people most responsible for scaling the transformer are now betting on a next act. Jerry Tworek ran the Reasoning 🍓 team at OpenAI. rohan anil was a pre-training lead on Gemini after years at Google Brain and Anthropic. They just started Core Automation to find what comes next.

Their core argument: (1) models are trained in the lab but deployed in the real world and can't keep learning once they leave; (2) AI research is done by humans today but models will be able to explore and uncover new advances more rapidly and systematically (controversial but timely w this week's petition).

The conversation covers:
- why Jerry expected AGI in 2025 and what changed his mind
- the two kinds of learning from experience, and why RL only captures one
- the computational depth problem baked into today's architectures
- why the biggest labs can't afford to look for a transformer replacement
- the kernel competition where humans + $100K of coding agents found a 60x speedup no frontier model comes close to
- a definition of AGI you can actually test: a model that improves itself with no human in the loop

00:00 Introduction
01:46 Appreciating Transformers
02:44 Scaling Hits Limits
04:54 Why Architecture Matters
05:32 RL Reality Check
07:32 Test Time Learning
09:52 Economics Of Scaling
12:47 Why Start A Company
14:24 Rohan On Transformers
19:11 Computational Depth Problem
20:32 When Transformers Top Out
23:22 Beyond Reinforcement Learning
26:41 Optimization And Efficiency
34:24 Building An Automated Lab
39:45 Kernel Automation Roadmap

Sonya Huang 🐥

119,652 views • 17 hours ago

Today's Training Data episode takes us BTS on the infrastructure challenges required to do large RL runs at scale, featuring Federico Cassano (Composer Lead at Cursor) and Dmytro Dzhulgakov (Co-Founder at Fireworks).

The Cursor team trained Composer 2 on Fireworks by starting with a strong base model (Kimi 2.5) and performing large-scale mid-training on code tokens and web data to learn common patterns and libraries, followed by a large-scale Reinforcement Learning run to learn how to navigate the Cursor harness, call tools, and write correct code.

Today's episode dives into the systems and infrastructure challenges of making that large RL run happen, and there were many (!!), from numerical mismatch to global distribution to synchronizing rollouts across asynchronous pipelines to keeping track of expert activation across runs and more. Extremely nerdy in-the-weeds challenges that Federico and Dima were delighted to nerd out on together :)

Beyond RL infra, we also discussed Online vs Simulated rollouts, self-summarization for long-horizon agents, environment design ("the most powerful RL environment is the product itself"), and other technical nuggets.

PS: We filmed this episode before the SpaceX news, while the Cursor team was still compute-constrained. While Cursor now has *all* the flops, the takeaways and hurdles crossed ring true for any serious application-level company that is racing to post-train their own models. I believe that more serious application companies will go the way of Cursor and post-train their own models.

00:00 Introduction
00:53 Why Cursor Trained Composer 2
04:55 Specialization vs Bitter Lesson
06:16 Composer 2 Training Recipe
16:32 Scaling RL Infrastructure Globally
23:32 Floating Point Drift
25:11 MoE Sensitivity Explained
26:25 Router Replay Fix
27:19 Real Time RL Loop
31:49 Long Horizon Agents
34:29 Why RL Everywhere
37:34 LLM as Judge Rewards
39:14 RL in Hard Domains
40:13 Build Your Own Environments
44:34 Closing Thoughts

Sonya Huang 🐥

79,834 views • 2 months ago

The advanced civilizations of sci-fi legend (Banks, Asimov, etc) have some form of simulation to guide society. Joon Sung Park is taking a crack at building that simulator with Simile. As Joon's cofounder Percy Liang puts it: great science starts with a great measurement. From Smallville in 2023 to today, Simile is trying to build the Hubble Telescope equivalent for simulating human behavior.

00:00 Introduction
01:49 Building Generative Agents
02:29 Valentines Day Emergence
03:33 From GPT 3 To Agents
05:03 Social Computing Problem
06:19 Social Simulacra Subreddits
07:57 Models Getting Good Enough
08:57 Humans Are Not Rational
10:04 Turning Research Into Simuli
11:55 Validation And Accuracy Proof
12:43 Customer Workflow CVS Example
16:11 Why Collect Real Data
17:51 Behavioral Signals And RCTs
21:52 Use Cases And Second Order Effects
26:31 Evaluating Convergence Divergence
31:58 Big Societal Simulations Ahead
36:08 Future Of Simulation

Sonya Huang 🐥

43,836 views • 1 month ago

"Member of the technical staff" is the hottest job title in SF right now. What's behind the name? OpenAI chose this title deliberately to blow up the previous industry dichotomy between researchers and engineers. The best researchers in AI right now aren't academics in a pure lab environment; they get their hands dirty technically writing code and digging into the data and implementations e.g. Alec Radford Bob McGrew, former Chief Research Officer at OpenAI and one of U.S. Army's newest recruits to Detachment 201, joined us on Training Data to share more about the secret sauce behind leading OpenAI's research org, the three legs of the stool to AGI, and why he thinks we're already there.

"Member of the technical staff" is the hottest job title in SF right now. What's behind the name? OpenAI chose this title deliberately to blow up the previous industry dichotomy between researchers and engineers. The best researchers in AI right now aren't academics in a pure lab environment; they get their hands dirty technically writing code and digging into the data and implementations e.g. Alec Radford Bob McGrew, former Chief Research Officer at OpenAI and one of U.S. Army's newest recruits to Detachment 201, joined us on Training Data to share more about the secret sauce behind leading OpenAI's research org, the three legs of the stool to AGI, and why he thinks we're already there.

Sonya Huang 🐥

358,508 views • 1 year ago

Don't sleep on Google DeepMind in AI... This week on Training Data, Google Labs VP Josh Woodward gave us the BTS on Google's imagination playground for AI, from Notebook to Mariner (computer use agent) to Veo (video models). Thanks Josh for the spicy convo and hot takes :)

Sonya Huang 🐥

183,357 views • 1 year ago

Claude Code and Suno have more in common than you might think: "It's fun to build things, and it's fun to use what you build." AI lets people be creative in almost any domain, from coding to making music. Today on Training Data, Mikey shares his thesis for why generative AI is the newest form of active entertainment (the next 'gaming'), music as a cultural phenomenon vs creative expression platform, and more. My favorite part was Mikey's explanation of why Suno learns music theory implicitly vs explicitly: "In Western music, there are 12 tones. If you tell the model there are 12 tones, it will only ever produce those 12 tones. You will be forever limited. And if you tell the model there's 200 instruments, those are the only sounds that you'll ever be able to make." The more you constrain a model with what humans already know, the less capable it becomes. By treating everything as pure sound, Suno built what Mikey calls a totally generalized "music-making machine." Such is the power of neural nets.

Sonya Huang 🐥

22,803 views • 2 months ago

Baumol cost disease is real, AI is the solution.

Take law -- imagine a world where consumers have plentiful access to high-quality legal services.

Crosby is building an AI-native law firm towards that vision. @Ryanjdaniels and John Sarihan share more on Training Data!

Sonya Huang 🐥

69,577 views • 11 months ago

Best technology M&A of all time has to be NVIDIA's $6.9 Billion acquisition of Mellanox in 2020. It was a special treat to interview Michael Kagan, CTO of NVIDIA and co-founder/CTO of Mellanox, for today's Training Data episode. Michael has been driving forward the Compute Frontier for more than 40 years now, first as Chief Architect at Intel in the 90s, then Co-Founder and CTO at Mellanox, and for the last 5 years CTO at NVIDIA. There's nobody better positioned than Michael to share the complete history of the compute frontier and what's ahead, from decades pushing forward Moore's law (squeezing more transistors on a chip) to the last decade of work scaling beyond single chip physics limitations (scaling out to 100K+ GPU clusters). Interconnect is the secret sauce enabling compute to scale beyond chip-level Moore's law. Connecting a fabric of 100K+ GPUs to function as a single unit of compute is the enabling technology for today's intelligence explosion. But other things break at the 100K+ GPU cluster scale: individual chips inevitably fail, power and networking become more complex, etc etc. Net effect of scale out: we've inflected the silicon frontier from Moore's Law (2x every 2 years) to Huang's law (~10x a year). Very excited about today's episode! Learned so much from Michael with Pat Grady

Sonya Huang 🐥

56,175 views • 9 months ago

Some of the most iconic consumer products -- Tide Pods, M&Ms, etc -- were born out of user research studies. LLMs democratize deep user research to every decision. Synthetic audiences will go even further. Alfred Wahlforss of Listen Labs shares more on Training Data. cc Konstantine Buhler

Sonya Huang 🐥

16,289 views • 1 month ago

Platform shifts open the door for founders to build something new and legendary, and we're seeing a Cambrian explosion in applications. Thank you Bloomberg for having me on to talk about what we're seeing in the AI market at Sequoia Capital!

Sonya Huang 🐥

136,495 views • 3 years ago

Today on Training Data: Sanjit Biswas, founder & CEO of Samsara (NYSE:IOT) and former Sequoia Capital backed founder of Cisco Meraki.

Sanjit shares the ups & downs of running neural nets on constrained compute and power footprints in the real world, ~2-10 watts.

Physical AI is hard 🫡

Sonya Huang 🐥

35,714 views • 7 months ago

Today on Training Data, the OpenAI team behind ChatGPT agent explain how Agent Mode works, combining:
1) Deep Research (text based research agent)
2) Operator (GUI/action based computer agent)
3) Other new tools (terminal, computer apps)
4) Tied together with shared state to create an agent that's highly capable at most tasks that humans do on a computer: data science analysis, analyzing spreadsheets, making slides, etc.

Thanks for joining us Isa Fulford, Casey Chu, Zhiqing Sun, Lauren Reeder!

Sonya Huang 🐥

44,149 views • 1 year ago

Our most exciting episode of Training Data yet 🍓🍰 OpenAI’s o1 represents a major leap forward by giving models time to "think." Inference-time compute is the next big research frontier. Thrilled to have Noam Brown, ilge, and hunter on the show Pat Grady Sequoia Capital

Sonya Huang 🐥

51,628 views • 1 year ago

From airplane wings to Dyson spheres, Paul Eremenko explains why physical engineering AI has lagged behind coding AI, and how @p_1_ai's approach to synthetic training data could change everything.

Sonya Huang 🐥

27,686 views • 1 year ago

Today's Training Data episode features our newest investment 🤗 the fal guys Gorkem Yurtseven, Burkay Gur, and batuhan the fal guy on where generative media is headed~ When computer animation first arrived, the film industry was skeptical at best and downright hostile at worst. But technology doesn’t stop, and now some of the highest grossing films and most celebrated artistic achievements in film are CGI. In this episode we discuss how Hollywood’s stance toward AI has shifted fast, the rise of AI-native studios, why IP holders aren’t sitting still, and so much more. Excited to dig into the ai video compute supercycle with the fal guys.

Sonya Huang 🐥

17,181 views • 7 months ago

Can we map the mind of an LLM? Our first mechanistic interpretability episode on Training Data featuring Goodfire founder Eric Ho (and our first cameo from Roelof Botha!).

Goodfire is building an independent mech interp lab, led by some heavyweight researchers from the field (e.g. Lee Sharkey who has led a lot of important work in sparse autoencoders to "unscramble" LLMs and resolve superposition, Nick who has been a key pioneer behind auto interpretability).

On this episode, Eric gives us a flyover of the technical results so far from this nascent field (universality, superposition), what's ahead in the research (going from circuits to weights, going from understanding to increasingly surgical editing), a preview of the real-world work they're doing already with Arc Institute, and the impact he expects Goodfire and the broader field to have on steering, safety, editing and more.

Sonya Huang 🐥

19,379 views • 1 year ago

This week on Training Data: robots! I really admire @Physical_int for their open publishing spirit. Karol Hausman and Tobias Springenberg joined me and Alfred Lin to chat about pi*0.6, learning from experience, long-horizon robot performance, and more.

Sonya Huang 🐥

12,093 views • 6 months ago

What if 8 hours of research could be done in 5 minutes? OpenAI's Deep Research is an agent trained end-to-end w/ RL fine-tuning on my favorite task: internet sleuthing👩‍💻 Isa Fulford & Josh Tobin joined us to share what's under the hood + the future of OpenAI's agent roadmap.

Sonya Huang 🐥

20,212 views • 1 year ago