
Epoch AI

@EpochAIResearch • 47,193 subscribers

Investigating the trajectory of AI for the benefit of society.

Videos

What are the largest software engineering tasks AI can perform? To answer this, we built MirrorCode, our long-horizon SWE benchmark that lets AI code autonomously for days at a time. The best models complete some tasks we estimate would take human engineers several weeks.

40,224 views • 21 hours ago

1/ Can AI scaling continue through 2030? We examine whether constraints on power, chip manufacturing, training data, or data center latencies might hinder AI growth. Our analysis suggests that AI scaling can likely continue its current trend through 2030.

1,507,992 views • 1 year ago

How does math research change when the cost of trying your first dumb idea goes to zero? Daniel Litt joins Greg Burnham and Anson Ho to discuss what today’s models can and can’t do in math, and how far they are from doing high-quality research.
0:00:00 What's the hardest math problem AI can solve today?
00:16:08 How helpful are today’s AI models for math research?
00:23:36 Junk papers, LLM-generated proofs, and the refereeing crisis
00:27:21 AI enables searching through problems at scale
00:33:49 When will AI be good enough to publish in top math journals?
00:42:15 What are the returns to intelligence?
00:59:50 Will AI solve Millennium problems?
01:11:54 Is math full of low-hanging fruit?
01:18:47 How Daniel has adapted his professional life to AI progress
01:25:28 What do AI math benchmarks actually measure?
01:33:05 Designing the Open Problems benchmark
01:56:35 Do mathematicians believe heuristic arguments about conjectures?
02:01:24 What if FrontierMath: Open Problems gets solved?
02:06:53 Is AI on the cusp of accelerating math progress?

178,703 views • 4 months ago

Mathematics offers a unique window into AI's reasoning capabilities. Discover why we've launched FrontierMath—a benchmark of hundreds of unpublished, expert-level math problems—to understand the frontier of artificial intelligence.

394,050 views • 1 year ago

What are current economic models missing about AGI? How would we know if we were approaching explosive growth? Stanford economist Phil Trammell has been rigorously thinking about the intersection of economic theory and AI (incl. AGI) for over five years, long before the recent surge of interest in large language models. In this episode of Epoch After Hours, Philip Trammell and Epoch AI researcher Anson Ho discuss what economic theory really has to say about the development and impacts of AGI: what current economic models get wrong, the odds of explosive economic growth, what “real GDP” actually measures, and much more!
-- Timestamps --
0:00:00 - Problems with existing work on the economics of AI
0:10:18 - Declining returns to R&D
0:18:28 - What real GDP misses
0:26:57 - Task-based models & AI automation
0:49:32 - The limits of economic theory
1:09:11 - How to detect an economic singularity
1:23:32 - Increasing returns to scale

103,973 views • 8 months ago

Even within our own research team, timelines for transformative AI differ substantially. In this episode, the two Epoch AI researchers with the longest and the shortest timelines for transformative AI candidly examine the roots of their disagreements. They discuss:
• How and why their timelines for specific milestones differ
• Current technical challenges for AI progress
• Why widespread automation beats geniuses in datacenters
• Pitfalls of conventional AI prediction approaches
• Critiquing "single AGI" or "utopia vs. doom" narratives
• What a world with AGI might look like
And much, much more.
(0:00:00) - Preview
(0:01:08) - Contrasting AGI Timelines
(0:08:30) - Updating Beliefs as Capabilities Advance
(0:17:07) - Moravec’s Paradox and the Agency Challenge
(0:32:40) - Missing Capabilities for AGI
(0:47:43) - Beating benchmarks vs Being Useful
(0:59:20) - AI Excelling in Some Tasks While Struggling with Others
(1:07:33) - Economic Impact of AI vs the Internet
(1:24:08) - Widespread Automation Beats Genius in Datacenters
(1:51:37) - How Stories Shape Our Expectations of AI
(2:03:24) - How AGI Will Impact Culture
(2:10:46) - Beyond Utopia-or-Extinction
(2:16:57) - AI's Impact on Wages and Labor
(2:27:49) - Why Better Preservation of Information Accelerates Change
(2:39:32) - Markets Shaping Cultural Priorities
(2:55:51) - Challenges in Defining What We Want to Preserve
(3:06:47) - Risk Attitudes in AI Decision-Making
(3:12:50) - Historical Lessons for AI Coexistence
(3:21:45) - A Warning Sign in Safety Discourse
(3:30:20) - Revisiting Core Assumptions in AI Alignment
(3:49:46) - Simple Models in Complex Domains

159,795 views • 1 year ago

We've got a podcast! In our first episode, Ege, Tamay and Jaime dig into:
• What they expect AI to look like by 2030
• Why economists are underestimating the likelihood of explosive growth
• The startling regularity in technological trends like Moore's Law
• Moravec’s paradox, and how we might overcome it
And much more!
Timestamps:
00:00:00 Preview
00:00:37 What is Epoch AI?
00:02:32 Scaling Laws
00:08:43 Key Drivers
00:19:20 End of the Decade Predictions
00:21:18 Bottlenecks: Power
00:27:59 Bottlenecks: GPUs
00:32:07 Bottlenecks: Data
00:45:37 Bottlenecks: Latency
00:56:18 Bottlenecks: Failure Rates
01:03:55 AI Investment
01:07:11 Automation
01:12:10 Benchmarks & Moravec’s Paradox
01:19:45 Economic Impact
01:45:48 Open Questions & Takeaways

152,591 views • 1 year ago

Are AI benchmarks doomed? Greg Burnham and Tom Adamczewski join Anson Ho to push back on benchmark pessimism and dig into what the next generation of AI benchmarks could look like.
(0:00:00) - Preview
(0:00:36) - Intro: Are AI benchmarks doomed?
(0:03:13) - The costs and benefits of benchmark development
(0:11:48) - MirrorCode and scalable benchmarks
(0:20:57) - AI speed-up in benchmark development
(0:23:28) - The benchmark-reality gap
(0:38:26) - Can an AGI benchmark exist?
(0:43:18) - Beyond automated scoring
(1:00:45) - How AI changes benchmark building in practice

22,037 views • 1 month ago

Will AI drive Europe towards a “high interest rates, no growth” future? How will automation impact entry-level workers and the structure of firms? What role does Europe play as the US and China continue to push AI progress? On this episode of Epoch After Hours, Luis Garicano 🇪🇺🇺🇦 joins Epoch AI researchers Anson Ho and Andrei Potlogea to discuss these questions.
-- Timestamps --
0:00:00 – Will AI trigger explosive growth?
0:06:26 – Short-run macroeconomic effects
0:11:29 – The decline of junior jobs
0:20:21 – The missing training ladder
0:39:31 – Europe’s AI regulation problem
0:52:46 – Who captures AI value?
01:08:17 – AI, interest rates & fiscal future

33,102 views • 6 months ago

New Epoch podcast episode: How far can current AI trends continue? Jaime Sevilla and Yafah Edelman on where current AI trends carry us, and where they break. They disagree on mechanisms and outcomes, but agree on this: fast diffusion now, broad cognitive automation by ~2035, and extreme uncertainty after.
(0:00:00) - Preview
(0:00:41) - Intro: Does 5× compute scaling continue?
(0:08:15) - Largest training run in 2030 & what does it imply?
(0:12:44) - Impact on Software Engineering & other cognitive tasks
(0:23:27) - Economic impacts near the end of the decade
(0:31:34) - 2030 bifurcation: Slow down or take off?
(0:35:49) - Physical vs cognitive automation
(0:44:37) - Timelines and impact of full cognitive automation
(1:02:37) - Returns to intelligence
(1:08:51) - Three cruxes after 2035 (Robots, technology & intelligence)
(1:16:28) - What happens in 2040?
(1:23:16) - Recap: Three eras of forecasting
(1:37:42) - Closing remarks

34,979 views • 9 months ago

Are most economists wrong about AI? Most tech doesn't radically accelerate growth, and most economists think AI won't either. But this misses a key point: AI can substitute for humans across *all* tasks. Without human bottlenecks, the economy could grow enormously faster.

42,031 views • 1 year ago

When mathematicians make breakthroughs, they hallucinate too. They reach beyond established results. But unlike AI, they’ve learned to tell a promising hallucination from a dead end. Number theorist Ken Ono on AI, creativity, and mathematical discovery.
-- Timestamps --
00:00:00 – Why predicting problem difficulty is so hard
00:00:27 – How mathematicians and AI both “hallucinate”
00:02:15 – AI as a copilot for mathematical discovery
00:04:00 – How AI helped reveal new formulas for primes
00:09:02 – The promise and peril of AI
00:15:22 – What makes a great mathematical question
00:17:54 – Teaching by making deliberate mistakes
00:21:22 – Will AI reshape the world like the Industrial Revolution?

18,095 views • 8 months ago

A month ago, we invited 30 of the world’s top mathematicians to Berkeley for a weekend to finish a very hard math exam. The 2025 FrontierMath Symposium wrapped up the hardest tier of FrontierMath, our benchmark for AI’s math abilities. The mathematicians tested AI models on their most challenging math problems, and discussed AI’s future in math. Watch our closing ceremony footage for a glimpse behind the scenes.
-Timestamps-
(00:00) - Intro by Elliot Glazer - Epoch AI
(03:38) - Topology | Sergei Gukov - Caltech
(06:53) - Algebraic Geometry | Ravi Vakil - Stanford
(09:10) - Number Theory | Ken Ono - University of Virginia
(13:19) - Combinatorics | Igor Pak - UCLA
(16:54) - Analysis | Paata Ivanisvili - UC Irvine
(18:18) - Closing Remarks

21,379 views • 1 year ago

Stanford mathematician Ravi Vakil, president of the American Mathematical Society, expects AI’s impact on mathematics to come as a phase change, not a slow climb. Every major shift in math has caught experts off guard, he says. This one will be no different, except that all our predictions will be even more wrong.
-- Timestamps --
00:00:00 – Playing games with imperfect information against AI
00:02:35 – When AI will learn to be truly creative
00:03:48 – AI’s impact will be even more unpredictable than the internet
00:08:02 – What an “AlphaGo moment” would look like for math
00:10:35 – How AI will actually be useful in mathematical research
00:12:20 – Writing “wow”-level math problems for AI
00:15:06 – On a 0-10 scale, AI will change math 8 + 3i
00:16:17 – Is math the next chess?

12,920 views • 8 months ago
