
Epoch AI
@EpochAIResearch • 47,193 subscribers
Investigating the trajectory of AI for the benefit of society.
Videos

What are the largest software engineering tasks AI can perform? To answer this, we built MirrorCode, our long-horizon SWE benchmark that lets AI code autonomously for days at a time. The best models complete some tasks we estimate would take human engineers several weeks.
Epoch AI • 40,224 views • 21 hours ago

How does math research change when the cost of trying your first dumb idea goes to zero? Daniel Litt joins Greg Burnham and Anson Ho to discuss what today’s models can and can’t do in math, and how far they are from doing high-quality research.
0:00:00 What's the hardest math problem AI can solve today?
00:16:08 How helpful are today’s AI models for math research?
00:23:36 Junk papers, LLM-generated proofs, and the refereeing crisis
00:27:21 AI enables searching through problems at scale
00:33:49 When will AI be good enough to publish in top math journals?
00:42:15 What are the returns to intelligence?
00:59:50 Will AI solve Millennium problems?
01:11:54 Is math full of low-hanging fruit?
01:18:47 How Daniel has adapted his professional life to AI progress
01:25:28 What do AI math benchmarks actually measure?
01:33:05 Designing the Open Problems benchmark
01:56:35 Do mathematicians believe heuristic arguments about conjectures?
02:01:24 What if FrontierMath: Open Problems gets solved?
02:06:53 Is AI on the cusp of accelerating math progress?
Epoch AI • 178,703 views • 4 months ago

What are current economic models missing about AGI? How would we know if we were approaching explosive growth? Stanford economist Phil Trammell has been rigorously thinking about the intersection of economic theory and AI (incl. AGI) for over five years, long before the recent surge of interest in large language models. In this episode of Epoch After Hours, Philip Trammell and Epoch AI researcher Anson Ho discuss what economic theory really has to say about the development and impacts of AGI: what current economic models get wrong, the odds of explosive economic growth, what “real GDP” actually measures, and much more!
-- Timestamps --
0:00:00 - Problems with existing work on the economics of AI
0:10:18 - Declining returns to R&D
0:18:28 - What real GDP misses
0:26:57 - Task-based models & AI automation
0:49:32 - The limits of economic theory
1:09:11 - How to detect an economic singularity
1:23:32 - Increasing returns to scale
Epoch AI • 103,973 views • 8 months ago

Even within our own research team, timelines for transformative AI differ substantially. In this episode, the two Epoch AI researchers with the longest and the shortest timelines for transformative AI candidly examine the roots of their disagreements.
They discuss:
• How and why their timelines for specific milestones differ
• Current technical challenges for AI progress
• Why widespread automation beats geniuses in datacenters
• Pitfalls of conventional AI prediction approaches
• Critiques of "single AGI" and "utopia vs. doom" narratives
• What a world with AGI might look like
And much, much more.
(0:00:00) - Preview
(0:01:08) - Contrasting AGI Timelines
(0:08:30) - Updating Beliefs as Capabilities Advance
(0:17:07) - Moravec’s Paradox and the Agency Challenge
(0:32:40) - Missing Capabilities for AGI
(0:47:43) - Beating benchmarks vs Being Useful
(0:59:20) - AI Excelling in Some Tasks While Struggling with Others
(1:07:33) - Economic Impact of AI vs the Internet
(1:24:08) - Widespread Automation Beats Genius in Datacenters
(1:51:37) - How Stories Shape Our Expectations of AI
(2:03:24) - How AGI Will Impact Culture
(2:10:46) - Beyond Utopia-or-Extinction
(2:16:57) - AI's Impact on Wages and Labor
(2:27:49) - Why Better Preservation of Information Accelerates Change
(2:39:32) - Markets Shaping Cultural Priorities
(2:55:51) - Challenges in Defining What We Want to Preserve
(3:06:47) - Risk Attitudes in AI Decision-Making
(3:12:50) - Historical Lessons for AI Coexistence
(3:21:45) - A Warning Sign in Safety Discourse
(3:30:20) - Revisiting Core Assumptions in AI Alignment
(3:49:46) - Simple Models in Complex Domains
Epoch AI • 159,795 views • 1 year ago

We've got a podcast! In our first episode, Ege, Tamay and Jaime dig into:
• What they expect AI to look like by 2030
• Why economists are underestimating the likelihood of explosive growth
• The startling regularity in technological trends like Moore's Law
• Moravec’s paradox, and how we might overcome it
And more!
Timestamps:
00:00:00 Preview
00:00:37 What is Epoch AI?
00:02:32 Scaling Laws
00:08:43 Key Drivers
00:19:20 End of the Decade Predictions
00:21:18 Bottlenecks: Power
00:27:59 Bottlenecks: GPUs
00:32:07 Bottlenecks: Data
00:45:37 Bottlenecks: Latency
00:56:18 Bottlenecks: Failure Rates
01:03:55 AI Investment
01:07:11 Automation
01:12:10 Benchmarks & Moravec’s Paradox
01:19:45 Economic Impact
01:45:48 Open Questions & Takeaways
Epoch AI • 152,591 views • 1 year ago

Are AI benchmarks doomed? Greg Burnham and Tom Adamczewski join Anson Ho to push back on benchmark pessimism and dig into what the next generation of AI benchmarks could look like.
(0:00:00) - Preview
(0:00:36) - Intro: Are AI benchmarks doomed?
(0:03:13) - The costs and benefits of benchmark development
(0:11:48) - MirrorCode and scalable benchmarks
(0:20:57) - AI speed-up in benchmark development
(0:23:28) - The benchmark-reality gap
(0:38:26) - Can an AGI benchmark exist?
(0:43:18) - Beyond automated scoring
(1:00:45) - How AI changes benchmark building in practice
Epoch AI • 22,037 views • 1 month ago

Will AI drive Europe towards a “high interest rates, no growth” future? How will automation impact entry-level workers and the structure of firms? What role does Europe play as the US and China continue to push AI progress? On this episode of Epoch After Hours, Luis Garicano 🇪🇺🇺🇦 joins Epoch AI researchers Anson Ho and Andrei Potlogea to discuss these questions.
-- Timestamps --
0:00:00 – Will AI trigger explosive growth?
0:06:26 – Short-run macroeconomic effects
0:11:29 – The decline of junior jobs
0:20:21 – The missing training ladder
0:39:31 – Europe’s AI regulation problem
0:52:46 – Who captures AI value?
01:08:17 – AI, interest rates & fiscal future
Epoch AI • 33,102 views • 6 months ago

New Epoch podcast episode: How far can current AI trends continue? Jaime Sevilla and Yafah Edelman on where current AI trends carry us, and where they break. They disagree on mechanisms and outcomes, but agree on this: fast diffusion now, broad cognitive automation by ~2035, and extreme uncertainty after.
(0:00:00) - Preview
(0:00:41) - Intro: Does 5× compute scaling continue?
(0:08:15) - Largest training run in 2030 & what does it imply?
(0:12:44) - Impact on Software Engineering & other cognitive tasks
(0:23:27) - Economic impacts near the end of the decade
(0:31:34) - 2030 bifurcation: Slow down or take off?
(0:35:49) - Physical vs cognitive automation
(0:44:37) - Timelines and impact of full cognitive automation
(1:02:37) - Returns to intelligence
(1:08:51) - Three cruxes after 2035 (Robots, technology & intelligence)
(1:16:28) - What happens in 2040?
(1:23:16) - Recap: Three eras of forecasting
(1:37:42) - Closing remarks
Epoch AI • 34,979 views • 9 months ago

Are most economists wrong about AI? Most tech doesn't radically accelerate growth, and most economists think AI won't either. But this misses a key point: AI can substitute for humans across *all* tasks. Without human bottlenecks, the economy could grow enormously faster.
Epoch AI • 42,031 views • 1 year ago

When mathematicians make breakthroughs, they hallucinate too. They reach beyond established results. But unlike AI, they’ve learned to tell a promising hallucination from a dead end. Number theorist Ken Ono on AI, creativity, and mathematical discovery.
-- Timestamps --
00:00:00 – Why predicting problem difficulty is so hard
00:00:27 – How mathematicians and AI both “hallucinate”
00:02:15 – AI as a copilot for mathematical discovery
00:04:00 – How AI helped reveal new formulas for primes
00:09:02 – The promise and peril of AI
00:15:22 – What makes a great mathematical question
00:17:54 – Teaching by making deliberate mistakes
00:21:22 – Will AI reshape the world like the Industrial Revolution?
Epoch AI • 18,095 views • 8 months ago

A month ago, we invited 30 of the world’s top mathematicians to Berkeley for a weekend to finish a very hard math exam. The 2025 FrontierMath Symposium wrapped up the hardest tier of FrontierMath, our benchmark for AI’s math abilities. The mathematicians tested AI models on their most challenging math problems, and discussed AI’s future in math. Watch our closing ceremony footage for a glimpse behind the scenes.
-Timestamps-
(00:00) - Intro by Elliot Glazer - Epoch AI
(03:38) - Topology | Sergei Gukov - Caltech
(06:53) - Algebraic Geometry | Ravi Vakil - Stanford
(09:10) - Number Theory | Ken Ono - University of Virginia
(13:19) - Combinatorics | Igor Pak - UCLA
(16:54) - Analysis | Paata Ivanisvili - UC Irvine
(18:18) - Closing Remarks
Epoch AI • 21,379 views • 1 year ago

Stanford mathematician Ravi Vakil, president of the American Mathematical Society, expects AI’s impact on mathematics to come as a phase change, not a slow climb. Every major shift in math has caught experts off guard, he says. This one will be no different, except that all our predictions will be even more wrong.
-- Timestamps --
00:00:00 – Playing games with imperfect information against AI
00:02:35 – When AI will learn to be truly creative
00:03:48 – AI’s impact will be even more unpredictable than the internet
00:08:02 – What an “AlphaGo moment” would look like for math
00:10:35 – How AI will actually be useful in mathematical research
00:12:20 – Writing “wow”-level math problems for AI
00:15:06 – On a 0-10 scale, AI will change math 8 + 3i
00:16:17 – Is math the next chess?
Epoch AI • 12,920 views • 8 months ago