
Semi Doped
@semidoped • 2,677 subscribers
Substack: https://t.co/QbBWKXz7aE podcast: https://t.co/ZbC7r4fkZp YouTube: https://t.co/3o2BY2nYWW
Videos

A masterclass on Google's TPU v8 Networking.
Two TPU chips? Pssh. We already knew workload-specific silicon was here. But two scale-up networking topologies? That's the actual Google TPU news. Workload-specific interconnects. Think about that.
New Semi Doped with Vikram Sekar and Austin Lyons. Copper? Yep. Optics? Yep.
What we cover:
- TPU splits in two: 8t training, 8i inference.
- Virgo: 47 Pb/s scale-out fabric, 100% OCS.
- Boardfly scale-up: copper PCB + AECs inside racks, OCS between groups. 16 hops → 7.
- Training uses 3D torus (Rubik's Cube).
- Inference doesn't. Workload-specific topologies now.
- Dragonfly traces to a 2008 paper by Kim, Dally, Scott, Abts. Abts went on to build Groq's interconnect before Nvidia.
Chapters:
0:00 Intro
0:21 Two TPUs for two workloads
2:31 HBM, SRAM, and Axion CPUs
7:22 Why networking is the new bottleneck
17:14 Virgo: rebuilding scale-out on optics
25:24 3D torus Rubik's Cube scale-up for training
34:50 Boardfly: scale-up for MoE inference
42:07 Workload-specific everything
$GOOGL
Semi Doped • 92,064 views • 1 month ago

Whiplash week. Optics? Copper? Both?
Mon AM: Nvidia bets $4B on optics.
Mon PM: Credo posts 200% YoY growth on copper.
Wed PM: Hock Tan claims 400G/lane works over copper, potentially pushing CPO past 2030.
48 hours of whiplash. Optics? Copper? The answer is both. The question is when.
This week, Austin Lyons and Vikram Sekar unpack:
- Nvidia locking up laser supply
- Credo’s blowout quarter and the reliability thesis
- Broadcom’s copper bombshell
- A 4D chess theory on why Hock Tan downplays optics when Broadcom is a CPO company
Chapters
(00:00) - Newsletter Plugs: Groq LPUs & Broadcom’s Laser Business
(03:15) - Dynamo & the Rise of Workload-Specific Hardware
(08:04) - Austin’s Broadcom Laser Deep Dive
(09:53) - The Week’s Whiplash: Optics Monday, Copper Wednesday
(17:50) - Why Nvidia Invested $4B: Geopolitics, Supply & the HBM Playbook
(24:15) - CPO Lasers & Optical Circuit Switches
(26:16) - Credo Earnings: 200% YoY Growth & the Copper Bull Case
(31:09) - Reliability, AECs & Oracle’s GPU Cluster Problem
(35:48) - Credo’s Optics Play: Micro-LED Active Cables & the CPO Timing Risk
(38:45) - Broadcom Earnings: Hock Tan’s Copper Bombshell
(43:34) - Customer-Owned Tooling: Hock Tan Says “Good Luck”
(44:25) - Vik’s 4D Chess Theory: Why Hock Tan Talks Up Copper
(47:03) - Wrap-Up: It’s Both. The Real Question Is Timing
$AVGO $CRDO $NVDA
Semi Doped • 33,695 views • 3 months ago

New interview: Reiner Pope, co-founder/CEO of MatX
A counterintuitive throughput insight: “Low latency means small batch sizes. That is just Little’s law. Memory occupancy in HBM is proportional to batch size. So you can actually fit longer contexts than you could if the latency were larger. Low latency is not just a usability win, it improves throughput.”
We get into:
• The hybrid SRAM + HBM bet, and why pipeline parallelism finally works
• Why sparse MoE drives MatX to “the most interconnect of any announced product”
• Why frontier labs are willing to bet on an AI ASIC startup
• Memory-bandwidth-efficient attention, numerics, and what MatX publishes (and what it does not)
• Why 95% of model-side news is noise for chip design
• The biggest challenges ahead
Chapters
00:00 “We left Google one week before ChatGPT”
00:24 Intro: who is MatX
01:17 Origin story: leaving Google for LLM chips
02:21 GPT-3 and the “too expensive” problem
04:25 Why buy hardware that is not a GPU
05:52 Overcoming the CUDA moat
08:46 Early investors
09:35 The name MatX
09:59 The chip: matrix multiply + hybrid SRAM/HBM
12:11 Why pipeline parallelism finally works
14:22 Reading papers and Google going dark
15:20 Research agenda: attention and numerics
17:06 Five specs and meeting customers where they are
19:24 Why frontier labs are the natural first customer
20:32 Workloads: training, prefill, decode
22:18 Little’s law and the throughput case for low latency
24:29 Interconnect and MoE topology
26:35 Inside the team: 100 people, full stack
28:32 Agentic AI: 95% noise for hardware
30:35 KV cache sizing in an agentic world
32:11 How MatX uses AI for chip design (Verilog + BlueSpec)
34:23 Go to market: proving credibility under NDA
35:12 Porting effort for frontier labs
36:34 Biggest skepticism: manufacturing at gigawatt scale
37:32 Hiring plug
Vikram Sekar
Semi Doped • 19,439 views • 1 month ago
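
The Little's law argument quoted in the MatX description can be sketched with a few lines of arithmetic. This is only an illustration under stated assumptions, not MatX's model: Little's law says the average number of requests in flight equals throughput times latency, so at a fixed request rate a lower latency means fewer concurrent sequences sharing HBM, which leaves more KV-cache room per sequence. The function names and every number below are hypothetical, and the KV cache is assumed to be split evenly across the batch.

```python
def in_flight_requests(throughput_rps: float, latency_s: float) -> float:
    """Little's law: average requests in flight = arrival rate * time in system."""
    return throughput_rps * latency_s


def max_context_tokens(kv_budget_bytes: float,
                       kv_bytes_per_token: float,
                       batch_size: float) -> float:
    """Longest context per sequence if the KV-cache budget is split evenly."""
    return kv_budget_bytes / (kv_bytes_per_token * batch_size)


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
THROUGHPUT_RPS = 100.0        # requests completed per second (held fixed)
KV_BUDGET_BYTES = 64e9        # HBM set aside for KV cache
KV_BYTES_PER_TOKEN = 100e3    # KV-cache footprint of one token

for latency_s in (2.0, 0.5):
    batch = in_flight_requests(THROUGHPUT_RPS, latency_s)
    context = max_context_tokens(KV_BUDGET_BYTES, KV_BYTES_PER_TOKEN, batch)
    print(f"latency={latency_s}s  batch={batch:.0f}  max_context={context:.0f} tokens")
```

With throughput held constant, cutting latency by 4x cuts the in-flight batch by 4x, so each sequence can hold a context roughly 4x longer in the same HBM budget.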

The optical networking supercycle is here!
In this podcast, Austin Lyons and Vikram Sekar go through all the tech jargon and explain what everything means. In just about 45 minutes, you will know everything required to keep up with the next revolution in AI.
Chapters
00:00 Introduction to AI and CPU Bottlenecks
03:00 The Rise of Silicon Photonics
06:01 Understanding Optical Networking and Data Centers
08:49 Scale Across: Connecting Data Centers
11:56 Scale Out: Optimizing Data Center Connectivity
14:53 Scale Up: The Future of GPU Connectivity
23:32 The Shift from Copper to Optical Connections
26:13 Challenges and Reliability of Lasers
30:47 Understanding Co-Packaged Optics
34:17 Market Dynamics: Demand and Supply of Lasers
40:46 Emerging Technologies: Optical Circuit Switches
Semi Doped • 21,842 views • 3 months ago

This week we bring Fintwit’s favorite game to the podcast: “Would you buy this optical company?”
The game goes like this. First explain:
- What the company does
- How it relates to the optics industry
- What its strength/moat is
- What the risk/downside is
Then ask: Would you buy it? Why/Why not?
Not financial advice. Just a silly, fun, educational game. Be serious when investing money. Pod bros don’t cut it. DYDD.
Chapters
(00:01) - Intro
(06:59) - AXT $AXTI
(13:38) - Tower Semiconductor $TSEM
(23:58) - GlobalFoundries $GFS
(32:43) - Lumentum $LITE
(39:38) - Coherent $COHR
(47:09) - Fabrinet $FN
(54:07) - Corning $GLW
Vikram Sekar
Semi Doped • 13,046 views • 3 months ago

Context memory essentially unlocks agentic AI. Much needed for Opus 4.6's "multi-agent swarms".
In this Semi Doped pod, Vikram Sekar talks to Val Bercovici from Weka about context storage.
- How token warehouses save inference costs
- A new networking tier? Context Storage Network!
- High Bandwidth Flash for context?
- Weka's Augmented Memory Grid for context storage
- Where this is all headed
The convo is info-packed. Don't miss out on it!
Chapters
(00:00) Introduction to Weka and AI Storage Solutions
(05:18) The Evolution of Context Memory in AI
(09:30) Understanding Memory Hierarchies and Their Impact
(16:24) Latency Challenges in Modern Storage Solutions
(21:32) The Role of Networking in AI Storage Efficiency
(29:42) Dynamic Resource Utilization in AI Networks
(30:04) Introducing the Context Memory Network
(31:13) High Bandwidth Flash: A Game Changer
(32:54) Weka’s Neural Mesh and Storage Solutions
(35:01) Axon: Transforming GPU Storage into Memory
(39:00) Augmented Memory Grid Explained
(42:00) Pooling DRAM and CXL Innovations
(46:02) Token Warehouses and Inference Economics
(52:10) The Future of Storage Innovations
Semi Doped Podcast • 12,492 views • 4 months ago