Ben Pouladian

@benitoz · 20,652 subscribers

The world isn’t short of oil. It’s short of tokens. Hardware, software, models. System-level AI infrastructure analysis. $NVDA since 2016. DMs open ↓

Shorts

Alphabet is raising 80B to buy compute “to meet unprecedented customer demand.” Berkshire put in 10B of it. Read that again. The most patient capital on earth just underwrote a GPU bet. Jensen says compute equals revenue. Buffett just priced the collateral. $GOOGL $BRK.A $NVDA

140,189 views

$NVDA down 8% on a massive beat. The market is rewarding unprofitable neoclouds and cyclical semi suppliers instead. Relative valuation: 0.90x SOX. The 10-year trough was March. Jensen: "You can't hold back performance." Everything gets sorted out. Credit to Stacy Rasgon for the inspo.

60,033 views

A year ago, I called Nvidia the literal bargaining chip in US-China trade. Today the thesis is no longer mine alone.

Anthropic just published "2028: Two scenarios for global AI leadership," arguing compute is the entire game. Close the smuggling loopholes, kill distillation attacks, lock in a 12-24 month US lead.

Same day, Reuters: US Commerce approved H200 sales to Alibaba, Tencent, and ByteDance, up to 75,000 chips each. Lenovo and Foxconn cleared as distributors. Jensen is in China this week trying to convert paper approvals into actual deliveries.

The math: China was 13% of Nvidia revenue ($17B in FY25). Jensen at GTC: "$50B opportunity in 2025 alone, growing 50% annually." To CNBC in October: "a couple hundred billion by the end of the decade."

That's the prize. But the prize is not just the revenue. It's the CUDA lock-in.

On Dwarkesh in April, Jensen said the moat is not the silicon. The moat is the install base. Every cloud. Every robot. Every developer trained on CUDA. If China spins up on Huawei Ascend, that is a parallel stack that compounds against Nvidia forever. Concede the second-largest compute market, and you concede the ecosystem.

This is why Jensen got visibly agitated when Dwarkesh pushed. "You are not talking to somebody who woke up a loser."

It is also why Beijing is slow-walking the H200 orders. They understand the same thing in reverse. Every CUDA developer is a Huawei customer they will never get back.

The bargaining chip is leverage in both directions. Anthropic's policy paper today is the US position. Jensen's posture is the corporate position. Beijing's go-slow is the Chinese position. All three agree on one thing: whoever owns the compute owns the future.

Memory Wars. Co-Design. The Reasoning Tax. All downstream of this. Compute is the unit of national power in the AI era. $NVDA
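
The revenue math in the post can be checked with a few lines of arithmetic. This is a minimal sketch, not the author's model: it takes the quoted figures (13%, $17B, $50B, 50% annual growth) at face value, and it assumes the growth compounds once per calendar year starting from 2025. The names `project` and `opportunity` are illustrative.

```python
# Figures quoted in the post; how they compound is an assumption of this sketch.
china_share = 0.13       # China as a share of Nvidia revenue (FY25)
china_revenue_b = 17.0   # China revenue in $B (FY25)

# Total FY25 revenue implied by the two figures above.
implied_total_b = china_revenue_b / china_share


def project(start_b: float, growth: float, years: int) -> float:
    """Compound start_b at a fixed annual growth rate for `years` years."""
    return start_b * (1 + growth) ** years


# "$50B opportunity in 2025 alone, growing 50% annually"
opportunity = {2025 + n: project(50.0, 0.50, n) for n in range(6)}

for year, value in opportunity.items():
    print(f"{year}: ${value:,.1f}B")
print(f"Implied FY25 total revenue: ${implied_total_b:,.1f}B")
```

Under those assumptions the projection passes $250B in 2029, which lines up with the "couple hundred billion by the end of the decade" remark.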

14,452 views

Videos

I read a lot of Peter Lynch. Met him once. The one rule I carry into tech investing is the most boring one he ever wrote: know what you own, down to the physics if the position demands it. For me that has meant living inside NVIDIA's stack for years and pulling apart the alternatives next to it: Trainium, the TPU, every serious accelerator someone is willing to tape out against Jensen.

I was also an early investor in Mellanox, the networking company NVIDIA bought to own the switched fabric the entire scale-up era now runs on. So when the conversation turns to networking as the real moat, this is not theory to me. It is a position I watched become the thesis. You do not understand what you own until you understand what could take it.

Gavin Baker at The Sohn Idea Contest just gave the most physically grounded read on AI infrastructure I have heard this cycle, and it is a Lynch lesson in disguise.

The reframe that matters: the last terrestrial mega data center may already be on someone's drawing board. Everything else follows from two constraints, watts and wafers, and Gavin walks both down to first principles. That is the work. Most people are pricing the narrative. Lynch would have asked what the thing actually is.

1. TSMC is the global rate limiter

Jensen reportedly visits every quarter asking to double or triple leading-edge capacity. TSMC expands at roughly 5 percent. A handful of disciplined operators in Taiwan are the physical governor on the entire AI buildout. This is the part the bubble crowd misses. The constraint is not demand and it is not capital. It is one fab's deliberate refusal to overbuild. That stretches the cycle longer and smoother instead of bubble and bust. It reads like the mid-1990s capacity cycle, not a standard 25-year memory peak where a 60 to 70 percent price spike would be your signal to cut the weed and walk. I have held NVIDIA since 2016 for exactly this reason. Owning it meant understanding it. The thesis was never the chip. It was the chokepoint.

2. The most underestimated silicon is Trainium

Consensus is still pricing a one-horse race. Gavin's sharpest non-NVIDIA call is AWS Trainium, specifically Trainium 3 ramping in the back half of 2026. Here is the part that took me a while to internalize from studying these architectures side by side. As frontier models go fully Mixture of Experts, inference stops being a matmul problem and becomes a networking problem. You need a switched scale-up fabric, not just fast chips. Today two organizations on earth have a working one: NVIDIA and Amazon. NVIDIA's came from Mellanox, which is the whole reason I sized that position the way I did years ago. The bet was always that networking would decide this, not raw flops. The TPU is formidable in its own lane, but the scale-up fabric is the moat people are not modeling, and it is why I track every accelerator, not just the one I own.

3. The neocloud moat is operational, not arbitrage

The lazy take is that CoreWeave and Crusoe are just renting hyperscaler slack. Gavin's counter is that running dense GPU clusters is like driving an F1 car. Looks easy until you try it. Top-tier neoclouds run 2 to 3x the hardware utilization per hour of lower-tier providers. That is an execution and inventory moat, and it compounds.

4. The structural short nobody is pricing

Watts and wafers eventually force the buildout off the planet. Gavin expects orbital data infrastructure to prove technical and economic viability within roughly two years and take meaningful share by the end of the decade. Space solves power with unattenuated solar and solves cooling with massive radiators in the satellite's own shadow. Dense single-rack nodes stitched together with lasers into a virtual hyperscale cluster in orbit. The unpriced risk is everything that overexpanded to serve a terrestrial buildout: cooling, power, and industrial equipment names sized for a curve that may bend down within seven years.

The whole interview is a lesson in pattern recognition over narrative. Lynch built a career on retail investors knowing their companies better than Wall Street did. The same edge exists in AI infrastructure right now; it just requires you to understand watts and wafers instead of same-store sales. If you are not modeling the physical boundaries of the stack through the lens of history, you are not underwriting the position. You are following it.
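
The gap between the reported ask (double or triple capacity) and the reported expansion rate (roughly 5 percent) is easier to see as compounding arithmetic. A minimal sketch, assuming the 5 percent figure is an annual growth rate, which the post does not state outright; `years_to_multiply` is an illustrative helper, not anything from the interview.

```python
import math


def years_to_multiply(annual_growth: float, multiple: float) -> float:
    """Years of compound growth at `annual_growth` needed to reach
    `multiple` times the starting capacity."""
    return math.log(multiple) / math.log(1 + annual_growth)


tsmc_growth = 0.05  # "roughly 5 percent", read here as a yearly rate (assumption)

years_to_double = years_to_multiply(tsmc_growth, 2)
years_to_triple = years_to_multiply(tsmc_growth, 3)

print(f"Years to double at 5%/yr: {years_to_double:.1f}")
print(f"Years to triple at 5%/yr: {years_to_triple:.1f}")
```

On that reading, capacity takes about 14 years to double and more than 22 to triple, which is the sense in which the fab acts as a rate limiter on the buildout.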

Ben Pouladian

93,402 views • 1 month ago

Rene Haas just confirmed the Vera CPU thesis on yesterday’s Arm Q4 call. He didn’t mean to.

His framing: GPUs are reticle-limited. CPUs are not. The ratio shift is happening in core count, not chip count.

His exact words: “256 Vera CPU chips, 88 cores per chip, a 200-kilowatt liquid-cooled rack designed to sit in a data center adjacent to a Vera Rubin system.”

That is not a host CPU. That is dedicated agentic orchestration.

Two days ago NVIDIA’s own engineers published the receipt. They traced a real 33-minute Claude Code session:
283 inference requests
58 main-agent turns coordinating 225 sub-agent invocations
Context grew from 15K to 156K tokens before compaction dropped it to 20K
Main agent alone processed ~3.5 million input tokens in the first 40 turns

Anthropic’s own number: agentic systems consume up to 15x more tokens than chat. Coding agents sustain 95 to 98 percent prompt cache hit rates. Without caching, costs would be 6x higher.

This is what’s happening between GPU calls. File reads. Tool invocations. Sub-agent spawns. Compaction. KV cache management. None of it runs on the GPU.

That’s why 12,000 GPUs need 400,000 CPU cores. The 33-to-1 ratio isn’t a forecast. It’s a measurement.

NVIDIA states it in the blog directly: this won’t be resolved by adding more compute FLOPs and memory capacity.

Translation: the GPU-only path is exhausted. The agentic chapter requires a platform, not a chip.

Their seven-chip answer:
Vera Rubin NVL72: capacity and prefill
Vera CPU: tool execution, KV cache offload
Groq 3 LPX: SRAM-first decode, low-jitter generation
NVLink 6, ConnectX-9, BlueField-4, Spectrum-X: fabric

Result they claim: 400+ tokens per second per user on trillion-parameter MoE at 400K context.

Vera spec: 88 Olympus cores, 176 threads, 1.8 TB/s NVLink-C2C, 1.2 TB/s LPDDR5X, 227 billion transistors. A 256-CPU rack delivers 45,056 threads and 400 TB of memory.

One detail nobody is talking about: the blog’s second author was previously Head of Agents at Groq. The third was previously at Groq Inc and Intel. NVIDIA didn’t license the LPX architecture. They absorbed the team that built it.

Haas isn’t pitching a competing thesis. He’s confirming this one from the other side of the table. Arm data center royalties doubled year-on-year. He expects them to double again.

Things feel slow right now because we’re between platforms. The speedup ships in H2 2026. The architectural argument is over. Deployment is the only variable left.

I cover this in The Quiet Architect and The Fourth Piece. $ARM $NVDA
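
The numbers in the post are internally consistent, and the derived figures (cores per GPU, cores and memory per rack, how much compaction shrinks the context) fall out of simple arithmetic. A minimal sketch using only the quoted figures; the variable names are illustrative, and splitting the 400 TB evenly across the 256 CPUs is an assumption.

```python
# Session trace figures quoted in the post.
main_agent_turns = 58
sub_agent_invocations = 225
total_requests = main_agent_turns + sub_agent_invocations

# Context size before and after compaction, in thousands of tokens.
context_before_k = 156
context_after_k = 20
compaction_reduction = 1 - context_after_k / context_before_k

# CPU-to-GPU ratio behind the "33-to-1" claim.
gpus = 12_000
cpu_cores = 400_000
cores_per_gpu = cpu_cores / gpus

# Vera rack figures: 256 CPUs, 88 cores and 176 threads per CPU, 400 TB per rack.
cpus_per_rack = 256
rack_cores = cpus_per_rack * 88
rack_threads = cpus_per_rack * 176
memory_per_cpu_tb = 400 / cpus_per_rack  # assumes an even split across CPUs

print(f"Inference requests: {total_requests}")
print(f"Compaction cut context by {compaction_reduction:.0%}")
print(f"CPU cores per GPU: {cores_per_gpu:.1f}")
print(f"Rack cores: {rack_cores:,}, rack threads: {rack_threads:,}")
print(f"Memory per CPU: {memory_per_cpu_tb:.2f} TB")
```

The 58 turns and 225 invocations sum to the quoted 283 requests, and 256 CPUs at 176 threads each give the quoted 45,056 threads.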

Ben Pouladian

62,918 views • 1 month ago