Two days ago, DeepSeek surprised everyone with an "undefined-behavior" PTX optimization that speeds up particular ML workloads in kernels on NVIDIA Hopper GPUs. Let's reverse engineer the hack, implement it ourselves, and benchmark the speedup on an H100.

LaurieWired

155,104 subscribers

228,591 views • 1 year ago • via X (Twitter)

Science & Technology, News & Politics, Education

11 comments

LaurieWired • 1 year ago

Full Video:

LaurieWired • 1 year ago

My test code:

numanumabruh • 1 year ago

You'd never know she's 6'5"

Jason Ho • 1 year ago

laurie supremacy

Bob (Moderna #7) Kerns • 1 year ago

Until recently, I'd only seen your tweets; the first video I encountered was the 2025 prediction ones. Assumptions violated: higher voice, younger. Always good to have one's assumptions flagged, but especially the age. I was struck by the maturity of your analysis!

Dave 🚀 • 1 year ago

LFG!

KnowledgeisMostValuable • 1 year ago

I'd fight off a bear for you

nisten - e/acc • 1 year ago

lfg

🥀shiVam🥀 • 1 year ago

wait, did you film this at Google HQ? (must appreciate the audio recording and editing)

Calcs • 1 year ago

Fantastic video, more please, lol 😂

Related videos

Just built an MCP for Ghidra. Now basically any LLM (Claude, Gemini, local...) can Reverse Engineer malware for you. With the right prompting, it automates a *ton* of tedious tasks. One-shot markups of entire binaries with just a click. Open source, on Github now.

LaurieWired

284,568 views • 1 year ago

🚨BREAKING NEWS🚨: Phala Shipped the First-Ever GPU TEE Benchmark! Benchmark Research Highlight: ✅ LLaMa 3, Microsoft Phi Models Tested ✅ Tests Performed on nVIDIA H100 ✅ With #TEE Mode On 📊 Key Results: • Up to 150 TPS for LLaMA-3-8B • Performance trade-off as low as 0.5%, up to 5% Read Full Report: <

Phala

68,388 views • 1 year ago

Woow Nvidia has just released a 2.6B open-source world model 🔥 You can turn a single image, text prompt and trajectory into controllable worlds... And on a single GPU! - Code available on GitHub - Paper as well on arxiv You can use it for many things like embodied AI and robotics research, simulations, etc. Because it can run on a single GPU (like an RTX 5090 or H100) it makes world models accessible to basically everyone!

Paul Couvert

173,148 views • 1 month ago

You can run CUDA, on a Mac ARM GPU, in the browser. It sounds ridiculous but it actually works. HipScript chains CUDA, to OpenCL, to Vulkan, to Tint (Google’s shader translator), to a WASM WebGPU. I got a plasma simulation running in just a few minutes, no NVIDIA GPU!

LaurieWired

160,223 views • 1 year ago

First ever melted power connector on an Nvidia GPU.

PowerGPU

13,477 views • 1 month ago

YOUR PARENTS PAID FOR THE CUDA MOAT! The #1 contributor to the CUDA MOAT isn't the developers at NVIDIA, but the millions of developers outside of NVIDIA who invent new algorithms for CUDA, like Flash Attention. For most of them, it started with a GeForce gaming GPU. NVIDIA is the only company that has a reasonably good developer stack on consumer-grade GPUs. As people grow up beyond playing CSGO & League of Legends & Minecraft, they either become anime weeaboos or they start programming on their existing computer, which has a GeForce GPU.

SemiAnalysis

25,230 views • 2 months ago

Excited to introduce PyRoki ("Python Robot Kinematics"): easier IK, trajectory optimization, motion retargeting... with an open-source toolkit on both CPU and GPU

Chung Min Kim

117,066 views • 1 year ago

My gaming PC with an Nvidia GPU after it hears me criticise Israel

Red Operative Leninist

3,657,116 views • 3 months ago

The financialization of compute is here. Architect has launched 24/7 perpetuals on Nvidia H100 GPU prices — the first regulated futures contracts on compute, built with Ornn’s live market indices. The AI economy has its first exchange-traded futures market.

Brett Harrison

72,899 views • 2 months ago

🧠 Chat with Reasoning A few days ago the DeepSeek team released a LLM model with reasoning in various sizes. This we show is an example of 1bl that can run on machines with low GPU power like a mobile, but have enough power to answer complex questions. With these advanced models it is possible to link it with #IoT equipment to control information and use it in advanced control environments. All this under Open Source models and decentralized networks such as #Neurai #XNA $XNA #DeepSeek #Reasoning #AIchat

NeurAI Project / XNA

17,691 views • 1 year ago

Proud to present io.net (old account) at @Solana #Breakpoint2023 yesterday! 🔥 Whether you're a GPU provider or an ML engineer - tune in for the live demonstration of the platform and join now. Watch the full video 🔽

io.net

3,037,629 views • 2 years ago

What's The most you would pay for an NVIDIA GeForce RTX 5090 FE GPU? What GPU do you realistically have your eye on for your next upgrade?

DaPoets

73,122 views • 5 months ago

"If Wemby is an A+ on the defensive end, and let's say he's an A- on the offensive end – does anybody match up with those grades? And the answer is nobody." – DP on Victor Wembanyama's chances to win MVP.

Dan Patrick Show

177,008 views • 3 months ago

Let's check in on the Charlotte, NC, blacks just days ago on Mother's Day. A minor dispute over-checks notes-burnt biscuits resulted in an attempted murder.

Tom Hennessy

18,168 views • 1 year ago

Updated my GPU latency animation for the H100 today. Good to get an intuition for different memory areas.

Fleetwood

16,637 views • 5 months ago

The new version of DeepSeek-R1, it's an INSANE model, my god. DeepSeek Engineer v2 with function calling support drops tomorrow. I have to.

Pietro Schirano

138,907 views • 1 year ago

Caleb Landry Jones stars in the surreal fable HARVEST. Over seven hallucinatory days, a village with no name, in an undefined time and place, disappears.

Bloody Disgusting

21,609 views • 1 year ago