Sharon Zhou

@realSharonZhou • 27,455 subscribers

Recursively self-improving | VP Eng & AI, @AMD | Prev: Founder & CEO, Lamini. CS Faculty & PhD @Stanford. @Google. @Harvard | @MIT 35 under 35. Angel investor.

Videos

It's here: We just hit superhuman performance on AI kernel optimization! Real customer models & production settings, not the toy problems I typically see. This is the year that Claude writes its own kernels, and Codex its own kernels, for every new GPU it wants to run on -- something that takes months to port between GPU generations today. This has a massive impact on scaling intelligence. More compute means getting the next frontier model sooner.

241,935 views • 4 months ago

Mood: agents optimizing kernels. Claude won on kernel optimization: gemm_bf16 at 1.19x vs Codex's 0.94x. Codex was faster (~1.3h vs ~3.4h) but produced no reinjectable optimizations. Claude used hipBLASLt as a drop-in replacement for the custom Triton kernel. For Codex, a shape mismatch caused a slight regression. Still improving, open sourcing soon --- AMD-AGI team (Sina Rafati, Emad Barsoum, and many more)

23,583 views • 4 months ago

I like to think of evals as something active, not passive -- they're a North Star that steers LLMs toward higher intelligence. Evals should drive your RL/SFT/post-training decisions. Internal evals at frontier labs make a huge difference -- and you can see it in how models behave differently (GPT seems better at one-shot tasks, Claude at multi-turn). If you want to learn more about building evals that actually improve your model in post-training, check out our AMD x DeepLearning.AI course "Fine-tuning & RL for LLMs: Intro to Post-training" (content is free):

16,613 views • 5 months ago

Super excited to launch a new AI course! 🚀 Fine-Tuning & Reinforcement Learning for LLMs: Intro to Post-Training. A collaboration between AMD 🤝 Andrew Ng’s DeepLearning.AI to give every developer the tools & compute to work with the same post-training techniques used across today’s leading AI labs. 🎓 Learn for free → 🧵

20,386 views • 9 months ago