OpenAI recently released its first open-weights model since GPT-2, entering a field led by DeepSeek and Alibaba's Qwen. Ankit () breaks down these top OSS models, including what sets them apart under the hood: mixture-of-experts, long-context training, and post-training techniques that shape reasoning and alignment, and how different design choices... lead to surprisingly similar performance. 00:00 – OpenAI OSS Launch 01:00 – Comparing Open Source LLM Architectures 01:46 – GPT OSS Overview 02:37 – Under The Hood of GPT OSS 03:25 – Qwen-3 Architecture 04:17 – Qwen-3 Training 05:12 – Qwen-3 Post-Training 06:08 – Qwen-3 Reasoning & RL Innovations 06:52 – DeepSeek V3 Overview 07:40 – DeepSeek V3.1 Updates 08:39 – Attention Mechanism (MLA) 09:39 – Comparing Model Sizes 10:35 – Long Context Strategies 11:25 – Reflections on Methods 12:00 – Takeaways

Y Combinator

1,614,198 subscribers

208,680 views • 10 months ago • via X (Twitter)

Science & Technology, News & Politics, Education

Related videos

*Major* open source AI drop today. Can America win the Open AI race? My conversation with Nathan Lambert and Luca Soldaini 🎀 of Ai2 about the launch of Olmo 3 00:00 – Cold Open 00:39 – Welcome & today’s big announcement 01:18 – Introducing the Olmo 3 model family 02:07 – What “base models” really are (and why they matter) 05:51 – Dolma 3: the data behind Olmo 3 08:06 – Performance vs Qwen, Gemma, DeepSeek 10:28 – What true open source means (and why it’s rare) 12:51 – Intermediate checkpoints, transparency, and why AI2 publishes everything 16:37 – Why Qwen is everywhere (including U.S. startups) 18:31 – Why Chinese labs go open source (and why U.S. labs don’t) 20:28 – Inside ATOM: the U.S. response to China’s model surge 22:13 – The rise of “thinking models” and inference-time scaling 35:58 – The full Olmo pipeline, explained simply 46:52 – Pre-training: data, scale, and avoiding catastrophic spikes 50:27 – Mid-training (tail patching) and avoiding test leakage 52:06 – Why long-context training matters 55:28 – SFT: building the foundation for reasoning 1:04:53 – Preference tuning & why DPO still works 1:10:51 – The hard part: RLVR, long reasoning chains, and infrastructure pain 1:13:59 – Why RL is so technically brutal 1:18:17 – Complexity tax vs AGI hype 1:21:58 – How everyone can contribute to the future of AI 1:27:26 – Closing thoughts

Matt Turck

37,482 views • 7 months ago

Thanksgiving-week treat: an epic conversation on Frontier AI with Lukasz Kaiser, co-author of “Attention Is All You Need” (Transformers) and leading research scientist at OpenAI working on GPT-5.1-era reasoning models. 00:00 – Cold open and intro 01:29 – “AI slowdown” vs a wild week of new frontier models 08:03 – Low-hanging fruit, infra, RL training and better data 11:39 – What is a reasoning model, in plain language 17:02 – Chain-of-thought and training the thinking process with RL 21:39 – Łukasz’s path: from logic and France to Google and Kurzweil 24:20 – Inside the Transformer story and what “attention” really means 28:42 – From Google Brain to OpenAI: culture, scale and GPUs 32:49 – What’s next for pre-training, GPUs and distillation 37:29 – Can we still understand these models? Circuits, sparsity and black boxes 39:42 – GPT-4 → GPT-5 → GPT-5.1: what actually changed 42:40 – Post-training, safety and teaching GPT-5.1 different tones 46:16 – How long should GPT-5.1 think? Reasoning tokens and jagged abilities 47:43 – The five-year-old’s dot puzzle that still breaks frontier models 52:22 – Generalization, child-like learning and whether reasoning is enough 53:48 – Beyond Transformers: ARC, LeCun’s ideas and multimodal bottlenecks 56:10 – GPT-5.1 Codex Max, long-running agents and compaction 1:00:06 – Will foundation models eat most apps? The translation analogy and trust 1:02:34 – What still needs to be solved, and where AI might go next

Matt Turck

167,926 views • 7 months ago

How GPT-5 thinks, with OpenAI VP of Research Jerry Tworek 00:00 - Intro 01:01 - What Reasoning Actually Means in AI 02:32 - Chain of Thought: Models Thinking in Words 05:25 - How Models Decide How Long to Think 07:24 - Evolution from o1 to o3 to GPT-5 11:00 - The Road to OpenAI: Growing up in Poland, Dropping out of School, Trading 20:32 - Working on Robotics and Rubik's Cube Solving 23:02 - A Day in the Life: Talking to Researchers 24:06 - How Research Priorities Are Determined 26:53 - OpenAI's Culture of Transparency 29:32 - Balancing Research with Shipping Fast 31:52 - Using OpenAI's Own Tools Daily 32:43 - Pre-Training Plus RL: The Modern AI Stack 35:10 - Reinforcement Learning 101: Training Dogs 40:17 - The Evolution of Deep Reinforcement Learning 42:09 - When GPT-4 Seemed Underwhelming at First 45:39 - How RLHF Made GPT-4 Actually Useful 48:02 - Unsupervised vs Supervised Learning 49:59 - GRPO and How DeepSeek Accelerated US Research 53:05 - What It Takes to Scale Reinforcement Learning 55:36 - Agentic AI and Long-Horizon Thinking 59:19 - Alignment as an RL Problem 1:01:11 - Winning ICPC World Finals Without Specific Training 1:05:53 - Applying RL Beyond Math and Coding 1:09:15 - The Path from Here to AGI 1:12:23 - Pure RL vs Language Models

Matt Turck

451,229 views • 8 months ago

Gyeom 🙂 live song compilation #도겸 Happy 00:03~ 00:36 한편의 너 00:37~ 00:58 Lunch 00:59~ 02:07 좋겠다 02:08~ 02:37 단발머리 02:38~ 03:01 Hey Buddy 03:02~ 03:21 베텔기우스 03:22~ 03:56 빈대떡신사 03:57~ 04:38 지금이순간 04:39~ 05:16 녹아내려요 05:17~ 05:56 몰래듣지마요 05:57~ 06:28 겨우 06:29~ 06:51 Second Life 06:52~ 07:03 Steal the Show 07:04~ 07:42 Love Lee 07:43~ 07:52 If You 07:53~ 08:30 빠른걸음 08:31~ 08:52 달링 08:53~ 08:58 SOS 08:59~ 09:24

J

224,605 views • 1 year ago

Favorite #최정은 dance covers & challenges compilation ❤︎ 00:00 pocket locket 00:17 Touch 00:39 마지막 축제 01:02 청바지 01:26 Confident 01:38 DENIAL IS A RIVER 01:51 Pink Hoodie 02:16 ExtraL 02:55 SIGN 03:15 Devil Game 03:30 1999 03:53 PUSH 2 START 04:09 In Bloom 04:31 Shake It To The Max 04:44 Gnarly 05:22 Imma Be 05:37 Lips Hips Kiss 06:08 HEYDAY 06:31 All The Stars 06:55 THUNDER 07:30 F Girl 07:52 소원을 말해봐 08:10 Soda Pop 08:25 Anthem 08:47 ICONIK 09:07 OVERDRIVE 09:38 BURNING UP 10:03 Body 10:38 CHANEL 10:53 Jumpshot 11:00 The Fate of Ophelia 11:15 WHERE YOU AT 11:41 Bad Desire

찜

60,864 views • 5 months ago

Introducing `gpt-oss Pro Mode`! Basically o3 Pro, but for the new OpenAI open source models :) Pro Mode chains together up to 10 instances of the new OpenAI GPT-OSS model, enabling it to produce a better answer than one instance alone could create! Open source, link here:

Matt Shumer

315,683 views • 10 months ago

GPT-Image-2 was just released... It's the best image model (by a wide margin). Here's how to use GPT-Image-2 in ChatGPT, OpenAI Playground, and of course Codex. TIMESTAMPS 00:00 Intro 01:27 Initial Tests of GPT-Image-2 02:24 It can GENERATE working BARCODES? 03:40 11 Edits one Prompt 06:40 Creating Cartoons 07:40 Overlay Explanations 08:50 How to Get Started 10:51 IPhone Mockup Images (Amazing) 15:43 Using GPT-Image in OpenAI Playground 16:11 Generate images in 4k 17:44 The FIRST limitation... Its bad at counting 18:26 Using GPT-Image in Codex (Agentic Image Gen)

Riley Brown

35,552 views • 2 months ago

The Vik episode on Ground Zero! [youtube link in replies] 0:00:00 - Vik's story and Moondream 0:18:50 - The core thesis of small and scale 0:28:35 - The data problem 0:33:35 - Deciding the training architecture 0:43:20 - Post-Training and RL perf 0:46:06 - Post-Training recipes of Moondream 3 0:47:40 - Open Source and VLM development priorities 0:52:05 - AI War : America and China 0:55:07 - Moondream acquisition and Future 1:04:08 - Community Questions 1:16:40 - Trivia, Lores and Opinions on state of AI 1:28:07 - Advice to 20yo

himanshu

43,246 views • 6 months ago

Dance challenge compilation #윈터 #WINTER 00:00 Kick Back 00:17 Weekend 00:38 Second 01:02 Sparkling 01:23 Sneakers 01:35 Candy 01:55 야야야 02:08 Unforgiven 02:25 Movie Star 02:34 이프푸 02:55 Hard 03:20 최애의 아이 03:37 Vujade 03:53 It is what it is 04:06 Get A Guitar 04:16 INVITATION 04:25 Guilty 05:10 Fact Check 05:34 Maniac 05:56 WOP 06:13 Chill Kill 06:34 Trance x I Know 06:45 Sweet Venom 07:03 Siren 07:16 Hot & Cold 07:41 Rockin' Around The Pole Again 08:03 Underneath the Tree 08:21 Be There For Me 08:42 뭣 같아 08:54 UNTOUCHABLE 09:09 Two Of Hearts 09:19 Boyfriend 09:42 Nee, 09:59 움파룸파 10:17 사랑스러워 10:38 Like I Do 10:53 Like I Do (골댕이ver.) 11:08 민정이가 저장해놓은.. 11:29 갈매기.. 11:45 날 바라바라봐 12:01 GGB 12:12 홀씨 12:56 시나모롤 13:10 띵띵땅땅 13:26 EENIE MEENIE 13:47 How Sweet

취향의 윈터

379,853 views • 2 years ago

Robots, Small Models, and RL with DeepSeek Alumnus Zihan Wang — Manifold #86 Great conversation with Zihan Wang - on RAGEN ✈️ CVPR 25 🙂 Zihan Wang is an AI researcher at Northwestern University, where he works on vision-language models, robotics, and reinforcement learning. Previously, he interned at DeepSeek, contributing to projects like DeepSeek-V2. 01:13 - Zihan's Background, CS and AI Research in China 11:09 - DeepSeek; Human capital flow from PRC to US 16:07 - DeepSeek, Open Source and AI Research 31:52 - Model Size and Performance Constraints 33:01 - Data Bottleneck in Pre-trained Models 34:12 - Transformer Architecture and Scaling Laws 36:30 - Efficiency in Model Training 47:44 - Chain of Experts Architecture 1:01:06 - Future of AI and Robotics

steve hsu

34,221 views • 1 year ago

The Arcee AI Podcast is here! In this episode, Lucas Atkins and stochasm join us to discuss the story of Trinity models and everything frontier. I can say, this talk has been one of the most amazing and technical conversations we've had on Ground Zero. 0:00:00 - Intro 0:00:59 - Varun's transition from SWE to Pre-Training Lead 0:04:20 - Trinity Manifesto, Openclaw Ecosystem 0:12:15 - Arcee's Post-Training to Pre-Training Pivot 0:23:45 - Varun's first Pre-Training run (you can just do things!) 0:27:33 - Saturation in Pre-Training?, Mid-Training 0:37:00 - Tweaking the Training Architecture, Adam vs Muon, Evals 01:09:07 - Inference Engineering, Quick Fire, Post-Training Recipe 01:18:02 - Alpha in RL Envs, Harness Design 01:23:00 - American Open Source is trailing Chinese Competitors, Trinity Adoption 01:29:25 - Hiring at Arcee, Advice to 20yo

himanshu

32,305 views • 2 months ago

Why AI Progress Suddenly Feels Real - my conversation with Yann Dubois, who co-leads the Post-Training Frontiers team at OpenAI 00:00 - Intro 01:30 - Why recent AI progress feels like a step function 04:13 - Model reliability & the emotional rollercoaster of shipping GPT-5.5 07:33 - How OpenAI structures vertical and horizontal teams 09:49 - Improving model efficiency and test-time compute 12:32 - Yann's journey from Switzerland to OpenAI 15:37 - Reasoning in 2026: Real-world utility vs verifiable rewards 18:34 - GPT-5.5 Thinking vs Pro: Scaling test-time compute 20:09 - How reasoning models become more efficient 23:23 - Pre-training scaling and overcoming the data wall 27:03 - Multimodal data, synthetic data, and embodied AI 31:05 - Demystifying mid-training and post-training 37:21 - Does RL create new capabilities in AI? 38:53 - The challenges and frontier of scaling RL 43:09 - Is building AI models a craft or a strict science 48:21 - How AI models generalize across different domains 54:18 - How reinforcement learning cures AI hallucinations 56:04 - Negative generalization and conflicting instructions 58:05 - Can RL scale to law, medicine, and the broader economy? 1:00:19 - The evaluation bottleneck and Model as a Judge 1:04:21 - Continuous AI progress & continual learning 1:08:49 - Will foundation models eat the agent harness 1:11:23 - Why startups should focus on the last mile of AI

Matt Turck

100,183 views • 1 month ago

The latest Qwen 3 VL by Qwen running on iPhone 17 Pro with MLX. Qwen 3 VL brings upgraded visual understanding, recognition, and OCR capabilities without sacrificing text performance like previous models. The 4B model here is close to Qwen 2.5 VL 72B in many benchmarks.

Adrien Grondin

109,700 views • 8 months ago

DeepSeek beat out OpenAI, then Alibaba released Qwen 2.5 Max

Michael Burry Stock Tracker ♟

18,449 views • 1 year ago

SpaceX is about to launch their first V3 Starship and it’s by far the biggest and most radical change to the program to date. Here's a super quick overview of what all is new and different including the incredible new Raptor 3 engines, the new launch pad, and everything else that’s debuting on Flight 12. 00:00 - Intro 01:07 - Pad 2 03:15 - Raptor 3 04:59 - SuperHeavy V3 07:56 - Starship V3 10:52 - Flight 12 Profile

Everyday Astronaut

250,496 views • 1 month ago

Announcing GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano in the API. TL;DR: Major improvements on coding, instruction following, and long context. 💥 00:00 Intro 02:18 Coding 04:53 Instruction following 06:58 Long context 10:22 Demos, pricing, and availability 20:00 @windsurf

OpenAI Developers

603,320 views • 1 year ago

2025 year-end recap, Full ver. 🌳🌸 #숩밤 #soogyu op- 0:00 01 - 0:19 02 - 4:44 03 - 19:37 04 - 33:22 05 - 43:40 06 - 52:58 07 - 01:12:34 08 - 01:22:32 09 - 01:41:04 10 - 01:55:14 11 - 02:05:22 12 - 02:12:52

뚜뚜

94,312 views • 5 months ago

🚨DEEPSEEK ADMITS TRAINING ON OPENAI’S GPT-4? DeepSeek AI confessed that it was trained on ChatGPT (GPT-4)—after being asked about misconceptions about its reasoning model. DeepSeek seemingly identifies itself as GPT-4, raising serious questions about whether it used OpenAI data in its training. Ironically, some argue OpenAI itself scrapes massive amounts of data without permission—now facing its own tactics used against it. What goes around, comes around? Source: Sarainwondertech on Instagram

Mario Nawfal

315,357 views • 1 year ago

Songs I personally want to hear at the encore concert (or, you know, Dream Show 5 or something . .) 00:00 같은 시간 같은 자리 00:41 사랑한단 뜻이야 01:01 Don’t Need Your Love 01:18 사랑이 좀 어려워 01:40 사랑은 또다시 01:58 Quiet Down 02:11 지금처럼만 02:46 Rainbow (책갈피) 03:17 우리의 계절 03:49 주인공 04:11 Countdown 04:36 Better Than Gold 05:11 Fire Alarm 05:39 마지막인사 06:05 Best Friend Ever 06:22 icantfeelanything 06:48 SOS 07:05 Stupid Cupid 07:33 i hate fruits 07:58 Flying Kiss 08:17 나의 소나기 08:39 항해 09:03 Dream team 09:27 Butterflies 09:57 Cold Coffee

동담 #TASTE

66,889 views • 5 months ago

OpenAI just released GPT-OSS: An Open Source Language Model on Hugging Face. Open source meaning: 💸 Free 🔒 Private 🔧 Customizable

dylan

21,568 views • 10 months ago