4/ Recursive Memory Attention: Running self-attention via the kernel for processing hierarchical attention unreasonably fast. Built in Rust and runs quickly. noah

Alex Reibman 🖇️

42,926 subscribers

31,669 views • 1 year ago • via X (Twitter)

Education • Science & Technology • Gaming

12 comments

Alex Reibman 🖇️ • 1 year ago

Saw these guys walking around Hayes looking for coworking space, so I invited them to HQ. I was not prepared. These guys are absolutely cracked. Here are the demos from the Swedish Hack Mafia (🧵):

Alex Reibman 🖇️ • 1 year ago

1/ Lovable: AI agent for building + replicating websites. Agentic web app builder. @lovable_dev

Alex Reibman 🖇️ • 1 year ago

2/ Rubik’s Cube CSS: 3D interactive Rubik’s cube built entirely in CSS. @whosmatu

Alex Reibman 🖇️ • 1 year ago

3/ ChatTree: Visualize ChatGPT conversations as interactive trees. @rikardradovac

Alex Reibman 🖇️ • 1 year ago

Are you also cracked and want to work on insane, ambitious projects? I’m hiring engineers to build the next-generation Agent Stack. DM me @AlexReibman @AgentOpsAI

Alex Reibman 🖇️ • 1 year ago

5/ E.M.A.I.L.: AI agent that autonomously scrapes websites and creates cold outreach emails. @jameszhou02 (not actually Swedish)

Dhruv 👾 • 1 year ago

@DonaldPepe1 Hey @DonaldPepe1, is this open source? I would love to check out your project.

DonnySolana • 1 year ago

@DonaldPepe1 nice.

∯🔔 • 1 year ago

@threadreaderapp unroll here

Thread Reader App • 1 year ago

@AlexReibman @Thrasymachus5 Hola, here is your unroll: Talk to you soon. 🤖

Free • 1 year ago

@DonaldPepe1 Smart. Does it improve accuracy?
