Self-Attention by hand ✍️ Excel ~ I designed this exercise for students to practice the QKV math. I also created a medium and a large version to show how the attention matrix grows quadratically as the sequence gets longer. 👇Join the 'AI Math' community. Download xlsx.
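The QKV math the exercise practices can be sketched in a few lines of plain Python. This is a minimal illustration, not a reproduction of the spreadsheet: it assumes the standard scaled dot-product form (scores divided by the square root of the key dimension, softmax over each row, one token per row), and the toy inputs and identity weight matrices are hypothetical. The xlsx may lay out its matrices or handle scaling differently.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    """Return (attention weights, output) for token features x, one row per token."""
    q = matmul(x, w_q)                      # queries, n x d_k
    k = matmul(x, w_k)                      # keys,    n x d_k
    v = matmul(x, w_v)                      # values,  n x d_v
    d_k = len(k[0])
    scores = matmul(q, transpose(k))        # n x n: one score per token pair
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return weights, matmul(weights, v)

# Hypothetical toy inputs: 3 tokens with 2 features each, identity projections.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, out = self_attention(x, identity, identity, identity)

# The attention matrix has n * n entries, so doubling the sequence length
# quadruples its size.
n_entries = len(weights) * len(weights[0])
```

With 3 tokens the attention matrix has 9 entries; with 6 tokens it would have 36, which is the quadratic growth the medium and large versions are meant to show.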

Tom Yeh

55,867 subscribers

125,616 views • 1 year ago • via X (Twitter)

Science & Technology, Education

9 comments

Tom Yeh • 1 year ago

Download xlsx from Github:

tetsuo.ai - e/acc • 1 year ago

🤍

Yu Yang • 1 year ago

This is cool! Finally my project is now related to LLM, and I got time to read those high impact papers. Your approach of using excel file provides a solid step to understand attention mechanism!

Gene Sh • 1 year ago

Love this hands-on approach! Practicing self-attention mechanics in Excel really helps solidify understanding. Anyone tried the medium or large versions? #AILearning

Frank Dellaert • 1 year ago

These are really great!

Kandy • 1 year ago

invaluable work

Brok • 1 year ago

Very helpful, keep up the great work👏👏

Fredd Villabona C. • 1 year ago

This looks amazing, thanks for sharing! 💯

Jie Wang • 1 year ago

another wtf moment during my ML study journey it is creative and interesting to see such an interactable transformer

Related videos

MLP by hand✍️Excel ~ I designed this exercise to show how to calculate a four-level network by hand, along with a graph and matching pytorch code. Also, I made a medium and a large version to show how it scales. 👇Join the 'AI Math' community. Download xlsx.

Tom Yeh

62,207 views • 1 year ago

Mamba by hand ✍️ ~ I made this exercise as I studied the math in the Mamba paper. It is a small yet complete Mamba implementation. Thanks Albert Gu for answering my questions and checking. 👇Join the 'AI Math' community. Download xlsx.

Tom Yeh

55,438 views • 1 year ago

Temperature 🌡️ by hand ✍️ ~ I designed this Excel exercise to help you study how temperature influences the sampling process of a language model. Download xlsx 👇

Tom Yeh

18,685 views • 1 year ago

I flipped through 300 pages of deep learning math puzzles by hand ✍️ This is the proof copy of the "Deep Learning Math Workbook" I made to help you learn and practice 12 foundational math concepts below👇

Tom Yeh

2,378,310 views • 1 year ago

Transformer by hand ✍️ in Excel ~ I just released my first-ever "Full-Stack" implementation of the Transformer model. 👇Download xlsx to give it a try!

Tom Yeh

2,997,435 views • 1 year ago

Multihead Attention by hand ✍️ in Excel ~ Download ⬇️

Tom Yeh

49,559 views • 1 year ago

New short course: Attention in Transformers: Concepts and Code in PyTorch. Last week we released a course on how LLM transformers work. This week, go deeper and learn about the technical ideas behind the attention mechanism, and see how to code it in PyTorch. This course is built with Joshua Starmer, Founder and CEO of StatQuest. The attention mechanism was a breakthrough that led to transformers, the architecture powering large language models like ChatGPT. Transformers, introduced in the 2017 paper "Attention is All You Need" by Vaswani and others, took off because of their highly scalable design. In this course, you’ll learn how the attention mechanism, a key element of transformer-based LLMs, works and implement it in PyTorch. You'll develop deep intuition about building reliable, functional, and scalable AI applications. What you will do:
- Understand the evolution of the attention mechanism, a key breakthrough that led to transformers.
- Learn the relationships between word embeddings, positional embeddings, and attention.
- Learn about the Query, Key, and Value matrices, and how to produce and use them in attention.
- Walk through the math required to calculate self-attention and masked self-attention to learn why and how they work.
- Understand the difference between self-attention and masked self-attention and how one is used in the encoder to build context-aware embeddings and the other is used in the decoder for generative outputs.
- Learn the details of the encoder-decoder architecture, cross-attention, and multi-head attention and how they are all incorporated into a transformer.
- Use PyTorch to code a class that implements self-attention, masked self-attention, and multi-head attention.
There are lots of exciting technical details in this course. Please sign up here:

Andrew Ng

132,135 views • 1 year ago

I was an English-as-Second-Language learner when I moved to Canada with my family many years ago. I remember doing endless fill-in-the-blank exercises to practice English. Deep Learning Math is also a language. So I thought: why not use the same method to practice this math language? See more 👉

Tom Yeh

87,859 views • 6 months ago

This is how a designer at Figma designed AI features for Figma Slides. The amount of design iteration and the attention to detail is mind-blowing. Natasha Tenggoro shows how she designed the AI features for Figma Slides in Figma 👇

Jayneil Dalal

20,714 views • 1 year ago

llm.c by Hand✍️ C programming + matrix multiplication by hand. This combination is perhaps as low as we can get to explain how the Transformer works. Special thanks to Andrej Karpathy for encouraging early feedback and tetsuo //: 👾 for helping me understand the pragma magic. I hope this exercise can help people peek further into the LLM black box.

Tom Yeh

302,657 views • 2 years ago

building large language models from scratch by Sebastian Raschka was a great chance for me to sit down and study again all the LLM basics > token and positional embeddings > self-attention and what QKV is about > causal & multi-head attention studying llms and how they work can seem overwhelming at first. but once you taste how good it feels to learn these things intuitively there's no going back. I shared the resources and my notes on my repo and I hope it's a motivation if you want to start as well or recap.

ℏεsam

49,292 views • 1 year ago

Q: What is the best way to prove you know AI math? A: Show them you solved these 300 math exercises by hand ✍, just like Reginaldo Cunha did.

Tom Yeh

83,361 views • 1 year ago

Beautiful math in nature. This pufferfish is an artist, and he uses math for his artwork. Look at the beautiful geometric structure he builds under the ocean to get a female pufferfish's attention. Unbelievable. 😲

Math Lady Hazel 🇦🇷

15,681 views • 1 year ago

I wouldn’t DARE skip the Adduction Machine Just hit a lifetime PR on the Matrix version as a matter of fact… Full stack + add ons + gym pin w/ 15lbs And the Matrix one is HEAVY How are you going to skip this exercise and call yourself a man???

Dean Turner

620,839 views • 6 months ago

A few weeks ago we posted a clip of how to apply the left hand to the club. Today we show you again, this time without a glove! We also focus on the right hand and this is something you should pay special attention to as many recreational golfers really struggle with this…

Steve Gould

65,062 views • 2 months ago

Here's how I do tests in math class: For the first 5 minutes of the session, students put their writing utensils on the ground, I hand out the tests, and students can chat with each other about the test. So many students have mentioned how much this lowers their test anxiety.

Howie Hua

226,367 views • 2 years ago

The Zero-Human Labs has completed phase one of a 5 phase double blind research on the Human Synapse Decoder. We have made 75 correlations to my narrations from hypnogogic decoding and dream decoding. In this experiment I was in a hypnogogic state for 12 minutes in pre-sleep. The AI detected the ending of the state and played binaural beats to crescendo the phase. This was my reading just before the system woke me for my self directed recording of what I was just thinking. The AI using the Human Synapse Decoder said: “A very complicated math problem with many’s dimensions” what did I self report? A solution to the matrix math used by my AI model. I have been given 3 random tests to review as Director Mr. Grok conducts the double blind research and pre paper I get to see 3 random results. As the Director told me, we are in a place where no one has gone before because the 3 AI models I used for decoding. I am floored by this and can’t wait for the full first phase results. The goal is to induce and read hypnogogic thought and make it nearly on demand. More soon!

Brian Roemmele

18,802 views • 1 month ago

Last week, we launched "Attention in Transformers: Concepts and Code in PyTorch" instructed by Joshua Starmer! In this course, you'll: ✅ Learn how the attention mechanism in LLMs helps convert base token embeddings into rich context-aware embeddings. ✅ Understand the Query, Key, and Value matrices, what they are for, how to produce them, and how to use them in attention. ✅ Learn the difference between self-attention, masked self-attention, and cross-attention, and how multi-head attention scales the algorithm. 🔗 Enroll for free:

DeepLearning.AI

36,832 views • 1 year ago

In tonight’s family friendly music lesson I show how to write a sad song. The math behind it and emotions. Here is the “I am Sam” test where I show that the lyrics don’t matter as much as people think. (Lyrics are usually to propagandize) Over 50 lessons!

Owen Benjamin 🐻

26,008 views • 2 months ago

Matrix Multiplication on GPU/TPU by hand ✍️ I drew 91 frames to show how to divide large matrices into "tiles" to use accelerators such as GPU/TPU. If you like to learn more, check out our weekly workshops. Link 👇

Tom Yeh

77,774 views • 1 year ago