Self-Attention by hand ✍️ Excel ~ I designed this exercise for students to practice the QKV math. I also created a medium and a large version to show how the attention matrix grows quadratically as the sequence gets longer. 👇Join the 'AI Math' community. Download xlsx.
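The QKV math the exercise refers to can be sketched in a few lines of plain Python. This is a minimal illustration and not the spreadsheet's exact layout: it assumes tokens are stored as rows, uses the standard scaled dot-product form softmax(QK^T / sqrt(d_k))V, and the names `self_attention`, `w_q`, `w_k`, and `w_v`, along with all numeric values, are made up for the example. The xlsx may arrange or scale things differently.

```python
import math

def matmul(a, b):
    # Plain row-by-column matrix product on nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax(row):
    # Subtract the row maximum for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    # Assumption: x has one token per row; w_q, w_k, w_v are hypothetical weights.
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    # Attention scores: every token's query against every token's key.
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, transpose(k))]
    weights = [softmax(row) for row in scores]  # n x n attention matrix
    return weights, matmul(weights, v)

# Illustrative weights (not taken from the exercise).
w_q = [[1.0, 0.0], [0.0, 1.0]]
w_k = [[1.0, 0.0], [0.0, 1.0]]
w_v = [[1.0, 2.0], [3.0, 4.0]]

# Doubling the sequence length quadruples the attention matrix.
for n in (3, 6, 12):
    tokens = [[float(i % 2), float((i + 1) % 2)] for i in range(n)]
    weights, _ = self_attention(tokens, w_q, w_k, w_v)
    print(n, len(weights) * len(weights[0]))  # prints: 3 9, then 6 36, then 12 144
```

The loop at the end mirrors the small, medium, and large versions mentioned above: the attention matrix has one entry per pair of tokens, so its size is the square of the sequence length.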

125,616 views • 1 year ago • via X (Twitter)

9 Comments

Tom Yeh • 1 year ago

Download xlsx from Github:

tetsuo.ai - e/acc • 1 year ago

🤍

Yu Yang • 1 year ago

This is cool! Finally my project is now related to LLM, and I got time to read those high impact papers. Your approach of using excel file provides a solid step to understand attention mechanism!

Gene Sh • 1 year ago

Love this hands-on approach! Practicing self-attention mechanics in Excel really helps solidify understanding. Anyone tried the medium or large versions? #AILearning

Frank Dellaert • 1 year ago

These are really great!

Kandy • 1 year ago

invaluable work

Brok • 1 year ago

Very helpful, keep up the great work👏👏

Fredd Villabona C. • 1 year ago

This looks amazing, thanks for sharing! 💯

Jie Wang • 1 year ago

another wtf moment during my ML study journey it is creative and interesting to see such an interactable transformer

Related Videos

New short course: Attention in Transformers: Concepts and Code in PyTorch.

Last week we released a course on how LLM transformers work. This week, go deeper and learn about the technical ideas behind the attention mechanism, and see how to code it in PyTorch. This course is built with Joshua Starmer, Founder and CEO of StatQuest.

The attention mechanism was a breakthrough that led to transformers, the architecture powering large language models like ChatGPT. Transformers, introduced in the 2017 paper "Attention Is All You Need" by Vaswani and others, took off because of their highly scalable design.

In this course, you’ll learn how the attention mechanism, a key element of transformer-based LLMs, works and implement it in PyTorch. You'll develop deep intuition about building reliable, functional, and scalable AI applications.

What you will do:
- Understand the evolution of the attention mechanism, a key breakthrough that led to transformers.
- Learn the relationships between word embeddings, positional embeddings, and attention.
- Learn about the Query, Key, and Value matrices, and how to produce and use them in attention.
- Walk through the math required to calculate self-attention and masked self-attention to learn why and how they work.
- Understand the difference between self-attention and masked self-attention and how one is used in the encoder to build context-aware embeddings and the other is used in the decoder for generative outputs.
- Learn the details of the encoder-decoder architecture, cross-attention, and multi-head attention and how they are all incorporated into a transformer.
- Use PyTorch to code a class that implements self-attention, masked self-attention, and multi-head attention.

There are lots of exciting technical details in this course. Please sign up here:

Andrew Ng

132,135 views • 1 year ago