Flash attention explained by Andrej Karpathy

168,185 views • 1 year ago • via X (Twitter)

10 Comments

water • 1 year ago

@karpathy What lecture is this from?

ℏεsam • 1 year ago

@karpathy

Mathias Mendoza • 1 year ago

@karpathy Andrej Karpathy’s tutorials and classes are gold 🤌🏻

Zack Angelo • 1 year ago

@karpathy He openly asks why PyTorch can't figure out how to call FA automatically. I think it's because FA has limitations that make it unusable in some cases, e.g., you can't pass it an arbitrary attention mask. FlexAttention should fix some of these.

Albert Buchard 🇪🇺 • 1 year ago

@karpathy He is a master teacher

high_byte • 1 year ago

@karpathy I was wondering what that was about

ℏεsam • 1 year ago

@karpathy Let's Reproduce GPT-2

RecurseChat • 1 year ago

@karpathy Thanks for sharing this, Andrej's explanations are always gold.

Xirtam Esrevni • 1 year ago

@karpathy This is who I need to be, always investing the time to dissect the ideas in papers & build from the ground up. My suspicion is Karpathy is given the leeway to do what he wants since he probably doesn't need the money & people/companies just want him around.

schatt • 1 year ago

@karpathy Whoa! It was on my to-do list for tomorrow
