Flash Attention explained by Andrej Karpathy

168,185 views • 1 year ago • via X (Twitter)

10 Comments

water • 1 year ago

@karpathy What lecture is this from?

ℏεsam • 1 year ago

@karpathy

Mathias Mendoza • 1 year ago

@karpathy Andrej Karpathy’s tutorials and classes are gold 🤌🏻

Zack Angelo • 1 year ago

@karpathy He openly asks why PyTorch can't figure out how to call FlashAttention (FA) automatically. I think it's because FA has limitations that make it unusable in some cases, e.g., you can't pass it an arbitrary attention mask. FlexAttention should fix some of these.
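
To make the masking point concrete, here is a minimal pure-Python sketch. It is not FlashAttention or FlexAttention code. The names `attention`, `causal`, and `sliding_window` are hypothetical, chosen only for illustration. The assumption is that a mask written as a rule on (query index, key index) can be evaluated on the fly without storing a full N×N mask tensor, which is loosely the idea behind FlexAttention's mask functions.

```python
import math

def attention(q, k, v, mask_fn):
    """Naive single-head attention over lists of float vectors.

    mask_fn(q_idx, kv_idx) -> bool; True means the query may attend to the key.
    Assumes every query can attend to at least one key (otherwise the
    softmax below is undefined).
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        # Scores are masked by evaluating the rule per (i, j) pair,
        # so no dense mask matrix is ever materialized.
        scores = [
            sum(a * b for a, b in zip(qi, kj)) * scale
            if mask_fn(i, j) else float("-inf")
            for j, kj in enumerate(k)
        ]
        m = max(scores)  # subtract the row max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * vj[d] for w, vj in zip(weights, v)) / total
            for d in range(len(v[0]))
        ])
    return out

def causal(q_idx, kv_idx):
    # Each position attends to itself and to earlier positions.
    return kv_idx <= q_idx

def sliding_window(q_idx, kv_idx, window=2):
    # Causal, but limited to the most recent `window` positions.
    return 0 <= q_idx - kv_idx < window

q = k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
causal_out = attention(q, k, v, causal)
window_out = attention(q, k, v, lambda i, j: sliding_window(i, j, window=1))
```

With `causal`, the first query can only see the first key, so its output row is just the first value vector. With a window of 1, every query sees only itself, so the output equals `v`.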

Albert Buchard 🇪🇺 • 1 year ago

@karpathy He is a master teacher

high_byte • 1 year ago

@karpathy I was wondering what that was about

ℏεsam • 1 year ago

@karpathy Let's Reproduce GPT-2

RecurseChat • 1 year ago

@karpathy Thanks for sharing this, Andrej's explanations are always gold.

Xirtam Esrevni • 1 year ago

@karpathy This is who I need to be, always investing the time to dissect the ideas in papers & build from the ground up. My suspicion is Karpathy is given the leeway to do what he wants since he probably doesn't need the money & people/companies just want him around.

schatt • 1 year ago

@karpathy Whoa! It was on my to-do list for tomorrow
