Flash attention explained by Andrej Karpathy

168,185 views • 1 year ago • via X (Twitter)

10 comments

water • 1 year ago

@karpathy What lecture is this from?

ℏεsam • 1 year ago

@karpathy

Mathias Mendoza • 1 year ago

@karpathy Andrej Karpathy’s tutorials and classes are gold 🤌🏻

Zack Angelo • 1 year ago

@karpathy He openly asks why PyTorch can't figure out how to call FlashAttention (FA) automatically. I think it's because FA has limitations that make it unusable in some cases. For example, you can't pass it an arbitrary attention mask. FlexAttention should fix some of these.

Albert Buchard 🇪🇺 • 1 year ago

@karpathy He is a master teacher

high_byte • 1 year ago

@karpathy I was wondering what that was about

ℏεsam • 1 year ago

@karpathy Let's Reproduce GPT-2

RecurseChat • 1 year ago

@karpathy Thanks for sharing this, Andrej's explanations are always gold.

Xirtam Esrevni • 1 year ago

@karpathy This is who I need to be, always investing the time to dissect the ideas in papers & build from the ground up. My suspicion is Karpathy is given the leeway to do what he wants since he probably doesn't need the money & people/companies just want him around.

schatt • 1 year ago

@karpathy Whoa! It was on my to-do list for tomorrow
