Flash attention explained by Andrej Karpathy

168,185 views • 1 year ago • via X (Twitter)

Comments: 10

water • 1 year ago

@karpathy What lecture is this from?

ℏεsam • 1 year ago

@karpathy

Mathias Mendoza • 1 year ago

@karpathy Andrej Karpathy’s tutorials and classes are gold 🤌🏻

Zack Angelo • 1 year ago

@karpathy He openly asks why PyTorch can't figure out how to call FA automatically. I think it's because FA has limitations that make it unusable in some cases, e.g., you can't pass it an arbitrary attention mask. FlexAttention should fix some of these.

Albert Buchard 🇪🇺 • 1 year ago

@karpathy He is a master teacher

high_byte • 1 year ago

@karpathy I was wondering what that was about

ℏεsam • 1 year ago

@karpathy Let's Reproduce GPT-2

RecurseChat • 1 year ago

@karpathy Thanks for sharing this, Andrej's explanations are always gold.

Xirtam Esrevni • 1 year ago

@karpathy This is who I need to be, always investing the time to dissect the ideas in papers & build from the ground up. My suspicion is Karpathy is given the leeway to do what he wants since he probably doesn't need the money & people/companies just want him around.

schatt • 1 year ago

@karpathy Whoa! It was on my to-do list for tomorrow
