Yan Chen

@HCI_Prof_YC • 1,812 subscribers

#CS Assistant Prof @virginia_tech. #HCI #CSEdu #AIEdu #VIS

Shorts

Transformer: Multi-Head Attention ~ Math vs Code 🔢💻 ~ I made this visualization to show how the multi-head attention math maps to PyTorch code in under 50 lines. Multi-head attention is central to the Transformer's strong performance: each head attends to the input in its own learned projection subspace, so the model can capture more diverse linguistic relationships and patterns. Because the heads are computed in parallel, the design is also efficient.
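For readers who want the idea in code, here is a minimal sketch of multi-head attention in PyTorch. It is not the code from the video; the class name `MultiHeadAttention`, the parameter names `d_model` and `num_heads`, and the optional `mask` argument are illustrative assumptions. It follows the standard formulation: project the inputs to queries, keys, and values, split them into heads, apply scaled dot-product attention softmax(QK^T / sqrt(d_head))V per head, then concatenate the heads and apply an output projection.

```python
import math

import torch
import torch.nn as nn


class MultiHeadAttention(nn.Module):
    """Illustrative multi-head attention (names and layout are assumptions)."""

    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        if d_model % num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        # One linear layer each for Q, K, V covers all heads at once.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        # Output projection applied after the heads are concatenated.
        self.w_o = nn.Linear(d_model, d_model)

    def _split_heads(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, seq, d_model) -> (batch, heads, seq, d_head)
        batch, seq_len, _ = x.shape
        return x.view(batch, seq_len, self.num_heads, self.d_head).transpose(1, 2)

    def forward(self, query, key, value, mask=None):
        q = self._split_heads(self.w_q(query))
        k = self._split_heads(self.w_k(key))
        v = self._split_heads(self.w_v(value))

        # Scaled dot-product scores: (batch, heads, seq_q, seq_k)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        if mask is not None:
            # Positions where mask == 0 are excluded from attention.
            scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)

        # Weighted sum of values, computed for all heads in parallel.
        context = weights @ v  # (batch, heads, seq_q, d_head)

        # Concatenate heads: (batch, seq_q, d_model)
        batch, _, seq_len, _ = context.shape
        merged = context.transpose(1, 2).contiguous().view(
            batch, seq_len, self.num_heads * self.d_head
        )
        return self.w_o(merged), weights
```

For self-attention, pass the same tensor three times, e.g. `out, attn = MultiHeadAttention(64, 8)(x, x, x)` with `x` of shape `(batch, seq, 64)`. In practice, PyTorch's built-in `torch.nn.MultiheadAttention` is usually the better choice; the sketch only makes the math visible.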

33,326 views