Project #2: LLM Visualization

So I created a web page to visualize a small LLM, of the sort that's behind ChatGPT. Rendered in 3D, it shows all the steps to run a single token inference. (link in bio)

1,201,038 views • 2 years ago • via X (Twitter)

Comments: 12

Brendan Bycroft • 2 years ago

It also contains a walkthrough/guide of the steps, as well as a few interactive elements to play with. Why, you ask? For what purpose did I put all the time & effort into this project?

Brendan Bycroft • 2 years ago

There's a real advantage to unpacking a set of abstractions, flattening them out. Abstractions can be useful for terseness and management, but they can be a real blocker to seeing the big picture.

Brendan Bycroft • 2 years ago

With this, you can see the whole thing at once. You can see where the computation takes place, its complexity, and relative sizes of the tensors & weights.

Brendan Bycroft • 2 years ago

The model with all the animations is tiny, to make it tractable. For comparison, I threw in a few of the larger models (GPT-2, GPT-3), render-only. And when you see what it takes to just produce a single value in a mat-mul, the sheer scale of these things becomes apparent.

Brendan Bycroft • 2 years ago

(Here's what goes into calculating a _single_ output value of a matrix-multiply)
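To make that concrete, here is a minimal Python sketch of what one output value of a matrix multiply involves: a dot product between one row of the left matrix and one column of the right matrix. The function name `matmul_cell` and the tiny matrices `A` and `B` are made up for illustration; they are not taken from the visualization's code.

```python
def matmul_cell(a, b, row, col):
    """Compute a single output value C[row][col] of C = A @ B.

    `a` and `b` are plain lists of lists (hypothetical toy inputs).
    """
    inner = len(b)  # shared inner dimension: columns of A, rows of B
    assert len(a[row]) == inner, "inner dimensions must match"
    total = 0.0
    # One multiply and one add per element of the inner dimension.
    for k in range(inner):
        total += a[row][k] * b[k][col]
    return total


# Toy 2x3 and 3x2 matrices, chosen only for illustration.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
B = [[7.0, 8.0],
     [9.0, 10.0],
     [11.0, 12.0]]

value = matmul_cell(A, B, 0, 1)  # 1*8 + 2*10 + 3*12
print(value)
```

Each output cell costs as many multiply-adds as the inner dimension is long, and a full matrix multiply repeats this for every cell of the output, which is where the scale comes from.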

Brendan Bycroft • 2 years ago

What about understanding what each layer does? Uhh, sorry, won't be much help. The project just came out of "Let's build a 3D viz!", so the scope is a bit limited. It's more: here's a way to learn & digest the algorithm, and perhaps think about how to optimize the process.

Brendan Bycroft • 2 years ago

As for what I got out of creating this: before I made it, I mostly knew how image convolution nets worked, but language-based models seemed kinda magical in comparison. Well, now I know them in a fair amount of detail!

Brendan Bycroft • 2 years ago

I also learnt a good amount of GL (dFdx, fwidth, UBOs, instancing), and animation approaches. So, uhh, even if no-one sees this, the project definitely has some value to me.

Brendan Bycroft • 2 years ago

Oh yeah, the link is here: Works best on desktop (sorry mobile). Left-click drag, right-click rotate, scroll to zoom. And hover over the tensor cells. Blue cells are weights/parameters, green cells are intermediate values. Each cell is a single number!

Brendan Bycroft • 2 years ago

Well, I hope you find it interesting. Let me know your thoughts! And if someone makes it through the walkthrough and finds it a little ~incomplete towards the end, I might even get around to fixing it (my attention has largely turned to other projects, oops)

Brendan Bycroft • 2 years ago

Short follow-up thread:

Brendan Bycroft • 2 years ago

A technical guide to how I built it:
