Project #2: LLM Visualization

So I created a web page to visualize a small LLM, of the sort that's behind ChatGPT. Rendered in 3D, it shows all the steps to run a single token inference. (link in bio)

1,201,234 views • 2 years ago • via X (Twitter)

12 Comments

Brendan Bycroft • 2 years ago

It also contains a walkthrough/guide of the steps, as well as a few interactive elements to play with. Why, you ask? For what purpose did I put all the time & effort into this project?

Brendan Bycroft • 2 years ago

There's a real advantage to unpacking a set of abstractions, flattening them out. Abstractions can be useful for terseness and management, but they can be a real blocker to seeing the big picture.

Brendan Bycroft • 2 years ago

With this, you can see the whole thing at once. You can see where the computation takes place, its complexity, and relative sizes of the tensors & weights.
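To put rough numbers on "relative sizes of the tensors & weights", here is a minimal sketch that counts the weights in a GPT-2-style model. It is not taken from the visualization itself: the function name `gpt2_param_count` is made up for illustration, the layout (learned position embeddings, biases on every linear layer, a 4x-wide MLP, output projection tied to the token embedding) is assumed to be the standard GPT-2 one, and the example numbers are the published GPT-2 small configuration.

```python
def gpt2_param_count(n_layer, d_model, vocab_size, n_ctx):
    """Count weights in a GPT-2-style decoder (hypothetical helper).

    Assumes the standard GPT-2 layout: biases everywhere, a 4x MLP,
    and an output projection that reuses the token-embedding matrix.
    """
    token_embed = vocab_size * d_model          # one row per vocabulary entry
    pos_embed = n_ctx * d_model                 # one row per position
    layer_norm = 2 * d_model                    # scale + shift

    qkv = d_model * 3 * d_model + 3 * d_model   # fused Q, K, V projection
    attn_out = d_model * d_model + d_model      # attention output projection
    mlp_up = d_model * 4 * d_model + 4 * d_model
    mlp_down = 4 * d_model * d_model + d_model

    per_block = 2 * layer_norm + qkv + attn_out + mlp_up + mlp_down
    final_norm = layer_norm
    return token_embed + pos_embed + n_layer * per_block + final_norm


# Published GPT-2 small configuration (an assumption, not from the thread).
total = gpt2_param_count(n_layer=12, d_model=768, vocab_size=50257, n_ctx=1024)
print(f"{total:,} parameters")
```

Under these assumptions the token-embedding table alone accounts for roughly 38.6 million of the weights, which is the kind of proportion that is easy to miss in code and hard to miss when every cell is drawn.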

Brendan Bycroft • 2 years ago

The model with all the animations is tiny, to make it tractable. For comparison, I threw in a few of the larger models (GPT-2, GPT-3), render-only. And when you see what it takes to just produce a single value in a mat-mul, the sheer scale of these things becomes apparent.

Brendan Bycroft • 2 years ago

(Here's what goes into calculating a _single_ output value of a matrix-multiply)
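In code, a single output value of a matrix multiply is one dot product: row i of the left matrix against column j of the right matrix. The sketch below is only an illustration in plain Python; the function name `matmul_single_value` and the tiny example matrices are made up here and are not part of the project.

```python
def matmul_single_value(a, b, i, j):
    """Return element (i, j) of the product a @ b.

    a is an (n x k) matrix and b is a (k x m) matrix, both given as
    lists of rows. One output cell costs k multiply-adds.
    """
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of a and b do not match")

    total = 0.0
    for k in range(inner):
        # Walk along row i of a and down column j of b in lockstep.
        total += a[i][k] * b[k][j]
    return total


a = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
b = [[7.0, 8.0],
     [9.0, 10.0],
     [11.0, 12.0]]

# 4*7 + 5*9 + 6*11 = 139
print(matmul_single_value(a, b, 1, 0))
```

Here the inner dimension is 3, so each cell takes 3 multiply-adds. In a real model the inner dimension is the width of the layer, and that work is repeated for every cell of the output.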

Brendan Bycroft • 2 years ago

What about understanding what each layer does? Uhh, sorry, won't be much help. The project just came out of "Let's build a 3D viz!", so the scope is a bit limited. It's more: here's a way to learn & digest the algorithm, and perhaps think about how to optimize the process.

Brendan Bycroft • 2 years ago

As for what I got out of creating this: before I made it, I mostly knew how image convolution nets worked, but language-based models seemed kinda magical in comparison. Well, now I know them in a fair amount of detail!

Brendan Bycroft • 2 years ago

I also learnt a good amount of GL (dFdx, fwidth, UBOs, instancing), and animation approaches. So, uhh, even if no-one sees this, the project definitely has some value to me.

Brendan Bycroft • 2 years ago

Oh yeah, the link is here: Works best on desktop (sorry mobile). Left-click drag, right-click rotate, scroll to zoom. And hover over the tensor cells. Blue cells are weights/parameters, green cells are intermediate values. Each cell is a single number!

Brendan Bycroft • 2 years ago

Well, I hope you find it interesting. Let me know your thoughts! And if someone makes it through the walkthrough and finds it a little ~incomplete towards the end, I might even get around to fixing it (my attention has largely turned to other projects, oops).

Brendan Bycroft • 2 years ago

Short follow-up thread:

Brendan Bycroft • 2 years ago

A technical guide to how I built it:
