Llama 4 Maverick does look visually stunning.

62,407 views • 1 year ago • via X (Twitter)

10 comments

losh • 1 year ago

A little explanation of what you are seeing here. The models, when stored, have a structure (not an execution graph, just how the parameters are grouped). The color coding represents the type of each param block, and the block size is log(param_size). I did similar plots in 2D some time back:
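A rough sketch of the sizing rule described in the comment above, not the original plotting code: each stored parameter becomes a block whose size is the log of its element count and whose color depends on its type. The parameter names, shapes, type rules, and colors here are all hypothetical, and the natural log is an assumption, since the comment does not give a base.

```python
import math

# Hypothetical parameter names and shapes, for illustration only.
# They are NOT read from a real Llama 4 Maverick checkpoint.
PARAMS = {
    "layers.0.attention.wq.weight": (512, 512),
    "layers.0.attention.wk.weight": (512, 128),
    "layers.0.feed_forward.w1.weight": (2048, 512),
    "layers.0.attention_norm.weight": (512,),
    "layers.1.attention.wq.weight": (512, 512),
    "layers.1.feed_forward.w1.weight": (2048, 512),
}

# Assumed color per block type; the original color scheme is unknown.
COLORS = {
    "attention": "tab:blue",
    "feed_forward": "tab:orange",
    "norm": "tab:green",
    "other": "tab:gray",
}


def block_type(name):
    """Classify a parameter by substring match on its name (assumed rule)."""
    if "norm" in name:  # checked first: "attention_norm" is a norm block
        return "norm"
    if "attention" in name:
        return "attention"
    if "feed_forward" in name:
        return "feed_forward"
    return "other"


def describe_blocks(params):
    """Return one record per parameter with its type, color, and log size."""
    blocks = []
    for name, shape in params.items():
        n_elements = math.prod(shape)
        kind = block_type(name)
        blocks.append({
            "name": name,
            "type": kind,
            "color": COLORS[kind],
            "size": math.log(n_elements),  # natural log, an assumption
        })
    return blocks


if __name__ == "__main__":
    for block in describe_blocks(PARAMS):
        print(f'{block["name"]:40s} {block["type"]:13s} size={block["size"]:.2f}')
```

The log scale keeps the largest weight matrices from dwarfing small blocks such as norm weights, so both stay visible in the same picture.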

Viraat • 1 year ago

This is really cool - in a few months if you are looking for a job and are interested hmu

losh • 1 year ago

sure, and thanks!

Krishna Mohan • 1 year ago

Amazing

⛓️☆ ilex ☆⛓️ • 1 year ago

I like how this feels in my brain

losh • 1 year ago

latest

ueaj • 1 year ago

It might look cooler if experts are separate squares, like a grid or something

Louis • 1 year ago

wow

deep Manifold • 1 year ago

Really cool

Karan Lokchandani • 1 year ago

im never deleting this app
