Mariusz Kurman

@mkurman88 • 3,846 subscribers

from bedside to byte_side, MD to AI, 🇵🇱

Videos

I've recorded it for you guys. Trust me, there is no better harness than this. I haven't touched it since Monday. 5x speed

211,503 views • 2 months ago

This is unbelievable. One of my greatest runs ever. This model didn't even see 40B tokens. ~190M, trained from scratch, no self-attention per se, with conv1d, conv2d, and chunk-token attention. It already reasons about user intent, not just blabbing random things related to the query. Not perfect, I know, but still!

12,406 views • 19 days ago

This is it:
A single-person project;
Trained from scratch on TPUs (Google TRC) on the one and only SYNTH dataset by pleias;
Neuroblast-v3 architecture running on my local vLLM instance.
Just wow (I'm amazed by how good it looks; speed is incredible, here slightly slowed by high-resolution recording).
Todos:
> needs agentic fine-tuning in the future
> needs some fine-grained RL

32,474 views • 6 months ago
