
Pavlo Molchanov
@PavloMolchanov • 28,888 subscribers
Director of Research @NVIDIA

We are releasing Star Elastic - turn ONE reasoning LLM into MANY sizes with a single post-training run. 360× cheaper than pretraining a family of models. 7× better than SOTA compression. Split reasoning capability. Plus elastic budget control that beats the accuracy-latency frontier. Paper: HF models: Thread 👇
Pavlo Molchanov • 21,313 views • 22 days ago