Introducing EAGLE, a new method for fast LLM decoding based on compression:
- 3x 🚀 than vanilla
- 2x 🚀 than Lookahead (on its benchmark)
- 1.6x 🚀 than Medusa (on its benchmark)
- provably maintains text distribution
- trainable (in 1~2 days) and testable on RTX 3090s

Playground: Blog: Code:

⚒️ First Principle:...
118,810 views • 2 years ago • via X (Twitter)
