Introducing EAGLE, a new method for fast LLM decoding based on compression:
- 3x 🚀 than vanilla decoding
- 2x 🚀 than Lookahead (on its benchmark)
- 1.6x 🚀 than Medusa (on its benchmark)
- provably maintains text distribution
- trainable (in 1~2 days) and testable on RTX 3090s
Playground: Blog: Code:
⚒️ First Principle:...
118,810 views • 2 years ago • via X (Twitter)
