Eagle

@EagleCorp · 1,739 subscribers

The fastest AI inference in the world, powering leading companies like NVIDIA, Meta, Intel, AMD, and Perplexity.

Shorts

EAGLE-3 introduces two key innovations: Training-Time Test (TTT) and multi-level feature fusion. By removing the feature prediction constraint used in previous EAGLE versions and leveraging semantic features across multiple layers, EAGLE-3 achieves higher acceptance rates and faster generation while remaining lossless.

The result: 5.6× faster than vanilla decoding on the 13B model. Compared to EAGLE-1, EAGLE-3 delivers a 1.8× speedup on the 13B model, and the future EAGLE-4 release is expected to further improve decoding efficiency.

*Inference in the video was conducted on 2x RTX 3090 GPUs at fp16 precision using the Vicuna 13B model.
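To make "acceptance rate" and "lossless" concrete, here is a minimal Python sketch of greedy draft verification, the general idea behind speculative decoding. It is not EAGLE-3's implementation: EAGLE drafts a tree of candidates from fused hidden features, while this sketch assumes a single chain of draft tokens and greedy decoding. The names `verify_draft` and `toy_target` are hypothetical, and `toy_target` is a stand-in for a real target model.

```python
from typing import Callable, List


def verify_draft(
    prefix: List[int],
    draft: List[int],
    target_next: Callable[[List[int]], int],
) -> List[int]:
    """Keep draft tokens while they match the target model's greedy choice.

    The returned tokens are exactly what the target model would have
    produced on its own, which is why the scheme is lossless.
    """
    accepted: List[int] = []
    for tok in draft:
        # A real system scores all draft positions in one parallel forward
        # pass; calling the target once per position keeps this sketch simple.
        expected = target_next(prefix + accepted)
        if tok != expected:
            # First mismatch: discard the rest of the draft and keep the
            # target model's own token instead.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
    # Every draft token was accepted; the target adds one more token.
    accepted.append(target_next(prefix + accepted))
    return accepted


def toy_target(tokens: List[int]) -> int:
    # Hypothetical target model: the next token is the last token plus one.
    return tokens[-1] + 1


# Two of three draft tokens match, so one verification step yields 3 tokens.
print(verify_draft([1, 2], [3, 4, 9], toy_target))
```

The more draft tokens match, the more tokens each verification step yields, so a higher acceptance rate means fewer target-model passes per generated token.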

36,689 views