Video yükleniyor...
Video Yüklenemedi
RL is painfully slow 😭 — bottlenecked by super-long CoT rollout. 🔭 Sparse attention should help, but naive sparse rollout hits a brutal efficiency–stability tradeoff: A tedious trial-and-error sparsity sweep for each dense policy is required before an actual RL run. 🐤Sparrow chirps no more pain! Introduce Sparrow: Sparse... show more
76,047 görüntüleme • 12 gün önce •via X (Twitter)
0 Yorum
Yorum bulunmuyor
Orijinal gönderinin yorumları burada görünecek
