
Ashwin Gopinath
@ashwingop • 5,124 subscribers
CEO of https://t.co/i9EfIbqt9G; used to be a Prof. at MIT; 2x founder
Videos

Sentra just killed Google Research's TurboQuant. SpectralQuant: 5.95× KV cache compression on Mistral 7B at +7.5% perplexity overhead. TurboQuant at the same compression: +22%. That is roughly 3× less degradation. 15-second calibration, once per model, then drop-in for any HuggingFace LLM, ViT, ESM, AlphaFold Evoformer, or VideoMAE. Check out the findings and how the mechanism works below. ↓
Ashwin Gopinath • 59,538 views • 1 month ago