Sentra just killed Google Research's TurboQuant. SpectralQuant: 5.95× KV cache compression on Mistral 7B at +7.5% perplexity overhead. TurboQuant at the same compression: +22%. That is roughly 3× less degradation. 15-second calibration, once per model, then drop-in for any HuggingFace LLM, ViT, ESM, AlphaFold Evoformer, or VideoMAE. Check out the findings and...
59,538 views • 1 month ago • via X (Twitter)
