I made an interactive blog post about how JPEG image compression works:
"The Truth Lies Somewhere in the Middle (of the Generated Tokens)" In autoregressive language models, mean pooling hidden states across generation yields better representations than any token alone. project page: w/ Phillip Isola and Brian Cheung
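As a rough illustration of what "mean pooling hidden states across generation" means, here is a minimal sketch in plain Python. The function name `mean_pool` and the toy 2-dimensional vectors are assumptions made for this example; they are not taken from the paper, and a real model would produce much larger hidden states from a specific layer.

```python
from typing import List, Sequence


def mean_pool(hidden_states: Sequence[Sequence[float]]) -> List[float]:
    """Average per-token hidden-state vectors into a single vector.

    hidden_states holds one vector per generated token, all of equal length.
    (Hypothetical helper for illustration only.)
    """
    if not hidden_states:
        raise ValueError("need at least one generated token")
    dim = len(hidden_states[0])
    if any(len(h) != dim for h in hidden_states):
        raise ValueError("all hidden states must have the same dimension")
    n = len(hidden_states)
    # Average each coordinate over all generated tokens.
    return [sum(h[i] for h in hidden_states) / n for i in range(dim)]


# Toy stand-in for the hidden states of three generated tokens.
generated = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]

pooled = mean_pool(generated)   # representation from all generated tokens
last_token = generated[-1]      # single-token alternative for comparison
```

The pooled vector summarizes every generated token, whereas `last_token` reflects only the final position; the post's claim is that the former makes the better representation.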
LLMs, trained only on text, might already know more about other modalities than we realized; we just need to find ways to elicit it. project page: w/ Phillip Isola and Brian Cheung