How does Exa serve billion-scale vector search? We combine binary quantization, Matryoshka embeddings, SIMD, and IVF into a novel system that can beat alternatives like HNSW. Shreyas gave a talk today at the AI Engineer World's Fair explaining our approach! ⬇️
85,482 views • 1 year ago • via X (Twitter)
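
As a rough illustration of two of the ideas named in the post, here is a minimal sketch of Matryoshka-style truncation followed by binary quantization and a Hamming-distance search. It is not Exa's implementation: the helper names (`truncate`, `binarize`, `hamming`, `search`), the dimensions, and the random corpus are all hypothetical. It also leaves out IVF partitioning and SIMD; a production system would typically compute the bit counts with hardware popcount instructions rather than in pure Python.

```python
import random

def truncate(vec, dims):
    """Matryoshka-style truncation: keep only the first `dims` coordinates."""
    return vec[:dims]

def binarize(vec):
    """Binary quantization: one bit per coordinate (1 if positive), packed into an int."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")

def search(query, codes, dims, k):
    """Brute-force top-k by Hamming distance over the packed codes."""
    q = binarize(truncate(query, dims))
    ranked = sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))
    return ranked[:k]

rng = random.Random(0)
FULL_DIMS, KEPT_DIMS = 256, 64  # hypothetical sizes, chosen only for the demo
corpus = [[rng.gauss(0, 1) for _ in range(FULL_DIMS)] for _ in range(200)]
codes = [binarize(truncate(v, KEPT_DIMS)) for v in corpus]

# Query: a slightly perturbed copy of document 42
query = [x + rng.gauss(0, 0.1) for x in corpus[42]]
top = search(query, codes, KEPT_DIMS, k=5)
print(top)
```

Each document shrinks from 256 floats to a single 64-bit code here, which is the kind of memory saving that makes the approach attractive. The usual follow-up step, re-ranking the top candidates with higher-precision vectors, is omitted.
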
10 Comments

@shreyas4_ @aiDotEngineer I wanna be nearest neighbors w/ @shreyas4_

@shreyas4_ @aiDotEngineer i am still struggling to believe how much cracked engineering talent is coming from that one university. @shreyas4_ what's the secret sauce?

@shreyas4_ @aiDotEngineer Unreal @shreyas4_

@shreyas4_ @aiDotEngineer Great talk, learned a lot of new things. I had this question: I think if you use binary quantization on smaller embeddings, you will get poorer results because of lossy compression (dimension reduction is already done, and then BQ is applied on top).

@shreyas4_ @aiDotEngineer If anyone wants to build a Matryoshka-embedding-based RAG in a minute, give it a try 🙂

@shreyas4_ @aiDotEngineer I'm confused why you said 8TB of memory to hold everything in RAM is too expensive. Back of the envelope Hetzner has 24 core/192GB systems for $366/mo. 8TB would be ~$200k/y or ~18k queries/$ @ 100 QPS
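
The back-of-the-envelope estimate in the comment above can be reproduced with a few lines of arithmetic. This sketch uses only the figures quoted in the comment. It assumes 1 TB = 1024 GB, a full year of sustained 100 QPS, and no extra servers for redundancy or overhead, none of which the comment states.

```python
import math

SERVER_RAM_GB = 192            # per-server RAM quoted in the comment
SERVER_COST_PER_MONTH = 366    # USD per server per month, quoted in the comment
TOTAL_RAM_GB = 8 * 1024        # 8 TB, taking 1 TB = 1024 GB (assumption)
QPS = 100                      # sustained query rate quoted in the comment

# Round up: a fractional server cannot be rented
servers = math.ceil(TOTAL_RAM_GB / SERVER_RAM_GB)
yearly_cost = servers * SERVER_COST_PER_MONTH * 12
queries_per_year = QPS * 365 * 24 * 3600
queries_per_dollar = queries_per_year / yearly_cost

print(servers, yearly_cost, round(queries_per_dollar))
```

Under these assumptions the result is 43 servers, about $189k per year, and roughly 16.7k queries per dollar, which is in the same ballpark as the comment's rounded figures.
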

@shreyas4_ @aiDotEngineer Nice work. So funny how obsessed people were with HNSW…

@shreyas4_ @aiDotEngineer awesome great job guys

@shreyas4_ @aiDotEngineer i love shreyas shreyas is so cool

@shreyas4_ @aiDotEngineer love this - great insight for my product

