sota RAG in 2024
209,306 views • 2 years ago • via X (Twitter)
10 Comments

bm25 & np.array is all you need
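For anyone wondering what that looks like in practice, here is a minimal sketch of BM25 scoring held entirely in a NumPy array. The whitespace tokenizer, the `k1=1.5` and `b=0.75` defaults, the smoothed IDF variant, and the helper names `build_bm25` and `search` are assumptions made for illustration, not something taken from the video.

```python
from collections import Counter

import numpy as np


def tokenize(text):
    # Naive whitespace tokenizer (assumption; a real setup would do more).
    return text.lower().split()


def build_bm25(docs, k1=1.5, b=0.75):
    """Precompute a (num_docs, vocab_size) array of BM25 term weights."""
    tokenized = [tokenize(d) for d in docs]
    terms = sorted({t for doc in tokenized for t in doc})
    vocab = {t: i for i, t in enumerate(terms)}

    # Raw term frequencies, one row per document.
    tf = np.zeros((len(docs), len(vocab)))
    for row, doc in enumerate(tokenized):
        for term, count in Counter(doc).items():
            tf[row, vocab[term]] = count

    doc_len = tf.sum(axis=1, keepdims=True)
    avg_len = doc_len.mean()
    df = (tf > 0).sum(axis=0)  # document frequency per term
    n = len(docs)
    idf = np.log((n - df + 0.5) / (df + 0.5) + 1.0)  # smoothed, non-negative

    weights = idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))
    return vocab, weights


def search(query, vocab, weights, k=2):
    """Sum the precomputed weights of the query terms and return the top k."""
    cols = [vocab[t] for t in tokenize(query) if t in vocab]
    scores = weights[:, cols].sum(axis=1)
    top = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in top]


docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "numpy arrays store vectors",
]
vocab, weights = build_bm25(docs)
results = search("numpy vectors", vocab, weights, k=1)
```

Raising `k` in `search` simply returns more rows from the same array; nothing else needs to change.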

>increase the value of k real

hacked together database? you mean np.array

based

those in the arena know: instead of similarity-searching the query against the chunks, do a similarity search of a fictional answer against the chunks
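The trick described above is often referred to as HyDE (hypothetical document embeddings). A rough sketch, under heavy assumptions: `embed` is a toy hashed bag-of-words vector standing in for a real embedding model, and `fake_answer` is a hypothetical stub standing in for an LLM call that drafts a plausible answer. Neither comes from the video.

```python
import zlib

import numpy as np

DIM = 1024  # size of the toy embedding space (assumption)


def embed(text):
    """Toy embedding: hashed bag of words, L2-normalized. Not a real model."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        vec[zlib.crc32(token.encode()) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec


def fake_answer(query):
    """Hypothetical stand-in for an LLM that writes a made-up answer."""
    canned = {
        "how do i store embeddings?": (
            "store the embeddings as rows of a numpy array "
            "and score them with a dot product"
        ),
    }
    # Fall back to the raw query when no canned answer exists.
    return canned.get(query.lower(), query)


def hyde_search(query, chunks, chunk_matrix, k=1):
    """Embed the fictional answer, not the query, and compare it to chunks."""
    probe = embed(fake_answer(query))
    scores = chunk_matrix @ probe  # cosine similarity, rows are unit vectors
    top = np.argsort(-scores)[:k]
    return [chunks[i] for i in top]


chunks = [
    "keep the embeddings as rows of a numpy array",
    "bm25 ranks documents by term frequency",
    "telegram bots are a fun hobby project",
]
chunk_matrix = np.stack([embed(c) for c in chunks])
best = hyde_search("How do I store embeddings?", chunks, chunk_matrix)
```

The idea is that a drafted answer tends to share more vocabulary and phrasing with the stored chunks than a short question does, so it can land closer to the right chunk.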

ty for the laugh (lots of truths in this) I'm still learning, but I think it starts from the design of your vector DB though. Rather than stuffing everything in one db and using scuffed queries, trying to make 2 and 2 equal 5... It's asking more fundamental questions:
-What are you querying?
-What do you want to get?
-How does this VDB fit into the larger application?
-How can you re-organize, reformat, re-index your data to minimize the complexity of the retrieval task: aka, making distinct chunks more dissimilar to each other?
-How can you optimize your queries?
-Do you need to split the db into several smaller db with more targeted data?
-Do you need a re-ranking system? Get a first batch of data, re-rank using a different/more precise query. Probably more efficient than having a single query trying to find a needle in a haystack, and flipping a coin each time.
Of course, keeping in mind time, number of operations, handling failure cases, etc...
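The re-ranking idea in this comment (grab a first batch cheaply, then re-score it with something stricter) could be sketched roughly like this. Both scorers are illustrative assumptions: `overlap_score` is a crude word-overlap count for the first pass, and `phrase_score` counts query bigrams that appear in order, standing in for a more precise second-stage model.

```python
import numpy as np


def overlap_score(query, doc):
    """Cheap first-pass score: number of distinct words shared with the query."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def phrase_score(query, doc):
    """Stricter second-pass score: query bigrams that appear in order in doc."""
    q = query.lower().split()
    d = doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    return sum(1 for pair in zip(q, q[1:]) if pair in doc_bigrams)


def retrieve_then_rerank(query, docs, first_k=3, final_k=1):
    # Stage 1: score every document cheaply and keep a small candidate batch.
    coarse = np.array([overlap_score(query, d) for d in docs])
    candidates = np.argsort(-coarse, kind="stable")[:first_k]

    # Stage 2: re-score only the candidates with the stricter scorer.
    fine = np.array([phrase_score(query, docs[i]) for i in candidates])
    order = np.argsort(-fine, kind="stable")[:final_k]
    return [docs[candidates[i]] for i in order]


docs = [
    "design a database vector by vector",
    "vector database design for retrieval",
    "re-rank the first batch of results",
    "cooking pasta at home",
]
best = retrieve_then_rerank("vector database design", docs)
```

Here the first two documents tie on word overlap, and only the second pass separates them, which is the "flipping a coin" case the comment warns about.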

That’s pretty hilarious 😂 My Telegram bots work like this, but they’re just a hobby project, not a mission-critical enterprise application, so it’s not like it matters lol

This is amazing

Bloody banger man

Lolll pretty much

