Traditional Chunking can lose context between chunks. (Let's explore a better way!)

Enter Late Chunking…

Here's how it works:

Traditional Chunking
• Split the text into chunks
• Embed each chunk separately

Late Chunking
• Embed the entire text first
• Split it into chunks after the embedding

Advantages...
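
The difference between the two orderings can be sketched in a few lines of Python. This is a minimal illustration only: `toy_contextual_embed` is a hypothetical stand-in for a real long-context embedding model (it uses one round of softmax self-attention over made-up token vectors), and the whitespace tokens and chunk boundaries in `spans` are assumptions chosen for the example.

```python
import numpy as np

def toy_contextual_embed(tokens, dim=8):
    """Hypothetical stand-in for a long-context embedding model.

    Returns one vector per token; each vector depends on every token
    passed in, via a single round of softmax self-attention.
    """
    # Deterministic "static" vector per token, seeded from its characters.
    static = np.stack([
        np.random.default_rng(sum(ord(c) for c in tok)).normal(size=dim)
        for tok in tokens
    ])
    scores = static @ static.T / np.sqrt(dim)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ static  # shape: (len(tokens), dim)

def traditional_chunking(tokens, spans):
    # Split first, then embed each chunk in isolation and mean-pool it.
    return np.stack([
        toy_contextual_embed(tokens[start:end]).mean(axis=0)
        for start, end in spans
    ])

def late_chunking(tokens, spans):
    # Embed the entire text once, then mean-pool the token vectors per chunk.
    token_vectors = toy_contextual_embed(tokens)
    return np.stack([token_vectors[start:end].mean(axis=0) for start, end in spans])

tokens = "late chunking embeds the whole text first then pools each chunk".split()
spans = [(0, 6), (6, 11)]  # assumed token-index boundaries of two chunks

traditional = traditional_chunking(tokens, spans)
late = late_chunking(tokens, spans)
print(traditional.shape, late.shape)
```

In this sketch both functions return one vector per chunk, but the late-chunking vectors are pooled from token embeddings that were computed with the whole text in view, which is where the extra context comes from. With a real model, the token vectors would come from its final hidden layer rather than from this toy function.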
9 comments

No need to choose: you can apply late chunking (to pool token embeddings) _and_ semantic chunking (to partition the document) for even better retrieval results! An example implementation that applies both techniques:

I thought this was going to be a video about cats getting fat later in life…late-chonking

hahaha

Thanks for sharing

😁 You're welcome!

Thanks for sharing

Better chunking does help with the problem of context loss. But I think the harder problem is that the information needed to answer a user's question may be scattered across various parts of the article, so it can only be answered after reading the full text. This situation seems difficult to solve?

Thanks twitter algorithm for putting this in my feed. Great share @femke_plantinga Will late chunking still work for very very long documents?

Thank you for explanations! 😎

