Traditional Chunking can lose context between chunks. (Let's explore a better way!)

Enter Late Chunking… Here's how it works:

Traditional Chunking
• Split the text into chunks
• Embed each chunk separately

Late Chunking
• Embed the entire text first
• Split it into chunks after the embedding

Advantages...
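
To make the difference between the two procedures concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `toy_token_vector` and `toy_encode` are hypothetical stand-ins for a real long-context embedding model (the toy "context" is just the mean of all token vectors in the input), and the chunk spans are hard-coded token ranges.

```python
from typing import List, Tuple

Vector = List[float]


def toy_token_vector(token: str, dim: int = 4) -> Vector:
    """Hypothetical context-free token vector (deterministic, for illustration only)."""
    code = sum(ord(ch) for ch in token)
    return [((code * (i + 1)) % 17) / 17.0 for i in range(dim)]


def mean(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def toy_encode(tokens: List[str]) -> List[Vector]:
    """Toy 'contextual' encoder: blend each token vector with the mean of
    ALL token vectors in the input, so the output depends on what else is present."""
    base = [toy_token_vector(t) for t in tokens]
    context = mean(base)
    return [[0.5 * x + 0.5 * c for x, c in zip(vec, context)] for vec in base]


def traditional_chunking(tokens: List[str], spans: List[Tuple[int, int]]) -> List[Vector]:
    """Split first, then embed each chunk on its own and mean-pool it."""
    return [mean(toy_encode(tokens[start:end])) for start, end in spans]


def late_chunking(tokens: List[str], spans: List[Tuple[int, int]]) -> List[Vector]:
    """Embed the whole text once, then mean-pool the token vectors per chunk span."""
    token_vectors = toy_encode(tokens)
    return [mean(token_vectors[start:end]) for start, end in spans]


tokens = "Berlin is the capital of Germany . It has 3.8 million inhabitants .".split()
spans = [(0, 7), (7, 13)]  # two sentence-sized chunks, as token index ranges

traditional = traditional_chunking(tokens, spans)
late = late_chunking(tokens, spans)
```

In this toy setup, the late-chunked vector for the second sentence ("It has 3.8 million inhabitants .") still carries information from the first sentence, because its token vectors were computed while "Berlin" was part of the input. The traditionally chunked vector for that sentence never sees "Berlin" at all.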

19,718 views • 1 year ago • via X (Twitter)

9 comments

Laurent Sorber • 1 year ago

No need to choose: you can apply late chunking (to pool token embeddings) _and_ semantic chunking (to partition the document) for even better retrieval results! An example implementation that applies both techniques:

dontreadonmeow • 1 year ago

I thought this was going to be a video about cats getting fat later in life…late-chonking

Femke Plantinga • 1 year ago

hahaha

Data knight • 1 year ago

Thanks for sharing

Femke Plantinga • 1 year ago

😁 You're welcome!

Tommy Xiao • 1 year ago

thanks share

八一菜刀 • 1 year ago

Better block to solve the problem of context loss. For context information, I think the problem is that the user's problem may be scattered in various parts of the article, and it needs to be answered after reading the full text. This situation seems difficult to solve?

Deedax Inc. • 1 year ago

Thanks twitter algorithm for putting this in my feed. Great share @femke_plantinga Will late chunking still work for very very long documents?

mert⚡️ • 1 year ago

Thank you for explanations! 😎

Related videos

Researchers built a new RAG approach that:
- does not need a vector DB.
- does not embed data.
- involves no chunking.
- performs no similarity search.

And it hit 98.7% accuracy on a financial benchmark (SOTA).

Here's the core problem with RAG that this new approach solves: Traditional RAG chunks documents, embeds them into vectors, and retrieves based on semantic similarity. But similarity ≠ relevance. When you ask "What were the debt trends in 2023?", a vector search returns chunks that look similar. But the actual answer might be buried in some Appendix, referenced on some page, in a section that shares zero semantic overlap with your query. Traditional RAG would likely never find it.

PageIndex (open-source) solves this. Instead of chunking and embedding, PageIndex builds a hierarchical tree structure from your documents, like an intelligent table of contents. Then it uses reasoning to traverse that tree. For instance, the model doesn't ask: "What text looks similar to this query?" Instead, it asks: "Based on this document's structure, where would a human expert look for this answer?"

That's a fundamentally different approach with:
- No arbitrary chunking that breaks context.
- No vector DB infrastructure to maintain.
- Traceable retrieval to see exactly why it chose a specific section.
- The ability to see in-document references ("see Table 5.3") the way a human would.

But here's the deeper issue that it solves. Vector search treats every query as independent. But documents have structure and logic, like sections that reference other sections and context that builds across pages. PageIndex respects that structure instead of flattening it into embeddings.

Do note that this approach may not make sense in every use case since traditional vector search is still fast, simple, and works well for many applications. But for professional documents that require domain expertise and multi-step reasoning, this tree-based, reasoning-first approach shines.

For instance, PageIndex achieved 98.7% accuracy on FinanceBench, significantly outperforming traditional vector-based RAG systems on complex financial document analysis. Everything is fully open-source, so you can see the full implementation in GitHub and try it yourself. I have shared the GitHub repo in the replies!
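
As a rough illustration of the tree-traversal idea described in this post, here is a small Python sketch. It is not PageIndex's API or implementation: `Node`, `keyword_overlap`, and `traverse` are hypothetical names, the document tree is made up, and the keyword-overlap scorer is only a placeholder for the reasoning step that an LLM would perform.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One entry in a table-of-contents-like tree (hypothetical structure)."""
    title: str
    children: List["Node"] = field(default_factory=list)


def keyword_overlap(query: str, node: Node) -> int:
    """Placeholder for the reasoning step: count query words found in the title.
    A real system would ask a language model which section to open instead."""
    query_words = set(query.lower().split())
    title_words = set(node.title.lower().split())
    return len(query_words & title_words)


def traverse(
    root: Node,
    query: str,
    score: Callable[[str, Node], int] = keyword_overlap,
) -> List[str]:
    """Walk from the root to a leaf, picking the best-scoring child at each level.
    The returned list of titles is the trace of why a section was chosen."""
    path = [root.title]
    node = root
    while node.children:
        node = max(node.children, key=lambda child: score(query, child))
        path.append(node.title)
    return path


# Made-up document outline for demonstration.
report = Node("Annual report", children=[
    Node("Business overview", children=[Node("Products"), Node("Markets")]),
    Node("Financial statements", children=[
        Node("Income statement"),
        Node("Debt and borrowings"),
    ]),
])

path = traverse(report, "debt in the financial statements")
print(" > ".join(path))
```

The `score` parameter is pluggable on purpose: swapping the keyword heuristic for a model call changes how sections are chosen, while the returned path stays a readable record of each decision.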

Avi Chawla

970,893 views • 4 months ago