Traditional Chunking can lose context between chunks. (Let's explore a better way!) Enter Late Chunking… Here's how it works:

Traditional Chunking
• Split the text into chunks
• Embed each chunk separately

Late Chunking
• Embed the entire text first
• Split it into chunks after the embedding

Advantages...
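To make the difference between the two procedures concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `toy_token_vector` and `toy_encode` are hypothetical stand-ins for a real long-context embedding model, and the "context" they model is just a blend with the mean of whatever text the encoder is given. The point is only the order of operations: traditional chunking encodes each chunk in isolation, while late chunking encodes the whole text once and then pools the token vectors that fall inside each chunk span.

```python
from typing import List, Tuple

Vector = List[float]


def toy_token_vector(token: str, dim: int = 4) -> Vector:
    """Hypothetical context-free token embedding (deterministic toy hash)."""
    total = sum(ord(c) for c in token)
    return [float((total * (i + 3)) % 7) for i in range(dim)]


def mean(vectors: List[Vector]) -> Vector:
    """Mean-pool a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def toy_encode(tokens: List[str]) -> List[Vector]:
    """Toy 'contextual' encoder (an assumption, not a real model):
    each token vector is blended with the mean of all tokens the encoder sees."""
    raw = [toy_token_vector(t) for t in tokens]
    context = mean(raw)
    return [[0.5 * x + 0.5 * c for x, c in zip(vec, context)] for vec in raw]


def traditional_chunking(tokens: List[str], spans: List[Tuple[int, int]]) -> List[Vector]:
    """Split first, then embed each chunk on its own: no cross-chunk context."""
    return [mean(toy_encode(tokens[start:end])) for start, end in spans]


def late_chunking(tokens: List[str], spans: List[Tuple[int, int]]) -> List[Vector]:
    """Embed the entire text first, then pool token vectors per chunk span."""
    token_vectors = toy_encode(tokens)
    return [mean(token_vectors[start:end]) for start, end in spans]


tokens = "Berlin is the capital of Germany . It has 3.8 million inhabitants .".split()
spans = [(0, 7), (7, 13)]  # token index ranges of the two sentences

trad = traditional_chunking(tokens, spans)
late = late_chunking(tokens, spans)
print("traditional:", trad)
print("late:       ", late)
```

In this toy setup, the second chunk ("It has 3.8 million inhabitants .") is embedded without any trace of "Berlin" in the traditional variant, whereas the late-chunking vector for the same span includes a contribution from the whole text. A real implementation would replace `toy_encode` with the token-level output of an embedding model whose context window covers the full document.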

19,718 views • 1 year ago • via X (Twitter)

9 comments

Laurent Sorber • 1 year ago

No need to choose: you can apply late chunking (to pool token embeddings) _and_ semantic chunking (to partition the document) for even better retrieval results! An example implementation that applies both techniques:

dontreadonmeow • 1 year ago

I thought this was going to be a video about cats getting fat later in life…late-chonking

Femke Plantinga • 1 year ago

hahaha

Data knight • 1 year ago

Thanks for sharing

Femke Plantinga • 1 year ago

😁 You're welcome!

Tommy Xiao • 1 year ago

thanks share

八一菜刀 • 1 year ago

Better chunking to solve the problem of context loss. As for context, I think the problem is that the information needed for the user's question may be scattered across various parts of the article, so it can only be answered after reading the full text. This situation seems difficult to solve?

Deedax Inc. • 1 year ago

Thanks, Twitter algorithm, for putting this in my feed. Great share, @femke_plantinga! Will late chunking still work for very, very long documents?

mert⚡️ • 1 year ago

Thank you for explanations! 😎

Related videos

Researchers built a new RAG approach that:
- does not need a vector DB.
- does not embed data.
- involves no chunking.
- performs no similarity search.

And it hit 98.7% accuracy on a financial benchmark (SOTA).

Here's the core problem with RAG that this new approach solves: Traditional RAG chunks documents, embeds them into vectors, and retrieves based on semantic similarity. But similarity ≠ relevance. When you ask "What were the debt trends in 2023?", a vector search returns chunks that look similar. But the actual answer might be buried in some Appendix, referenced on some page, in a section that shares zero semantic overlap with your query. Traditional RAG would likely never find it.

PageIndex (open-source) solves this. Instead of chunking and embedding, PageIndex builds a hierarchical tree structure from your documents, like an intelligent table of contents. Then it uses reasoning to traverse that tree. For instance, the model doesn't ask: "What text looks similar to this query?" Instead, it asks: "Based on this document's structure, where would a human expert look for this answer?"

That's a fundamentally different approach with:
- No arbitrary chunking that breaks context.
- No vector DB infrastructure to maintain.
- Traceable retrieval to see exactly why it chose a specific section.
- The ability to see in-document references ("see Table 5.3") the way a human would.

But here's the deeper issue that it solves. Vector search treats every query as independent. But documents have structure and logic, like sections that reference other sections and context that builds across pages. PageIndex respects that structure instead of flattening it into embeddings.

Do note that this approach may not make sense in every use case since traditional vector search is still fast, simple, and works well for many applications. But for professional documents that require domain expertise and multi-step reasoning, this tree-based, reasoning-first approach shines.

For instance, PageIndex achieved 98.7% accuracy on FinanceBench, significantly outperforming traditional vector-based RAG systems on complex financial document analysis. Everything is fully open-source, so you can see the full implementation in GitHub and try it yourself. I have shared the GitHub repo in the replies!
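The tree-based idea described above can be illustrated with a small sketch. This is not PageIndex's code or API: the `Node` class, the `traverse` function, and the sample report are all hypothetical, and the `keyword_overlap` scorer is only a placeholder for the step where the described system would have a model reason about which section to open next. What the sketch does show is the shape of the approach: walk a table-of-contents-like tree from the root, pick one child per level, and keep the path so the retrieval stays traceable.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One section in a hypothetical document tree (like a table of contents)."""
    title: str
    summary: str = ""
    children: List["Node"] = field(default_factory=list)


def keyword_overlap(query: str, node: Node) -> int:
    """Placeholder scorer (an assumption): counts shared words.
    A reasoning-based system would ask a model to choose the section instead."""
    query_words = set(query.lower().split())
    node_words = set((node.title + " " + node.summary).lower().split())
    return len(query_words & node_words)


def traverse(
    root: Node,
    query: str,
    score: Callable[[str, Node], int] = keyword_overlap,
) -> List[str]:
    """Descend from the root, choosing the best-scoring child at each level.
    Returns the list of section titles visited, which makes the choice traceable."""
    path = [root.title]
    node = root
    while node.children:
        node = max(node.children, key=lambda child: score(query, child))
        path.append(node.title)
    return path


# Hypothetical sample document structure.
report = Node("Annual Report", children=[
    Node("Business Overview", "products markets strategy"),
    Node("Financial Statements", "balance sheet income debt", children=[
        Node("Income Statement", "revenue costs profit"),
        Node("Notes on Debt", "debt maturities trends 2023"),
    ]),
])

path = traverse(report, "debt trends in 2023")
print(" > ".join(path))  # Annual Report > Financial Statements > Notes on Debt
```

Swapping `keyword_overlap` for a function that prompts a language model with the query and the child titles and summaries would bring the sketch closer to the reasoning-first traversal the post describes; the traversal loop itself would stay the same.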

Avi Chawla

971,375 views • 4 months ago