Загрузка видео...

Не удалось загрузить видео

На главную

Traditional Chunking can lose context between chunks. (Let's explore a better way!) Enter Late Chunking… Here's how it works: Traditional Chunking • Split the text into chunks • Embed each chunk separately Late Chunking • Embed the entire text first • Split it into chunks after the embedding Advantages...

19,718 просмотров • 1 год назад •via X (Twitter)

Комментарии: 9

Фото профиля Laurent Sorber
Laurent Sorber1 год назад

No need to choose: you can apply late chunking (to pool token embeddings) _and_ semantic chunking (to partition the document) for even better retrieval results! An example implementation that applies both techniques:

Фото профиля dontreadonmeow
dontreadonmeow1 год назад

I thought this was going to be a video about cats getting fat later in life…late-chonking

Фото профиля Femke Plantinga
Femke Plantinga1 год назад

hahaha

Фото профиля Data knight
Data knight1 год назад

Thanks for sharing

Фото профиля Femke Plantinga
Femke Plantinga1 год назад

😁 You're welcome!

Фото профиля Tommy Xiao
Tommy Xiao1 год назад

thanks share

Фото профиля 八一菜刀
八一菜刀1 год назад

Better block to solve the problem of context loss. For context information, I think the problem is that the user‘s problem may be scattered in various parts of the article, and it needs to be answered after reading the full text. This situation seems difficult to solve?

Фото профиля Deedax Inc.
Deedax Inc.1 год назад

Thanks twitter algorithm for putting this in my feed. Great share @femke_plantinga Will late chunking still work for very very long documents?

Фото профиля mert⚡️
mert⚡️1 год назад

Thank you for explanations! 😎

Похожие видео

Researchers built a new RAG approach that: - does not need a vector DB. - does not embed data. - involves no chunking. - performs no similarity search. And it hit 98.7% accuracy on a financial benchmark (SOTA). Here's the core problem with RAG that this new approach solves: Traditional RAG chunks documents, embeds them into vectors, and retrieves based on semantic similarity. But similarity ≠ relevance. When you ask "What were the debt trends in 2023?", a vector search returns chunks that look similar. But the actual answer might be buried in some Appendix, referenced on some page, in a section that shares zero semantic overlap with your query. Traditional RAG would likely never find it. PageIndex (open-source) solves this. Instead of chunking and embedding, PageIndex builds a hierarchical tree structure from your documents, like an intelligent table of contents. Then it uses reasoning to traverse that tree. For instance, the model doesn't ask: "What text looks similar to this query?" Instead, it asks: "Based on this document's structure, where would a human expert look for this answer?" That's a fundamentally different approach with: - No arbitrary chunking that breaks context. - No vector DB infrastructure to maintain. - Traceable retrieval to see exactly why it chose a specific section. - The ability to see in-document references ("see Table 5.3") the way a human would. But here's the deeper issue that it solves. Vector search treats every query as independent. But documents have structure and logic, like sections that reference other sections and context that builds across pages. PageIndex respects that structure instead of flattening it into embeddings. Do note that this approach may not make sense in every use case since traditional vector search is still fast, simple, and works well for many applications. But for professional documents that require domain expertise and multi-step reasoning, this tree-based, reasoning-first approach shines. For instance, PageIndex achieved 98.7% accuracy on FinanceBench, significantly outperforming traditional vector-based RAG systems on complex financial document analysis. Everything is fully open-source, so you can see the full implementation in GitHub and try it yourself. I have shared the GitHub repo in the replies!

Avi Chawla

971,454 просмотров • 4 месяцев назад