Today, we're introducing our document parser built specifically for RAG. The parser combines the best vision, OCR, and vision language models to deliver unmatched accuracy. Try it for free today—the first 500+ pages are on us! 🧵 1/
1,308,495 views • 1 year ago • via X (Twitter)
10 Comments

For many enterprise AI use cases, document parsing is a major bottleneck to achieving sufficient RAG performance. Existing parsers treat documents as disconnected pages, hallucinate critical information, and struggle with complex modalities. These fundamental failures cascade through AI systems, putting a ceiling on end-to-end accuracy. Here's what makes our approach different: 🧵 2/

1️⃣ Holistic document understanding – Our parser automatically infers a document’s hierarchy, which enables teams to add metadata to each chunk that describes its position in the document. This allows agents to understand how different sections relate to each other across hundreds of pages. 🧵 3/
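
To make the idea of position metadata concrete, here is a minimal sketch of attaching a heading path to each chunk. It assumes the parser output has already been flattened into `(kind, level, text)` tuples; that input shape, the `Chunk` class, and `attach_heading_paths` are hypothetical names for illustration, not the parser's actual API or output schema.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A text chunk plus the headings that enclose it (hypothetical schema)."""
    text: str
    heading_path: list[str] = field(default_factory=list)


def attach_heading_paths(blocks):
    """Turn a flat list of (kind, level, text) blocks into chunks with heading paths.

    kind is "heading" or "text"; level is the heading depth (1 = top level)
    and is ignored for text blocks. This input format is an assumption.
    """
    stack = []   # currently open headings as (level, title)
    chunks = []
    for kind, level, text in blocks:
        if kind == "heading":
            # Close every heading at the same or a deeper level.
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, text))
        else:
            chunks.append(Chunk(text=text, heading_path=[t for _, t in stack]))
    return chunks


blocks = [
    ("heading", 1, "Annual Report"),
    ("heading", 2, "Risk Factors"),
    ("text", 0, "Market risk is driven by..."),
    ("heading", 2, "Financials"),
    ("heading", 3, "Revenue"),
    ("text", 0, "Revenue grew year over year."),
]

for chunk in attach_heading_paths(blocks):
    print(" > ".join(chunk.heading_path), "|", chunk.text)
```

With a path like this stored on every chunk, a retriever or agent can tell that two passages hundreds of pages apart sit under the same parent section.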

2️⃣ Minimized hallucinations – Our multi-stage pipeline minimizes severe hallucinations while providing bounding boxes and confidence levels for table extraction to audit its output. 🧵 4/
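
One way bounding boxes and confidence levels could support auditing is sketched below: cells under a confidence threshold are routed to a reviewer together with their page and box. The `TableCell` fields, the 0.0 to 1.0 confidence scale, and the 0.9 threshold are all assumptions for illustration, not the parser's documented output.

```python
from dataclasses import dataclass


@dataclass
class TableCell:
    """One extracted table cell (hypothetical schema)."""
    text: str
    page: int
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1) in page coordinates, assumed
    confidence: float                        # assumed to range from 0.0 to 1.0


def cells_needing_review(cells, threshold=0.9):
    """Return the cells whose confidence falls below the threshold."""
    return [cell for cell in cells if cell.confidence < threshold]


cells = [
    TableCell("Q1 revenue", page=12, bbox=(72.0, 140.0, 210.0, 158.0), confidence=0.99),
    TableCell("4,812", page=12, bbox=(220.0, 140.0, 290.0, 158.0), confidence=0.71),
    TableCell("Q2 revenue", page=12, bbox=(72.0, 160.0, 210.0, 178.0), confidence=0.97),
]

for cell in cells_needing_review(cells):
    # The page and bounding box tell a reviewer exactly where to look in the source.
    print(f"review page {cell.page} at {cell.bbox}: {cell.text!r} ({cell.confidence:.2f})")
```

The threshold trades reviewer effort against risk; a stricter value sends more cells to review.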

3️⃣ Superior handling of complex modalities – Technical diagrams, complex figures, and nested tables are efficiently processed to support all of your enterprise data. 🧵 5/

Read more in our blog: See code examples: Tagging a few folks who may find this interesting: @deedydas @_avichawla @akshay_pachaar @rajhans_samdani @NirDiamantAI @AndrewYNg @sh_reya @soldni @simonw @jxnlco @dorialexander

Check out our open-source version for extracting the hierarchical structure of long documents:

Great work, @douwekiela!

🔥 This is truly the best document parser for LLMs I have ever seen. The text structures are insanely accurate, hierarchies are included, and the user experience of the playground is next level. Working with AI + documents? You're gonna love this!!

I love the user experience for the document parser!

Amazing to see this out in the wild!
