Today, we're introducing our document parser built specifically for RAG. The parser combines the best vision, OCR, and vision language models to deliver unmatched accuracy. Try it for free today—the first 500+ pages are on us! 🧵 1/

1,308,568 views • 1 year ago • via X (Twitter)

10 comments

Douwe Kiela · 1 year ago

For many enterprise AI use cases, document parsing is a major bottleneck to achieving sufficient RAG performance. Existing parsers treat documents as disconnected pages, hallucinate critical information, and struggle with complex modalities. These fundamental failures cascade through AI systems, putting a ceiling on end-to-end accuracy. Here's what makes our approach different: 🧵 2/

Douwe Kiela · 1 year ago

1️⃣ Holistic document understanding – Our parser automatically infers a document’s hierarchy, which enables teams to add metadata to each chunk that describes its position in the document. This allows agents to understand how different sections relate to each other across hundreds of pages. 🧵 3/
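To make the idea of position metadata concrete, here is a minimal sketch of how a heading trail could be attached to each chunk once a document's hierarchy is known. It assumes the input is already a flat list of `(level, text)` blocks, where `level > 0` marks a heading of that depth and `level == 0` marks body text. The `Chunk` class, the `attach_hierarchy` function, and the sample data are hypothetical names for illustration, not the parser's actual API or output schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A piece of body text plus the heading trail that locates it (hypothetical schema)."""
    text: str
    path: list[str] = field(default_factory=list)


def attach_hierarchy(blocks: list[tuple[int, str]]) -> list[Chunk]:
    """Turn (level, text) blocks into chunks annotated with their section path.

    level > 0 means a heading at that depth; level == 0 means body text.
    """
    stack: list[tuple[int, str]] = []  # open headings, outermost first
    chunks: list[Chunk] = []
    for level, text in blocks:
        if level > 0:
            # A new heading closes every open heading at the same or deeper level.
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, text))
        else:
            chunks.append(Chunk(text=text, path=[title for _, title in stack]))
    return chunks


# Illustrative input only; a real parser would produce these blocks.
blocks = [
    (1, "Annual Report"),
    (2, "Risk Factors"),
    (0, "Market risk is discussed here."),
    (2, "Financials"),
    (3, "Revenue"),
    (0, "Revenue grew year over year."),
    (1, "Appendix"),
    (0, "Supplementary notes."),
]
chunks = attach_hierarchy(blocks)
for chunk in chunks:
    print(" > ".join(chunk.path), "|", chunk.text)
```

Storing the path alongside each chunk lets a retriever or agent see which section a passage belongs to without rereading the surrounding pages.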

Douwe Kiela · 1 year ago

2️⃣ Minimized hallucinations – Our multi-stage pipeline minimizes severe hallucinations while providing bounding boxes and confidence levels for table extraction to audit its output. 🧵 4/
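As a rough illustration of how bounding boxes and confidence levels support auditing, the sketch below flags extracted table cells whose confidence falls under a chosen threshold so a reviewer can check them against the page region they came from. The `TableCell` fields, the `cells_needing_review` helper, and the 0.9 threshold are assumptions made for this example and do not reflect the parser's real output format.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TableCell:
    """One extracted table cell (hypothetical schema)."""
    text: str
    page: int
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1) in page coordinates
    confidence: float  # 0.0 to 1.0, higher means more certain


def cells_needing_review(cells: list[TableCell], threshold: float = 0.9) -> list[TableCell]:
    """Return the cells whose confidence is below the threshold, lowest first."""
    flagged = [cell for cell in cells if cell.confidence < threshold]
    return sorted(flagged, key=lambda cell: cell.confidence)


# Illustrative data only.
cells = [
    TableCell("Revenue", page=12, bbox=(72.0, 100.0, 180.0, 118.0), confidence=0.99),
    TableCell("4,310", page=12, bbox=(190.0, 100.0, 250.0, 118.0), confidence=0.62),
    TableCell("4,818", page=12, bbox=(260.0, 100.0, 320.0, 118.0), confidence=0.87),
]
for cell in cells_needing_review(cells):
    print(f"page {cell.page} bbox {cell.bbox}: {cell.text!r} ({cell.confidence:.2f})")
```

Because each flagged cell carries its page and bounding box, a reviewer can jump straight to the source region instead of rereading the whole table.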

Douwe Kiela · 1 year ago

3️⃣ Superior handling of complex modalities – Technical diagrams, complex figures, and nested tables are efficiently processed to support all of your enterprise data. 🧵 5/

Douwe Kiela · 1 year ago

Read more in our blog:
See code examples:
Tagging a few folks who may find this interesting: @deedydas @_avichawla @akshay_pachaar @rajhans_samdani @NirDiamantAI @AndrewYNg @sh_reya @soldni @simonw @jxnlco @dorialexander

Mingtian · 1 year ago

Check out our open-source approach to extracting the hierarchical structure of long documents:

elvis · 1 year ago

Great work, @douwekiela!

Lingxi Li · 1 year ago

🔥 This is truly the best document parser for LLMs I have ever seen. The text structure is insanely accurate, hierarchies are included, and the user experience of the playground is next level. Working with AI + documents? You're going to love this!!

Nina Lopatina · 1 year ago

I love the user experience for the document parser!

Soumitr · 1 year ago

Amazing to see this out in the wild!
