Agentic Document Extraction just got much faster! Median processing time is down from 135 seconds to 8 seconds. It extracts not just text but also diagrams, charts, and form fields from PDFs, producing LLM-ready output. Please see the video for details and some application ideas.
290,420 views • 1 year ago • via X (Twitter)
11 Comments

I challenge you to prove that this can accurately and precisely extract typed, hierarchical JSON output from 25 PDFs without missing a beat. I don't care if it takes an hour, and I don't want to have to predefine the schema. It should learn iteratively, using any reasonable language model and NLP or NER processes and functions, to develop the appropriate regexes and Pydantic models for perfect-fidelity data extraction. Do you think this is possible?

This is the biggest productivity cheat code right now. Kiss reading documents goodbye. You can get an instant summary of any document with this tool.

I was building this with agentic document advance extraction, which is useful.

incredible progress. this will definitely streamline the workflow.

KEEP BUILDING!

You're my love, dear Andrew. I want to kiss you ❤️😘

That's an insane speed-up! From 135s to just 8s?🔥 Total game-changer for document pipelines. I was working on a similar project recently but used PaddleOCR for extraction.

The reduction in processing time is impressive. Efficiency allows more room for innovation. How do we leverage this technology for broader applications? 🚀 #InnovationInsights

@AndrewYNg, how does this breakthrough change your workflow? Speed is crucial, but what about accuracy? #InnovationPotential

Amazing, will look to experiment and integrate.

The improvement drastically cuts down on waiting time.
