Jerry Liu

@jerryjliu0 · 75,193 subscribers

Parsing the world's hardest PDFs @llama_index. Cofounder/CEO. Careers: https://t.co/EUnMNmbCtx Enterprise: https://t.co/Ht5jwxSrQB

Shorts

Parsing PDFs at scale with LLMs is cost prohibitive. Newer models (e.g. Gemini 3) are good at reading PDFs, but you burn unnecessary vision tokens even when the page is text heavy. We’ve built a “cost-optimizer” into LlamaParse that dynamically routes pages to fast/cheap parsing depending on their complexity. Complex pages (e.g. those with tables/charts/diagrams) still get routed to our VLM-enabled modes. This lets you save anywhere from 50-90% of parsing costs, with much higher accuracy than the comparable approach of feeding screenshots into VLMs. Check it out!

55,642 views
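
To make the routing idea in the post above concrete, here is a minimal sketch of per-page routing. It is not LlamaParse's implementation or API: the `PageStats` fields, the `route_page` heuristic, the 200-character threshold, and the 1:10 cost ratio are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    """Hypothetical per-page features; a real parser would compute these itself."""
    text_chars: int   # characters of extractable text on the page
    num_tables: int   # detected tables
    num_figures: int  # detected charts/diagrams/images


def route_page(stats: PageStats, min_text_chars: int = 200) -> str:
    """Return "vlm" for complex pages and "fast" for text-heavy ones (assumed heuristic)."""
    if stats.num_tables > 0 or stats.num_figures > 0:
        return "vlm"
    # Very little extractable text suggests a scanned page, so keep it on the VLM path.
    if stats.text_chars < min_text_chars:
        return "vlm"
    return "fast"


def relative_savings(routes: list[str], fast_cost: float = 1.0, vlm_cost: float = 10.0) -> float:
    """Fraction of cost saved versus sending every page to the VLM (costs are made up)."""
    if not routes:
        return 0.0
    routed = sum(fast_cost if r == "fast" else vlm_cost for r in routes)
    baseline = vlm_cost * len(routes)
    return 1.0 - routed / baseline


pages = [
    PageStats(text_chars=3200, num_tables=0, num_figures=0),
    PageStats(text_chars=900, num_tables=2, num_figures=0),
    PageStats(text_chars=40, num_tables=0, num_figures=0),
]
routes = [route_page(p) for p in pages]
```

The savings you would actually see depend on the share of text-only pages in your documents and on the real price gap between modes, neither of which this sketch knows.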

You might’ve known us as a “RAG framework” company - but we’ve been a best-in-class, agentic document OCR/workflow company for the past 1.5+ years! 📑🤖 We’re building the future of knowledge work over documents. Our website is awesome - check it out if you haven’t already 👇

56,654 views

The `rebel-large` model is awesome for relation extraction 🔗 Paired with CUDA, it’s blazing fast ⚡️. With LlamaIndex 🦙 🦙, we can now build a knowledge graph over any text data super quickly! 🕸️ Full Colab notebook showing how you can use it:

181,780 views
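
As a rough illustration of the step after relation extraction in the post above, the sketch below stores (subject, relation, object) triplets in a small in-memory graph. It does not call `rebel-large` or LlamaIndex; the triplets are hand-written stand-ins for model output, and `build_graph` and `facts_about` are hypothetical helper names.

```python
from collections import defaultdict

# A triplet is (subject, relation, object); these are hand-written examples,
# standing in for what a relation-extraction model would produce.
Triplet = tuple[str, str, str]


def build_graph(triplets: list[Triplet]) -> dict[str, list[tuple[str, str]]]:
    """Group outgoing (relation, object) edges by subject."""
    graph: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for subj, rel, obj in triplets:
        edge = (rel, obj)
        if edge not in graph[subj]:  # skip duplicate extractions
            graph[subj].append(edge)
    return dict(graph)


def facts_about(graph: dict[str, list[tuple[str, str]]], entity: str) -> list[str]:
    """Render the one-hop facts for an entity as plain sentences."""
    return [f"{entity} {rel} {obj}" for rel, obj in graph.get(entity, [])]


triplets = [
    ("Ada Lovelace", "collaborated with", "Charles Babbage"),
    ("Charles Babbage", "designed", "Analytical Engine"),
    ("Ada Lovelace", "collaborated with", "Charles Babbage"),  # duplicate on purpose
]
graph = build_graph(triplets)
```

One-hop facts rendered this way could be handed to an LLM as context for a question about the entity.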

Extracting structured outputs with LLMs is easy. But doing large-scale extraction with precise citations and bounding boxes back to the source documents is way harder. With our latest release in LlamaExtract, we extract citation bounding boxes along with every single key and value within a document. You can see this in the UI. Hover over any k:v pair and you’ll be able to see the corresponding highlights in the source doc. If you’re a human reviewing a million docs (resumes, IDs, invoices, claims, contracts), this will help you 5x your ability to verify values and make sure things are correct. Check out these new extraction upgrades in LlamaCloud:

23,044 views
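
One way to picture "a citation bounding box for every key and value" from the post above is a record type like the one below. This is an assumed shape for illustration, not the LlamaExtract response schema: `BBox`, `CitedField`, and `field_at` are hypothetical names, and the coordinates are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BBox:
    """Hypothetical citation: a rectangle on a page, in page coordinates."""
    page: int
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


@dataclass(frozen=True)
class CitedField:
    """An extracted key/value pair plus where it was found in the source."""
    key: str
    value: str
    citation: BBox


def field_at(fields: list[CitedField], page: int, x: float, y: float) -> Optional[CitedField]:
    """Find the field whose citation box covers a point, e.g. a reviewer's cursor."""
    for f in fields:
        if f.citation.page == page and f.citation.contains(x, y):
            return f
    return None


fields = [
    CitedField("invoice_number", "INV-001", BBox(page=1, x0=400, y0=50, x1=520, y1=70)),
    CitedField("total", "1,250.00", BBox(page=2, x0=430, y0=600, x1=520, y1=620)),
]
hit = field_at(fields, page=2, x=450, y=610)
```

Keeping the box next to each value is what lets a review UI highlight the source region for whichever pair the reviewer is checking.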

Knowledge graphs are really cool 🧠 What’s even cooler is LLMs + knowledge graphs backed by a graph db (NebulaGraph) 🔥 This presents an entirely new stack for retrieval-augmented generation (separate from vector db + top-k)! Now possible with LlamaIndex 🦙 👇

113,738 views

Automate ETL over Financial Data 📊 Most real-world financials are not “database-shaped”, and require a ton of human effort to manipulate/copy an Excel sheet into structured formats for analysis. We recently launched LlamaSheets - a specialized AI agent that automatically structures your Excel spreadsheet into a 2D format for analysis. There are so many use cases for Excel, and accounting is a huge subcategory here. Check it out:

20,383 views

Super excited to feature TWO exciting AGI projects using GPT News 🔥⚡️ 🤖 llama_agi: Automatically execute tasks towards a goal! ⚙️ auto_llama: An internet agent to fulfill tasks. Link: GPT News makes AGI projects straightforward to build. 🧵

76,727 views

Videos

Parse PDFs at lightspeed (this video is at 1x) Absolute cinema

Jerry Liu

125,463 views • 5 days ago

We’re open sourcing the first document OCR benchmark for the agentic era, ParseBench.

Document parsing is the foundation of every AI agent that works with real-world files. ParseBench is a benchmark that measures parsing quality specifically for agent knowledge work:
✅ It optimizes for semantic correctness (instead of exact similarity)
✅ It has the most comprehensive distribution of real-world enterprise documents

It contains ~2,000 human-verified enterprise document pages with 167,000+ test rules across five dimensions that matter most: tables, charts, content faithfulness, semantic formatting, and visual grounding.

We benchmarked 14 known document parsers on ParseBench, from frontier/OSS VLMs to specialized parsers to LlamaParse. Here are some of our findings:
💡 Increasing compute budget yields diminishing returns - Gemini/gpt-5-mini/haiku gain 3-5 points from minimal to high thinking, at 4x the cost.
💡 Charts are the most polarizing dimension for evaluation. Most specialized parsers score below 6%, while some VLM-based parsers do a bit better.
💡 VLMs are great at visual understanding but terrible at layout extraction. GPT-5-mini/haiku score below 10% on our visual grounding task; all specialized parsers do much better.
💡 No method crushes all 5 dimensions at once, but LlamaParse achieves the highest overall score at 84.9%, and is the leader in 4 out of the 5 dimensions.

This is by far the deepest technical work that we’ve published as a company. I would encourage you to start with our blog and explore our links to Hugging Face and GitHub. All the details are in our full 35-page (!!) ArXiv whitepaper.

🌐: Blog: 📄 Paper: 💻 Code: 📊 Dataset: 🎥 YouTube:

Jerry Liu

107,433 views • 1 month ago