Jerry Liu

@jerryjliu0 • 79,097 subscribers

Parsing the world's hardest PDFs @llama_index. cofounder/CEO Careers: https://t.co/EUnMNmbCtx Enterprise: https://t.co/Ht5jwxSrQB

Shorts

Parsing PDFs at scale with LLMs is cost prohibitive. Newer models (e.g. gemini 3) are good at reading pdfs, but you burn unnecessary vision tokens even when the page is text heavy. We’ve built in a “cost-optimizer” within LlamaParse that will dynamically route pages to fast/cheap parsing depending on its complexity. Complex pages (e.g. those with tables/charts/diagrams) will still get routed to our VLM-enabled modes. This will let you save anywhere from 50-90% of parsing costs, at much higher accuracy compared to the comparable mode of feeding screenshots into VLMs. Check it out!

55,848 views
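
The page-routing idea in the cost-optimizer post above can be illustrated with a small sketch. This is not LlamaParse's actual logic: the `PageStats` fields, the `route_page` heuristic, the 200-character threshold, and the relative costs are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    """Cheap per-page signals (hypothetical; not LlamaParse's real features)."""
    page_number: int
    text_chars: int    # characters recovered by plain text extraction
    image_count: int   # embedded images or figures detected
    table_count: int   # table-like regions detected


def route_page(page: PageStats, min_text_chars: int = 200) -> str:
    """Return "vlm" for visually complex pages, otherwise "fast"."""
    has_visual_structure = page.image_count > 0 or page.table_count > 0
    looks_scanned = page.text_chars < min_text_chars  # little extractable text
    return "vlm" if has_visual_structure or looks_scanned else "fast"


def estimate_cost(pages, fast_cost=1.0, vlm_cost=10.0):
    """Compare the routed cost against sending every page to the VLM mode."""
    routed = sum(vlm_cost if route_page(p) == "vlm" else fast_cost for p in pages)
    baseline = vlm_cost * len(pages)
    return routed, baseline


pages = [
    PageStats(1, text_chars=3200, image_count=0, table_count=0),
    PageStats(2, text_chars=2900, image_count=0, table_count=0),
    PageStats(3, text_chars=1500, image_count=1, table_count=2),
    PageStats(4, text_chars=40, image_count=0, table_count=0),
]
routes = [route_page(p) for p in pages]
routed_cost, baseline_cost = estimate_cost(pages)
print(routes)
print(f"routed={routed_cost}, all-VLM baseline={baseline_cost}")
```

With these made-up costs, text-heavy pages take the cheap path and only the chart/table page and the scan-like page pay the VLM price. A real router would presumably use richer layout signals.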

The `rebel-large` model is awesome for relation extraction 🔗 Paired with CUDA, it’s blazing fast ⚡️. With LlamaIndex 🦙 🦙, we can now build a knowledge graph over any text data super quickly! 🕸️ Full Colab notebook showing how you can use it:

181,798 views

You might’ve known us as a “RAG framework” company - but we’ve been a best-in-class, agentic document OCR/workflow company for the past 1.5+ years! 📑🤖 We’re building the future of knowledge work over documents. Our website is awesome - check it out if you haven’t already 👇

56,685 views

Knowledge graphs are really cool 🧠 What’s even cooler is LLMs + knowledge graphs backed by a graph db (NebulaGraph) 🔥 This presents an entirely new stack for retrieval-augmented generation (separate from vector db + top-k)! Now possible with LlamaIndex 🦙 👇

113,738 views

Extracting structured outputs with LLMs is easy. But doing large-scale extraction with precise citations and bounding boxes back to the source documents is way harder. With our latest release in LlamaExtract, we extract citation bounding boxes along with every single key and value within a document. You can see this in the UI. Hover over any k:v pair and you’ll be able to see the corresponding highlights in the source doc. If you’re a human reviewing a million docs (resumes, IDs, invoices, claims, contracts), this will help you 5x your ability to verify values and make sure things are correct. Check out these new extraction upgrades in LlamaCloud:

23,044 views

Super excited to feature TWO exciting AGI projects using GPT News 🔥⚡️ 🤖 llama_agi: Automatically execute tasks towards a goal! ⚙️ auto_llama: An internet agent to fulfill tasks. Link: GPT News makes AGI projects straightforward to build. 🧵

76,791 views

Automate ETL over Financial Data 📊 Most real-world financials are not “database-shaped”, and require a ton of human effort to manipulate/copy an Excel sheet into structured formats for analysis. We recently launched LlamaSheets - a specialized AI agent that automatically structures your Excel spreadsheet into a 2D format for analysis. There are so many use cases for Excel, and accounting is a huge subcategory here. Check it out:

21,354 views

Videos

We built the fastest PDF -> markdown parser in the world 🚀⚡️ AND it’s more accurate than any other open-source, model-free parser (pymupdf4llm, opendataloader, pdf-inspector, markitdown) on 3 standardized benchmarks: olmOCR-bench, opendataloader-bench, ParseBench. Introducing LiteParse v2.1. The v2 base version was already the fastest document->text parser on the planet, and with this new release we’ve introduced markdown. It is fully open-source (Apache 2.0) and free, is usable from CLI/Rust/Node/Python/WASM, and is also installable as a one-click agent skill. Check it out: Come check out LiteParse:

329,900 views • 1 month ago

Parse PDFs at lightspeed (this video is at 1x) Absolute cinema

144,017 views • 1 month ago

Introducing LiteParse - the best model-free document parsing tool for AI agents 💫 ✅ It’s completely open-source and free. ✅ No GPU required, will process ~500 pages in 2 seconds on commodity hardware ✅ More accurate than PyPDF, PyMuPDF, Markdown. Also way more readable - see below for how we parse tables!! ✅ Supports 50+ file formats, from PDFs to Office docs to images ✅ Is designed to plug and play with Claude Code, OpenClaw, and any other AI agent with a one-line skills install. Supports native screenshotting capabilities. We spent years building up LlamaParse by orchestrating state-of-the-art VLMs over the most complex documents. Along the way we realized that you could get quite far on most docs through fast and cheap text parsing. Take a look at the video below. For really complex tables within PDFs, we output them in a spatial grid that’s both AI and human-interpretable. Any other free/light parser like PyPDF will destroy the representation of this table and output a sequential list. This is not a replacement for a VLM-based OCR tool (it requires 0 GPUs and doesn’t use models), but it is shocking how good it is at parsing most documents. Huge shoutout to Logan Markewich and Clelia Bertelli (🦙/acc) for all the work here. Come check it out: Repo:

256,399 views • 4 months ago

We’re open sourcing the first document OCR benchmark for the agentic era, ParseBench. Document parsing is the foundation of every AI agent that works with real-world files. ParseBench is a benchmark that measures parsing quality specifically for agent knowledge work: ✅ It optimizes for semantic correctness (instead of exact similarity) ✅ It has the most comprehensive distribution of real-world enterprise documents It contains ~2,000 human-verified enterprise document pages with 167,000+ test rules across five dimensions that matter most: tables, charts, content faithfulness, semantic formatting, and visual grounding. We benchmarked 14 known document parsers on ParseBench, from frontier/OSS VLMs to specialized parsers to LlamaParse. Here are some of our findings: 💡 Increasing compute budget yields diminishing returns - Gemini/gpt-5-mini/haiku gain 3-5 points from minimal to high thinking, at 4x the cost. 💡 Charts are the most polarizing dimension for evaluation. Most specialized parsers score below 6%, while some VLM-based parsers do a bit better. 💡 VLMs are great at visual understanding but terrible at layout extraction. GPT-5-mini/haiku score below 10% on our visual grounding task, all specialized parsers do much better. 💡 No method crushes all 5 dimensions at once, but LlamaParse achieves the highest overall score at 84.9%, and is the leader in 4 out of the 5 dimensions. This is by far the deepest technical work that we’ve published as a company. I would encourage you to start with our blog and explore our links to Hugging Face to GitHub. All the details are in our full 35-page (!!) ArXiv whitepaper. 🌐: Blog: 📄 Paper: 💻 Code: 📊 Dataset: 🎥 YouTube:

107,866 views • 3 months ago

We're excited to introduce the Retrieval Harness in LlamaParse - which is the 2026 version of RAG over documents. Generalized agents need the right set of tools to scalably search and read through an arbitrary corpus of data (from 10 docs to 1m+ docs). They can already demonstrate great retrieval performance over a local filesystem, but need a proper backend for a large collection of managed data. The Retrieval Harness exposes a diverse set of tools for various needs: 1. Hybrid Retrieval: Combine vector search with keyword search, and let the agent set the alpha value to toggle between the two 2. List Files: a scalable version of `ls` to list files within an index 3. File Grep: enable regex search within a given file 4. File Read: Allow agents to read a subsection from an existing document. The agent can choose to interleave any sequence of these tools in order to complete a variety of tasks, from simple to hard. Come check it out! Blog: Sign up to LlamaParse:

18,615 views • 19 days ago
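
Two of the tools named in the Retrieval Harness post, hybrid retrieval with an alpha value and regex grep, can be sketched in a few lines. This is only an illustration, not the Retrieval Harness API: `hybrid_score`, `rank`, and `grep_lines` are hypothetical names, and the convention that alpha=1.0 means pure vector search is an assumption.

```python
import re


def hybrid_score(vector_score: float, keyword_score: float, alpha: float) -> float:
    """Blend two scores in [0, 1]. Assumed convention: alpha=1.0 is pure
    vector search, alpha=0.0 is pure keyword search."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1.0 - alpha) * keyword_score


def rank(candidates, alpha):
    """Sort (doc_id, vector_score, keyword_score) tuples by blended score, best first."""
    return sorted(candidates, key=lambda c: hybrid_score(c[1], c[2], alpha), reverse=True)


def grep_lines(pattern: str, text: str):
    """Return (line_number, line) pairs matching a regex, like a minimal file grep."""
    regex = re.compile(pattern)
    return [(i, line) for i, line in enumerate(text.splitlines(), start=1)
            if regex.search(line)]


# Made-up scores: one doc is semantically close, the other is a strong keyword hit.
candidates = [
    ("10k.pdf", 0.90, 0.20),
    ("invoice.pdf", 0.30, 0.95),
]
top_semantic = rank(candidates, alpha=0.9)[0][0]
top_keyword = rank(candidates, alpha=0.1)[0][0]
matches = grep_lines(r"Total\s+due", "Invoice 42\nTotal due: $1,200\nThank you")
print(top_semantic, top_keyword, matches)
```

Letting the agent pick alpha per query means it can lean on keyword matching for exact identifiers and on vector similarity for fuzzier questions, then follow up with grep on the files it found.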

LiteParse is the best model-free, open-source document parser for AI agents. It now gets a first-class landing page on our website 💫 Our company mission is building the world's best agentic document processing platform, and liteparse is the central pillar behind our OSS efforts. It's blazing fast (and getting faster soon!), supports 50+ file formats, and is one-shot installable as an agent skill. Webpage: Come check it out:

70,807 views • 3 months ago

I built a Claude Code skill that allows it to generate a deep research report over any collection of complex docs (PDFs, Word, Pptx)... and generate word-level citations and bounding boxes directly back to the source! 📝 Check out “/research-docs”. 1. It parses out text and bounding boxes from every doc with liteparse, in seconds. 2. It then generates a full HTML report of the outputs that lets you see word-level citations in each page. Raw Claude obviously has deep research capabilities, but it lacks an audit trail back to the source. This skill gives you a researched report that can be audited by others. Check it out: LiteParse:

77,259 views • 3 months ago

As frontier models (e.g. Fable 5) continue to push the task horizon of knowledge work automation, it becomes ever more important for humans to be able to audit decisions back to the source context. It is extremely easy for agents to cite an entire document or document page, but much harder for them to trace back to the exact numbers/words/figures within a page. Today we've launched granular bounding boxes within LlamaParse, which allows you to obtain visual citations of every single word in the document. This allows human users to audit exact words and figures - not just general document regions or entire pages! Come check it out:

23,446 views • 1 month ago
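
To make the word-level citation idea in the post above concrete, here is a rough sketch of mapping a quoted phrase back to word bounding boxes. The `WordBox` structure, its coordinate fields, and the helper names are hypothetical and do not reflect LlamaParse's actual output format.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    """One word with its page and rectangle (hypothetical schema)."""
    text: str
    page: int
    x0: float
    y0: float
    x1: float
    y1: float


def find_citation(words, quote):
    """Return the run of WordBox items whose text matches the quote, or []."""
    target = [t.lower() for t in quote.split()]
    n = len(target)
    if n == 0:
        return []
    for start in range(len(words) - n + 1):
        window = words[start:start + n]
        if [w.text.lower() for w in window] == target:
            return window
    return []


def union_box(boxes):
    """Smallest rectangle covering all boxes (assumes they share a page and line)."""
    return (min(b.x0 for b in boxes), min(b.y0 for b in boxes),
            max(b.x1 for b in boxes), max(b.y1 for b in boxes))


words = [
    WordBox("Net", 3, 72.0, 100.0, 90.0, 112.0),
    WordBox("revenue", 3, 94.0, 100.0, 140.0, 112.0),
    WordBox("was", 3, 144.0, 100.0, 162.0, 112.0),
    WordBox("$4.2M", 3, 166.0, 100.0, 200.0, 112.0),
]
hit = find_citation(words, "revenue was $4.2M")
highlight = union_box(hit)
print(hit[0].page, highlight)
```

A reviewer UI could draw `highlight` on the page image so the cited figure is visible at a glance, instead of pointing at the whole page.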

Introducing NotebookLlama - an open-source version of NotebookLM! 📓🦙 NotebookLlama is a full implementation of NotebookLM that includes all the capabilities that makes it so great for researchers+business users: ✅ Create a knowledge repository of documents. Has likely higher accuracy than NotebookLM since it’s using LlamaCloud under the hood for high-quality parsing/extraction over complex docs. ✅ Generate summaries and knowledge graph mind-maps 🤯 ✅ Generate podcasts thanks to ElevenLabs 🗣️ ✅ Agentic chat with docs and view metrics with OpenTelemetry This all lives within an open-source repo so you can clone/modify at will to swap in your own components! Huge shoutout to Clelia Bertelli (🦙/acc) for leading this. Repo: LlamaCloud helps power the parsing/ingestion. You can always use your own stuff too, but in the meantime you can check out LlamaCloud here:

137,793 views • 1 year ago

Claude Cowork is great at reading markdown files, and not great at reading PDFs 📑 I made a simple command that lets you batch parse PDFs -> markdown in an output folder, so you can just point Cowork to it. It lets the agent understand massive amounts of docs much more quickly and accurately (much lower hallucination errors over complex visuals, tables, etc.) In this example, after I parsed 100+ Supreme Court filings into markdown, cowork is able to return the answer in seconds. My wrapper is a simple extension of our very own semtools. Very DIY, will clean this up:

65,057 views • 6 months ago

Claude Code over Excel++ 🤖📊 Claude already 'works' over Excel, but in a naive manner - it writes raw python/openpyxl to analyze an Excel sheet cell-by-cell and generally lacks a semantic understanding of the content. Basically the coding abstractions used are too low-level to have the coding agent accurately do more sophisticated analysis. Our new LlamaSheets API lets you automatically segment and structure complex Excel sheets into well-formatted 2D tables. This both gives Claude Code immediate semantic awareness of the sheet, and allows it to run Pandas/SQL over well-structured dataframes. We've written a guide showing you how specifically to use LlamaSheets with coding agents! Guide: Sign up to LlamaCloud:

75,897 views • 7 months ago

Give Claude Code a semantic filesystem 🗃️🛠️ Giving Claude Code access to the right CLI tools over your filesystem turns it into a general agent capable of automating far more knowledge work beyond code - it can do dynamic financial/legal/medical/technical/backoffice analysis over any subset of documents. With our latest release of semtools 💫, you can now manually or *agentically* create a persistent workspace over any subset of files. This gives Claude Code the ability to get blazing-fast, local semantic search over any data, while still allowing it to chain with commands like grep/cat/etc. so that it can load in dynamic context instead of naive top-k vector search. The coding agent can dynamically index data and use those indexes, instead of having to rebuild it every time. So you get the benefits of fast search along with agentic reasoning over CLI tools mentioned above. Come check it out!

77,373 views • 9 months ago

We built a neat tool that lets you convert a directory of Powerpoint files into clean, structured markdown - that Claude Code / agent SDK / any generalized agent wrapper can easily understand. The pptx skill in Claude Code is quite basic and doesn’t have high-fidelity understanding over graphics/charts/tables. Our project Surreal Slides uses LlamaParse to convert presentations into clean structured data that you can put into a db (SurrealDB) for simple retrieval, without having to take screenshots of the data on the fly. Thanks to Clelia Bertelli (🦙/acc) for this project, check it out:

39,872 views • 4 months ago

Building “RAG 2.0” is just making Claude Code run over your filesystem 🤖🗂️ To make this work well, you need to solve three things: 1️⃣ Virtualize your filesystem to prevent the agent from messing stuff up. AgentFS by Turso is a nice example of how you can give the agent access to a copy of all your files without messing up your raw data. 2️⃣ Parse unstructured documents like PDFs, pptx, Word into an LLM-ready format. Agentic OCR solutions like LlamaParse can help here. 3️⃣ Create an agentic loop with human-in-the-loop. If you want to control the agent implementation instead of using Claude Code out of the box, you can use LlamaIndex 🦙 workflows to help orchestrate these long-running agent tasks. Shoutout Clelia Bertelli (🦙/acc), check it out! Blog: Repo:

55,620 views • 7 months ago

Our core mission today is using AI to solve document OCR. All of our product offerings, from commercial (LlamaParse) to open-source (LiteParse, ParseBench), are fully aligned towards solving this problem. Introducing our revamped website 👇

26,552 views • 2 months ago

Tutorial: Automating KYC with AI agents 🪪🕵 I’m creating a new tutorial series of automating practical document workflows with agents. Every financial institution needs to perform KYC (know your customer) to verify a customer’s identity, and this involves manually sifting through IDs, bank statements, etc. and doing the cross-checking by hand. This is a great first use case for agentic document workflows: 1. Extract identification information from the user supplied ID (license, passport) 2. Extract fields from utility bills/bank statements and then use LLMs to cross-validate extracted fields with the extracted ID fields It obviously doesn’t cover the full e2e process and uses publicly available online data, but should be a good reference guide to get started. To make this work well, you do need high-quality document extraction with confidence scores and citations! Check out the tutorial: If you’re interested come check out LlamaParse:

29,887 views • 3 months ago
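
The cross-checking step in the KYC tutorial post above can be sketched as follows. The post describes using LLMs for cross-validation; this sketch swaps in plain string similarity from Python's standard library purely for illustration. The field names, the sample values, and the 0.85 threshold are all assumptions.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase and collapse punctuation/whitespace so formatting
    differences do not count as mismatches."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in value.lower())
    return " ".join(cleaned.split())


def cross_validate(id_fields: dict, doc_fields: dict, threshold: float = 0.85) -> dict:
    """Compare fields present in both documents and flag likely mismatches."""
    report = {}
    for key in id_fields.keys() & doc_fields.keys():
        ratio = SequenceMatcher(
            None, normalize(id_fields[key]), normalize(doc_fields[key])
        ).ratio()
        report[key] = {"similarity": round(ratio, 2), "match": ratio >= threshold}
    return report


# Made-up values standing in for fields extracted from an ID and a utility bill.
id_fields = {"name": "Jane Q. Doe", "address": "12 Main St, Springfield"}
bill_fields = {"name": "JANE Q DOE", "address": "98 Oak Avenue, Shelbyville"}
report = cross_validate(id_fields, bill_fields)
for field, result in sorted(report.items()):
    print(field, result)
```

Fields flagged as mismatches (here the address) would go to a human reviewer, which is where the confidence scores and citations mentioned in the post become useful.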

I built a form-filling agent that anyone can use 💫 This is an extremely simple but useful (I hope) app. Upload a fillable form 📋, some context files, and chat with the agent to fill the form out automatically ✍️ 1️⃣ Yes it is a Claude Code SDK wrapper 2️⃣ It is better and faster than ChatGPT/Claude UI out of the box 3️⃣ We use LlamaParse to parse the context files, so you can have more trust that we are able to read context without hallucinations (e.g. messy scanned handwriting, drivers license photo, and more). This was one of my holiday Claude Code vibe-coding projects. Built with Opus 4.5, and also powered by Opus 4.5. Feeling the AGI 🫡 App is here: Repo is open-source:

48,375 views • 6 months ago

We Parse PDFs. We spent 7 figures to put this on billboards throughout SF. I thought long and hard about putting something more creative and whimsical. But then you wouldn’t know what we do. AI agents (and humans) are consuming exponentially more documents as they do real work. They need the best quality document parser to not output garbage on downstream tasks. This is what we do today as a company. If you have any PDFs (or other documents), we parse them :) If you’re around SF in June for one of the following events, come stop by our booths: ✅ Snowflake Summit (this week, Booth 1123) ✅ Databricks Data+AI Summit (June 15-18, Booth 137) ✅ AI Engineer World Fair (June 29-July 2, Booth L-G47) You can find us by the same sign we put on our billboards! We Parse PDFs LlamaIndex 🦙

15,150 views • 1 month ago

Parse text from any PDF in seconds and give it to Claude Code 📑🤖 LiteParse is our open-source, model-free document parser that lets you digitize text from any document in seconds. This is especially useful for coding agents, which are great at reading plaintext files but terrible at reading traditional document formats (PDF, Office docs). We have a one-line installable skill that lets you plug LiteParse into Claude Code and 40+ other agents. Repo is here:

30,668 views • 3 months ago

Document OCR benchmarks are still an open problem Existing document OCR benchmarks are either too narrowly focused on a specific type (e.g. FinTabNet, ChartQA), or on documents that aren’t reflective of real-world tasks (e.g. OmniDocBench, OlmOCR-bench over academic papers). ParseBench is a step towards solving this problem. * It tries to comprehensively cover real-world document distributions within the enterprise. * It contains comprehensive evaluations across 5 different dimensions (tables, charts, content faithfulness, formatting, grounding). * It tries to use metrics that optimize for agent semantic understanding rather than structural similarity. We released this yesterday, and there’s a TON of content: 1. Whitepaper 2. HF dataset 3. GitHub repo 4. Blog 5. Video And today, we’re excited to feature our homepage for ParseBench 💫 come check it out! Take a look at some of our other materials if you’re interested: Blog: Paper:

21,657 views • 3 months ago