An underrated issue with document parsing for RAG / agent use cases is dealing with multi-page tables: sometimes a big table spills over onto multiple pages. This breaks chunking algorithms that generally operate at the page level or smaller, and causes LLMs to lose the full view of the... data. With LlamaParse Continuous Mode (in beta), you can now parse a document with multi-page tables and join them into a single table! This means you can now: 💡 Do contiguous chunking for RAG use cases OR 💡 Parse the table for text-to-SQL. Check out our blog post highlighting this feature. Huge shoutout to Pierre-Loic Doulcet and Sacha Bron. Sign up here: It's in beta, let us know your feedback!
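To make the idea of joining a table that spills across pages concrete, here is a minimal sketch in plain Python. It is not LlamaParse's API or algorithm: the function names (`split_row`, `is_separator`, `join_page_tables`) are hypothetical, and it assumes each page yields exactly one Markdown table whose continuation pages either repeat the header row or start directly with data rows.

```python
from typing import List


def split_row(line: str) -> List[str]:
    """Split one Markdown table row into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def is_separator(line: str) -> bool:
    """True for the |---|---| line that follows a Markdown table header."""
    cells = split_row(line)
    return all(cell and set(cell) <= set("-: ") for cell in cells)


def join_page_tables(pages: List[str]) -> str:
    """Join per-page Markdown tables into a single table (hypothetical helper).

    Assumption: one table per page; a continuation page may repeat the header.
    """
    header: List[str] = []
    rows: List[List[str]] = []
    for page in pages:
        lines = [ln for ln in page.splitlines() if ln.strip().startswith("|")]
        parsed = [split_row(ln) for ln in lines if not is_separator(ln)]
        if not parsed:
            continue
        if not header:
            # The first table row seen anywhere becomes the shared header.
            header, parsed = parsed[0], parsed[1:]
        elif parsed[0] == header:
            # Drop the header that was repeated on a continuation page.
            parsed = parsed[1:]
        rows.extend(parsed)
    if not header:
        return ""
    out = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    out += ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join(out)
```

Once the rows live in one table, a chunker can keep them contiguous, or the table can be loaded into a database for text-to-SQL.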

Jerry Liu

71,426 subscribers

24,245 views • 1 year ago • via X (Twitter)

Education Health & Wellness Science & Technology

1 Comment

prov_rsa_full • 1 year ago

will take in mind considering switching from @LangChainAI

Related Videos

Our default document parsing mode is now able to parse a complex research report with multiple embedded charts on a single page. This is the cheapest document OCR model out there that can turn complex visual documents into LLM-ready markdown. This is our agentic mode in LlamaParse. It starts at ~1c per page; if you’re looking to scale consumption we offer volume discounts. Come check it out! Sign up:

Jerry Liu

15,479 views • 5 months ago

We’re excited to officially launch LlamaParse, the first genAI-native document parsing solution. Not only is it better at parsing out images/tables/charts 📊📈 than virtually every other parser, it is now steerable through natural language instructions - output the document in whatever format you desire! It is also the only parsing solution that seamlessly allows you to build accurate RAG over complex documents, free of hallucinations 🔥 We launched it in private preview a few weeks ago and hit 2k users, 1M total PDF pages parsed. And now it’s better than ever. LlamaParse contains the following killer features: ✅ SOTA table/chart extraction ✅ Seamless integration with LlamaIndex 🦙 advanced RAG/agents ✅✨ Natural language Parsing Instructions ✅✨JSON mode and image extraction ✅✨Support for ~10 document types (.pdf, .pptx, .docx, .xml) and more Our pricing is simple: 1k free per day, and additional pages at 0.3c a page, or $3 for 1k pages. If you want advanced document RAG and/or private deployments, come get in touch with us to chat about LlamaCloud. Check out our full blog post here: LlamaParse client repo: Signup at 🦙☁️: Come talk to us:
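As a quick check on the pricing arithmetic quoted above (1k free pages per day, then $3 per 1k pages), here is a small sketch. The function name and parameters are hypothetical, and it assumes the free allowance resets daily and that billing is strictly linear per page.

```python
def daily_cost_usd(pages: int, free_pages: int = 1000, usd_per_1k: float = 3.0) -> float:
    """Estimate one day's cost under the quoted pricing (illustrative only)."""
    # Only pages beyond the daily free allowance are billable.
    billable = max(pages - free_pages, 0)
    return billable * usd_per_1k / 1000
```

For example, parsing 2,500 pages in one day leaves 1,500 billable pages, which comes to $4.50 under these assumptions.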

LlamaIndex 🦙

143,136 views • 2 years ago

Introducing RAGs, a Streamlit app that allows you to create and customize your own RAG agent and then use it over your own data, all with natural language 🔥 Directly inspired by OpenAI GPTs, you can converse with an agent to help you do search/retrieval over any data you specify. The app contains three main pages: 🏠 Home Page : Have a “builder agent” build your RAG agent through natural language (you specify the data). ⚙️ RAG Config: Look at configured parameters 🤖 Use your RAG agent! Check out details below 👇 Blog: Repo:

LlamaIndex 🦙

475,732 views • 2 years ago

Reducto CLI is the best way for agents to interact with document data in any workflow, and I’m really excited to see the range of use cases that are possible with this. It uses our frontier models for anything from parsing to editing, and can help automate full workflows like building spreadsheets or reports with really ugly cases. Raunak’s demo is a great example of what this unlocks

Adit

16,917 views • 7 months ago

LlamaParse now has an official Agent Skill you can use across 40+ agents. With built-in instructions for parsing complex documents, including different formats, tables, charts, and images, your agents gain access to deeper document understanding, not just raw text extraction. 👇 Watch the demo 📖 Read the docs: 🚀 Get started with LlamaCloud:

LlamaIndex 🦙

51,845 views • 4 months ago

PDF parsing is still painful because LLMs reorder text in complex layouts, break tables across pages, and fail on graphs or images. 💡 I'm testing the new open-source OCRFlux model, and the results are really good for a change. OCRFlux is a multimodal, LLM-based toolkit for converting PDFs and images into clean, readable, plain Markdown text. Because the underlying VLM has only 3B parameters, it runs even on a 3090 GPU. The model is available on Hugging Face. The engine that powers OCRFlux teaches the model to rebuild every page and then stitch fragments across pages into one clean Markdown file. It bundles one 3B-parameter vision-language model, fine-tuned from Qwen 2.5-VL-3B-Instruct, for both page parsing and cross-page merging. OCRFlux reads raw page images and, guided by task prompts, outputs Markdown for each page and merges split elements across pages. The evaluation shows Edit Distance Similarity (EDS) 0.967 and cross-page table Tree Edit Distance 0.950, so the parser is both accurate and layout aware. How it works while parsing each page: - Converts the page into text with a natural reading order, even in the presence of multi-column layouts, figures, and insets - Supports complicated tables and equations - Automatically removes headers and footers. Cross-page table/paragraph merging: - Cross-page table merging - Cross-page paragraph merging. A compact vision-language model can beat bigger models once cross-page context is added. 🧵 1/n Read on 👇

Rohan Paul

149,292 views • 1 year ago

LLMs/general agents still struggle to make sense of messy and complex Excel data. You can't easily dump all cells into the context window, and using the code interpreter is inefficient. LlamaSheets is one of my favorite releases from last year. We've embarked on an effort to build state-of-the-art algorithms and models to segment and parse complex Excel tables - including merged cells, hierarchical rows/columns. This includes both sheet-level and table-level understanding. We think there's a ton of use cases that this can help solve (simplest example: structuring your income/P&L/cash statements to be LLM-ready), and we'd love to get your feedback. Come check it out and let us know your thoughts! Sign up: Docs:

Jerry Liu

30,162 views • 7 months ago

I’ve created a full-stack financial analysis bot that can query both text + embedded tables across multiple SEC 10Ks 📊🤖 It’s made possible with `create-llama` scaffolding + LlamaIndex 🦙 advanced RAG, and I’m sharing the full template below for you to clone! It’s more sophisticated than a basic RAG setup: 🦾 Unstructured to parse embedded tables into a node graph 🦾 Recursive retriever to retrieve/query embedded tables + text 🦾 An agent to do chain of thought + document comparisons 🦾 Custom callback to stream intermediate function calls to the UI Our goal is to make building full-stack advanced RAG as easy as possible. Created a repo to host this template + future templates here: Want to submit a project? We’d love contributions! 🙌 One caveat: - Right now the index is lazily built/cached during the first query, we’re working to decouple the ingestion process - Check out `create-llama` if you haven’t already:

Jerry Liu

131,442 views • 2 years ago

Finally, a RAG solution that works with complex documents! Real-world documents can be messy, filled with text, tables, images, and intricate flow charts. Traditional parsing and chunking methods struggle to handle these. So, what’s the solution? We need smart techniques that can intuitively chunk relevant context and understand what’s inside each chunk, whether it's text, images, or diagrams. In this video, I’ll walk you through a breakthrough technique for extracting structured information from complex documents. It's unlike any other technique you've seen before.✨ It takes any unstructured (text, tables, images, flow-charts) input and parses it into a JSON format that LLMs can easily process. I used EyeLevel.AI's GroundX platform for this – a powerful tool that allows you to build a RAG application in just 3 steps. It also comes with a nice Python SDK and can be easily deployed on-premise (K8s cluster)! Try it yourself:

Akshay 🚀

102,660 views • 1 year ago

Parsing a document accurately is one thing. Proving where every value came from is another. When a compliance team reviews an AI extraction, or an auditor needs to sign off on a figure pulled from a financial filing, "it came from this document" isn't enough. They need to see exactly where. The specific cell in the table, the exact line on the page, the precise word the agent used. Most parsers can get you to a paragraph or a table block. That's where the trail ends. Today we're shipping Granular Bounding Boxes in LlamaParse — word, line, and cell level coordinates for every value in your document. The result is a complete, verifiable trail from every extracted value back to its exact source in the document. Built for audit workflows, compliance review, and any pipeline where verification isn't optional. Read the full announcement →

LlamaIndex 🦙

29,090 views • 1 month ago

We’re on a mission to parse the world’s hardest PDFs, and we’d love your help. There are so many document types that introduce a million edge cases for current VLMs / OCR: handwritten forms, badly scanned/rotated pages, charts, diagrams, and more. We are running a contest right now for you to try to extract the hardest PDFs you can find. Come sign up on our agent builder, describe what you want to extract through natural language, upload your document, and show the results. If our platform doesn’t work, even better; this is great feedback for us to improve our service. Either way, submit your project and we’d love to get your feedback! Check out LlamaCloud here:

Jerry Liu

21,005 views • 5 months ago

most people stick to just 1 or 2 tables they are familiar with when writing their SQL queries in we built the ability to search for tables, preview tables and easily sample rows from a table so you can find the ones you need + even discover new tables. data discovery is a big problem even in small companies. most data doesn't get analyzed because most of the company doesn't know it exists

rahul

11,550 views • 1 year ago

Imagine an AI application that can type anywhere you can and use the full context of what's on your screen. This is the application we all deserve (at least if you have macOS.) Check out Omnipilot. It's an app that works with every other macOS application and uses Claude 3.5 Sonnet in the background; it also supports Gemini and GPT-4o. Here is the idea: You can use the tool to ask questions about anything on your screen. Or you can use it to autocomplete the text you are typing. You don't need to copy and paste anymore or waste your time providing context to a model. It sees what you see. It works right where you are. That's pretty cool. Here are a couple of cool examples: • Use it to reply to an email • Use it in the terminal to autocomplete a command • Use it to finish a document • Use it to send a message on Slack. AI at the system level is bonkers. You can read a ton more on their Product Hunt launch page: Thanks to the Omnipilot team for collaborating with me on this post!

Santiago

72,306 views • 2 years ago

Traditional data pipelines don't work for RAG applications. There are 3 issues with them: 1. Traditional data engineering solutions are optimized to handle structured data. RAG applications rely primarily on unstructured data. 2. The connector ecosystem to load data from unstructured data sources is very immature. 3. Traditional solutions do not offer any way to transform unstructured data into an optimized vector search index. The goal of a RAG Pipeline is to solve these problems. The number one objective is to create a reliable vector search index using factual knowledge and relevant context. This sounds easy, but it's one of the biggest challenges we face when building RAG applications. At a high level, there are four different stages in the architecture of a RAG pipeline: 1. Ingestion: Here is where the pipeline loads the information from the data source. 2. Extraction: Where the pipeline processes the input data and decides how to retrieve the text contained inside it. 3. Transform: Where the pipeline chunks the data and generates document embeddings. 4. Load: Where the pipeline creates a search index in a vector database and loads the document embeddings. There are different rabbit holes at each one of these stages. Here are three of them: 1. Ingesting data once is simple. The hard part is refreshing the vector database whenever the original data source changes. 2. Extracting the content of a plain text document is simple. The hard part is to extract content from complex documents containing tables, images, or cross-references. 3. Implementing a continual chunking strategy with an overlap is simple. The hard part is to find the optimal strategy for your specific knowledge base and the way you are planning to query it. In the attached video, I'll show you how you can build an enterprise-grade RAG Pipeline that solves every one of the above problems. I'll use Vectorize. They partnered with me on this post.
You can use them to build RAG pipelines optimized for accurate context retrieval. If you have a few documents lying around, set up a free account and give it a try.
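The "simple" chunking strategy with an overlap mentioned in the Transform stage can be sketched in a few lines. This is a character-based illustration only; the function name and default sizes are assumptions, and it is not Vectorize's implementation. Real pipelines usually chunk by tokens or sentences instead.

```python
from typing import List


def chunk_with_overlap(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

The hard part the post describes is everything this sketch ignores: choosing the size, overlap, and split boundaries that suit a specific knowledge base and query pattern.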

Santiago

40,441 views • 1 year ago

Researchers built a new RAG approach that: - does not need a vector DB. - does not embed data. - involves no chunking. - performs no similarity search. And it hit 98.7% accuracy on a financial benchmark (SOTA). Here's the core problem with RAG that this new approach solves: Traditional RAG chunks documents, embeds them into vectors, and retrieves based on semantic similarity. But similarity ≠ relevance. When you ask "What were the debt trends in 2023?", a vector search returns chunks that look similar. But the actual answer might be buried in some Appendix, referenced on some page, in a section that shares zero semantic overlap with your query. Traditional RAG would likely never find it. PageIndex (open-source) solves this. Instead of chunking and embedding, PageIndex builds a hierarchical tree structure from your documents, like an intelligent table of contents. Then it uses reasoning to traverse that tree. For instance, the model doesn't ask: "What text looks similar to this query?" Instead, it asks: "Based on this document's structure, where would a human expert look for this answer?" That's a fundamentally different approach with: - No arbitrary chunking that breaks context. - No vector DB infrastructure to maintain. - Traceable retrieval to see exactly why it chose a specific section. - The ability to see in-document references ("see Table 5.3") the way a human would. But here's the deeper issue that it solves. Vector search treats every query as independent. But documents have structure and logic, like sections that reference other sections and context that builds across pages. PageIndex respects that structure instead of flattening it into embeddings. Do note that this approach may not make sense in every use case since traditional vector search is still fast, simple, and works well for many applications. But for professional documents that require domain expertise and multi-step reasoning, this tree-based, reasoning-first approach shines. 
For instance, PageIndex achieved 98.7% accuracy on FinanceBench, significantly outperforming traditional vector-based RAG systems on complex financial document analysis. Everything is fully open-source, so you can see the full implementation in GitHub and try it yourself. I have shared the GitHub repo in the replies!
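The tree-based retrieval idea can be illustrated with a toy sketch. This is not PageIndex's code or API: the `Section` class, `keyword_score`, and `traverse` are hypothetical names, and a naive word-overlap score stands in for the LLM reasoning step that decides which branch to follow.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Section:
    """One node of a table-of-contents-like tree (hypothetical structure)."""
    title: str
    summary: str
    children: List["Section"] = field(default_factory=list)


def keyword_score(query: str, section: Section) -> int:
    """Stand-in for the model's judgement: count shared lowercase words."""
    query_words = set(query.lower().split())
    section_words = set((section.title + " " + section.summary).lower().split())
    return len(query_words & section_words)


def traverse(root: Section, query: str,
             score: Callable[[str, Section], int] = keyword_score) -> List[str]:
    """Walk from the root to a leaf, returning the titles along the chosen path."""
    path = [root.title]
    node = root
    while node.children:
        # At each level, descend into the child judged most likely to hold the answer.
        node = max(node.children, key=lambda child: score(query, child))
        path.append(node.title)
    return path
```

The returned path doubles as the retrieval trace: it records which sections were chosen at each level, which is what makes this style of retrieval inspectable.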

Avi Chawla

972,565 views • 6 months ago

Introducing LiteParse - the best model-free document parsing tool for AI agents 💫 ✅ It’s completely open-source and free. ✅ No GPU required, will process ~500 pages in 2 seconds on commodity hardware ✅ More accurate than PyPDF, PyMuPDF, Markdown. Also way more readable - see below for how we parse tables!! ✅ Supports 50+ file formats, from PDFs to Office docs to images ✅ Is designed to plug and play with Claude Code, OpenClaw, and any other AI agent with a one-line skills install. Supports native screenshotting capabilities. We spent years building up LlamaParse by orchestrating state-of-the-art VLMs over the most complex documents. Along the way we realized that you could get quite far on most docs through fast and cheap text parsing. Take a look at the video below. For really complex tables within PDFs, we output them in a spatial grid that’s both AI and human-interpretable. Any other free/light parser like PyPDF will destroy the representation of this table and output a sequential list. This is not a replacement for a VLM-based OCR tool (it requires 0 GPUs and doesn’t use models), but it is shocking how good it is at parsing most documents. Huge shoutout to Logan Markewich and Clelia Bertelli (🦙/acc) for all the work here. Come check it out: Repo:
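To show what a "spatial grid" output could look like compared with a sequential word list, here is a rough sketch. It is not LiteParse's algorithm: the `Word` tuple, `render_grid`, and the fixed `char_width` and `line_height` values are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Word = Tuple[str, float, float]  # (text, x, y) in page units, origin at top-left


def render_grid(words: List[Word], char_width: float = 10.0,
                line_height: float = 12.0) -> str:
    """Place each word on a character grid according to its page position."""
    rows: Dict[int, List[Tuple[int, str]]] = {}
    for text, x, y in words:
        row = round(y / line_height)  # which text line the word falls on
        col = round(x / char_width)   # which character column it starts at
        rows.setdefault(row, []).append((col, text))
    lines = []
    for row in sorted(rows):
        line = ""
        for col, text in sorted(rows[row]):
            # Pad out to the target column, keeping at least one space between words.
            pad = max(col - len(line), 1 if line else 0)
            line += " " * pad + text
        lines.append(line)
    return "\n".join(lines)
```

Because columns are preserved as horizontal offsets, cells that line up on the page also line up in the text, so both a person and a model can read the table structure back out.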

Jerry Liu

256,748 views • 4 months ago

We're listening 👂 LlamaSheets is in beta and we want your feedback. Spreadsheets in the wild are messy: merged cells, broken layouts, headers spanning multiple rows. LlamaSheets (now in beta) extracts regions and tables from these files and outputs clean Parquet files you can actually use.
What it does:
· Identifies and isolates regions in your spreadsheet
· Extracts them as Parquet files (load directly into pandas/polars/DuckDB)
· Generates cell-level metadata (40+ features: formatting, position, data types)
· Creates titles and descriptions for sheets and regions
Built for the spreadsheets nobody wants to deal with manually. We need your feedback: we're in beta and actively improving based on real-world use cases. Try it out and let us know what works, what doesn't, and what you need. Get started here:

LlamaIndex 🦙

35,405 views • 7 months ago

Ollama 0.2 is here! Concurrency is now enabled by default. This unlocks 2 major features:
Parallel requests
Ollama can now serve multiple requests at the same time, using only a little bit of additional memory for each request. This enables use cases such as:
- Handling multiple chat sessions at the same time
- Hosting code completion LLMs for your team
- Processing different parts of a document simultaneously
- Running multiple agents at the same time
Run multiple models
Ollama now supports loading different models at the same time. This improves several use cases:
- Retrieval Augmented Generation (RAG): both the embedding and text completion models can be loaded into memory simultaneously.
- Agents: multiple versions of an agent can now run simultaneously
- Running large and small models side-by-side
Models are automatically loaded and unloaded based on requests and how much GPU memory is available.

ollama

219,417 views • 2 years ago
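
The parallel-request behavior described in the Ollama post can be exercised from a client with a small thread pool. This is a minimal sketch, not Ollama's own client: it assumes a local Ollama server on the default port with the `/api/generate` endpoint, and the model name `llama3` and the helper names `generate` and `generate_many` are illustrative assumptions.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed setup: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def generate(prompt, model="llama3", url=OLLAMA_URL):
    """Send one non-streaming generation request and return the response text.

    The model name is a placeholder; use whichever model is pulled locally.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def generate_many(prompts, send=generate, max_workers=4):
    """Issue all prompts concurrently; results are returned in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send, prompts))
```

Calling `generate_many(["Summarize page 1", "Summarize page 2"])` would send both prompts at once, matching the "processing different parts of a document simultaneously" use case. The `send` parameter is there so the fan-out logic can be tried without a running server.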

Introducing LlamaCloud 🦙🌤️ Today we’re thrilled to introduce LlamaCloud, a managed service designed to bring production-grade data to your LLM and RAG app. Spend less time data wrangling and more time on application logic. Launching with the following components:
1️⃣ LlamaParse 📑: a proprietary parser designed to be really really good at complex documents with embedded tables. Build advanced RAG over semi-structured PDFs, and ask questions that simply aren’t possible with the naive stack. Available publicly day 1 🔥
2️⃣ Managed Ingestion/Retrieval API ⚙️: An API letting you easily ingest/retrieve data from data sources. Opening up in private beta to select enterprises.
We’re excited to be joined by launch users, partners, and collaborators: Mendable @DataStax MongoDB Qdrant NVIDIA + some awesome hackathon projects at the LlamaIndex 🦙 hackathon. Check out our FULL blog post on LlamaCloud and LlamaParse: LlamaParse Client Repo: Sign up for a LlamaCloud account to use LlamaParse: Interested in the broader LlamaCloud offering? Come talk to us: Also we have a slick new website 🌐:

LlamaIndex 🦙

141,250 views • 2 years ago