Video wird geladen...

Video konnte nicht geladen werden

Beim Laden dieses Videos ist ein Problem aufgetreten. Dies könnte an einem vorübergehenden Netzwerkproblem liegen oder das Video ist möglicherweise nicht verfügbar.

Open-sourcing Introspect: MIT-licensed Deep-Research for your internal data! Works with spreadsheets, databases, PDFs, and web search. Has a remarkably simple architecture – Sonnet agent armed with recursive tool calling and 3 default tools. Best for use-cases where you want to combine insights from SQL with unstructured data + data... show more

Rishabh Srivastava

12,550 subscribers

71,449 Aufrufe • vor 1 Jahr •via X (Twitter)

Wissenschaft & Technologie Bildung

Anya Rossi• Live Now

Private livecam show

11 Kommentare

Profilbild von Rishabh Srivastava

Rishabh Srivastavavor 1 Jahr

Github: Live Demo (log in with username admin and password admin):

Profilbild von Dr Shad Katuu 🌐

Dr Shad Katuu 🌐vor 2 Jahren

#OpenAccess article "Soup du jour – existing and emerging trends in archives and records management standardization"

Profilbild von Walter Tay

Walter Tayvor 1 Jahr

hey rishabh, i just took a quick look at the code. claude-3-7-sonnet-latest has a known issue with citations and you can actually improve it just by slightly modifying the prompt (anthropic has a note in their docs on this) looks amazing btw all the best!

Profilbild von Rishabh Srivastava

Rishabh Srivastavavor 1 Jahr

Thank you! Will fix — appreciate this v much!

Profilbild von Breno Brito

Breno Britovor 1 Jahr

This should be great to use with @obsdmd

Profilbild von Aditi Kothari

Aditi Kotharivor 1 Jahr

Cool stuff!!

Profilbild von Rodolfo Rosini ✨☕️

Rodolfo Rosini ✨☕️vor 1 Jahr

I have been looking for something like this for a while, and we would be happily pay for it for commercial use

Profilbild von Rishabh Srivastava

Rishabh Srivastavavor 1 Jahr

Super glad to hear that! We are launching a hosted version on Friday

Profilbild von ABC

ABCvor 1 Jahr

Awesome , will try this soon

Profilbild von Rishabh Srivastava

Rishabh Srivastavavor 1 Jahr

Excited for your feedback! We might still have some rough edges in a few places, but should be able to fix those soon!

Profilbild von Vaibhav (VB) Srivastav

Vaibhav (VB) Srivastavvor 1 Jahr

🔥🔥🔥

Ähnliche Videos

We built Open Gamma An Open Source Presentation Generator with Composio Tool Router and AI SDK - Generate Presentations in Google Slides instantly - Can research on the topic with web search - Can connect and fetch data from multiple data sources like Hubspot, Slack, Notion and more Completely free and open source link to the code:

We built Open Gamma An Open Source Presentation Generator with Composio Tool Router and AI SDK - Generate Presentations in Google Slides instantly - Can research on the topic with web search - Can connect and fetch data from multiple data sources like Hubspot, Slack, Notion and more Completely free and open source link to the code:

Karan Vaidya

40,455 Aufrufe • vor 6 Monaten

We’re excited to share the beta release of our tool use API! Now developers can easily equip any LLM with hosted open-source tools for web search, web crawling, and maps data (+ more coming soon). Under the hood, it's powered by MCP — but that's just an implementation detail ↓

We’re excited to share the beta release of our tool use API! Now developers can easily equip any LLM with hosted open-source tools for web search, web crawling, and maps data (+ more coming soon). Under the hood, it's powered by MCP — but that's just an implementation detail ↓

OpenTools

125,965 Aufrufe • vor 1 Jahr

Introducing Open Deep Research 🔭 An open source AI Agent that reasons large amounts of web data extracted with Open source. Powered by the AI SDK

Introducing Open Deep Research 🔭 An open source AI Agent that reasons large amounts of web data extracted with Open source. Powered by the AI SDK

Nicolas Camara

215,028 Aufrufe • vor 1 Jahr

Today, Box announced new AI Agents to work with enterprise content, powering Deep Research, Search, and enhanced Data Extraction. There’s a tremendous amount of value that’s trapped in unstructured data, from contracts to research data, that we can finally unlock with AI.

Today, Box announced new AI Agents to work with enterprise content, powering Deep Research, Search, and enhanced Data Extraction. There’s a tremendous amount of value that’s trapped in unstructured data, from contracts to research data, that we can finally unlock with AI.

Aaron Levie

60,262 Aufrufe • vor 1 Jahr

$Google open-sourced MCP Toolbox for Databases. I gave it access to everything else. For context, Google's MCP Toolbox for Databases is an open-source server that lets AI agents securely query structured databases like PostgreSQL and MySQL through the MCP protocol However, most enterprise knowledge doesn't actually live in databases. It's scattered across emails, Slack threads, GitHub repos, Salesforce records, customer reviews, and internal docs. So Agents can't see any of it, which means they're working with a fraction of the context they need. I fixed that using MindsDB. It acts as a universal SQL layer that sits on top of all your data sources: structured, semi-structured, and unstructured. This means you can query Salesforce, Gmail, GitHub, S3 files, Jira, and 200+ more sources using SQL syntax. The clever part is how it connects to the MCP Toolbox. MindsDB exposes everything through MySQL, so from the Agent's perspective, it's just running SQL and getting context back. It doesn't know or care that the data came from five different sources behind the scenes. This setup unlocks some powerful capabilities: → One SQL interface for dozens of enterprise sources → Cross-datasource joins (combine GitHub and CRM data in a single query) → Built-in ML capabilities for working with unstructured data → Simple MCP tools that now have massively expanded reach In the video below, the Agent queries GitHub data and a customer review database in one SQL query. So what used to require ETL pipelines and weeks of engineering effort now happens instantly. At the end of the day, AI agents are only as useful as the data they can access. This gives them a lot more to work with. I have shared the GitHub repo in the replies, where you can find more details about this.$

Google open-sourced MCP Toolbox for Databases. I gave it access to everything else. For context, Google's MCP Toolbox for Databases is an open-source server that lets AI agents securely query structured databases like PostgreSQL and MySQL through the MCP protocol However, most enterprise knowledge doesn't actually live in databases. It's scattered across emails, Slack threads, GitHub repos, Salesforce records, customer reviews, and internal docs. So Agents can't see any of it, which means they're working with a fraction of the context they need. I fixed that using MindsDB. It acts as a universal SQL layer that sits on top of all your data sources: structured, semi-structured, and unstructured. This means you can query Salesforce, Gmail, GitHub, S3 files, Jira, and 200+ more sources using SQL syntax. The clever part is how it connects to the MCP Toolbox. MindsDB exposes everything through MySQL, so from the Agent's perspective, it's just running SQL and getting context back. It doesn't know or care that the data came from five different sources behind the scenes. This setup unlocks some powerful capabilities: → One SQL interface for dozens of enterprise sources → Cross-datasource joins (combine GitHub and CRM data in a single query) → Built-in ML capabilities for working with unstructured data → Simple MCP tools that now have massively expanded reach In the video below, the Agent queries GitHub data and a customer review database in one SQL query. So what used to require ETL pipelines and weeks of engineering effort now happens instantly. At the end of the day, AI agents are only as useful as the data they can access. This gives them a lot more to work with. I have shared the GitHub repo in the replies, where you can find more details about this.

Akshay 🚀

39,331 Aufrufe • vor 4 Monaten

Sharing our latest short course: Building and Evaluating Data Agents, created in collaboration with Snowflake and taught by Anupam Datta (Anupam Datta) and Josh Reini (Josh Reini). A data agent extracts data from sources such as files or databases, analyzes it, and provides insights and visualizes its findings. But most data agents struggle with reliability or can't handle multi-step reasoning. In this course, you'll learn to build, trace, and evaluate a multi-agent workflow that plans tasks, pulls context from structured and unstructured data, performs web search, and summarizes or visualizes the final results. Learn more and enroll for free!

Sharing our latest short course: Building and Evaluating Data Agents, created in collaboration with Snowflake and taught by Anupam Datta (Anupam Datta) and Josh Reini (Josh Reini). A data agent extracts data from sources such as files or databases, analyzes it, and provides insights and visualizes its findings. But most data agents struggle with reliability or can't handle multi-step reasoning. In this course, you'll learn to build, trace, and evaluate a multi-agent workflow that plans tasks, pulls context from structured and unstructured data, performs web search, and summarizes or visualizes the final results. Learn more and enroll for free!

DeepLearning.AI

40,745 Aufrufe • vor 9 Monaten

Multi-Agent workflows are the future of AI. OpenAI released new Agent APIs today, and Box built an Agent that combines documents from Box and web search tools to generate answers. Enterprise devs can grab sample code from our GitHub repo to customize with their data.

Multi-Agent workflows are the future of AI. OpenAI released new Agent APIs today, and Box built an Agent that combines documents from Box and web search tools to generate answers. Enterprise devs can grab sample code from our GitHub repo to customize with their data.

Aaron Levie

102,060 Aufrufe • vor 1 Jahr

Thrilled to see Amazon Web Services making a major contribution to the open source AI community with the launch of the Strands Agents, an open source AI agents SDK! The core of Strands is the simple agentic loop that connects the model and tools together, like the two strands of DNA. This model-driven approach to agent building eliminates the need for complex agent orchestration by embracing the capabilities of state-of-the-art models to plan, chain thoughts, call tools, and reflect. Providing open source tools and interoperability with open source protocols is an important part of our strategy to enable an agentic future. Can't wait to see what you build with Strands!

Thrilled to see Amazon Web Services making a major contribution to the open source AI community with the launch of the Strands Agents, an open source AI agents SDK! The core of Strands is the simple agentic loop that connects the model and tools together, like the two strands of DNA. This model-driven approach to agent building eliminates the need for complex agent orchestration by embracing the capabilities of state-of-the-art models to plan, chain thoughts, call tools, and reflect. Providing open source tools and interoperability with open source protocols is an important part of our strategy to enable an agentic future. Can't wait to see what you build with Strands!

Swami Sivasubramanian

32,185 Aufrufe • vor 1 Jahr

INCREDIBLE!! An MCP server to browse the web like humans! Bright Data MCP server provides 30+ powerful tools that allow AI agents to access, search, crawl, and interact with the web without getting blocked. 100% open-source, works at scale!

INCREDIBLE!! An MCP server to browse the web like humans! Bright Data MCP server provides 30+ powerful tools that allow AI agents to access, search, crawl, and interact with the web without getting blocked. 100% open-source, works at scale!

Akshay 🚀

90,404 Aufrufe • vor 1 Jahr

Introducing Exa Agent: frontier web research at less than half the cost of GPT 5.5 and Opus. /agent orchestrates a mixture of cost-effective models to complete any web research task, from simple data enrichments to building gigantic lists.

Introducing Exa Agent: frontier web research at less than half the cost of GPT 5.5 and Opus. /agent orchestrates a mixture of cost-effective models to complete any web research task, from simple data enrichments to building gigantic lists.

Exa

352,589 Aufrufe • vor 10 Tagen

New JavaScript short course: Build a full-stack web application that uses RAG in JavaScript RAG Web Apps with LlamaIndex, taught by Laurie Voss, VP of Developer Relations at LlamaIndex 🦙 and npm co-founder. - Build a RAG application for querying your own data - Develop tools to interact with multiple data sources using an agent that intelligently selects the right tool for your queries - Create a full-stack web app that can chat with your data - Dig further into production-ready techniques, like how to persist your data so you aren’t constantly reindexing, and try the create-llama command line tool from LlamaIndex You can sign up here:

New JavaScript short course: Build a full-stack web application that uses RAG in JavaScript RAG Web Apps with LlamaIndex, taught by Laurie Voss, VP of Developer Relations at LlamaIndex 🦙 and npm co-founder. - Build a RAG application for querying your own data - Develop tools to interact with multiple data sources using an agent that intelligently selects the right tool for your queries - Create a full-stack web app that can chat with your data - Dig further into production-ready techniques, like how to persist your data so you aren’t constantly reindexing, and try the create-llama command line tool from LlamaIndex You can sign up here:

Andrew Ng

218,284 Aufrufe • vor 2 Jahren

Genspark AI agent just released AI Sheets. You can now upload any data and the agent automatically analyzes it, generates reports, and can research the web to find the data for you. 5 powerful use cases + how to try👇:

Genspark AI agent just released AI Sheets. You can now upload any data and the agent automatically analyzes it, generates reports, and can research the web to find the data for you. 5 powerful use cases + how to try👇:

Alvaro Cintas

110,569 Aufrufe • vor 1 Jahr

Reducto CLI is the best way for agents to interact with document data in any workflow, and I’m really excited to see the range of use cases that are possible with this. It uses our frontier models for anything from parsing to editing, and can help automate full workflows like building spreadsheets or reports with really ugly cases. Raunak’s demo is a great example of what this unlocks

Reducto CLI is the best way for agents to interact with document data in any workflow, and I’m really excited to see the range of use cases that are possible with this. It uses our frontier models for anything from parsing to editing, and can help automate full workflows like building spreadsheets or reports with really ugly cases. Raunak’s demo is a great example of what this unlocks

Adit

16,577 Aufrufe • vor 6 Monaten

Traditional data pipelines don't work for RAG applications. There are 3 issues with them: 1. Traditional data engineering solutions are optimized to handle structured data. RAG applications rely primarily on unstructured data. 2. The connector ecosystem to load data from unstructured data sources is very immature. 3. Traditional solutions do not offer any way to transform unstructured data into an optimized vector search index. The goal of a RAG Pipeline is to solve these problems. The number one objective is to create a reliable vector search index using factual knowledge and relevant context. This sounds easy, but it's one of the biggest challenges we face when building RAG applications. At a high level, there are four different stages in the architecture of a RAG pipeline: 1. Ingestion: Here is where the pipeline loads the information from the data source. 2. Extraction: Where the pipeline processes the input data and decides how to retrieve the text contained inside them. 3. Transform: Where the pipeline chunks the data and generates document embeddings. 4. Load: Where the pipeline creates a search index in a vector database and loads the document embeddings. There are different rabbit holes at each one of these stages. Here are three of them: 1. Ingesting data once is simple. The hard part is refreshing the vector database whenever the original data source changes. 2. Extracting the content of a plain text document is simple. The hard part is to extract content from complex documents containing tables, images, or cross-references. 3. A simple continual chunking strategy with an overlap is simple. The hard part is to find the optimal strategy for your specific knowledge base and the way you are planning to query it. In the attached video, I'll show you how you can build an enterprise-grade RAG Pipeline that solves every one of the above problems. I'll use Vectorize. They partnered with me on this post. You can use them to build RAG pipelines optimized for accurate context retrieval. If you have a few documents lying around, set up a free account and give it a try.

Traditional data pipelines don't work for RAG applications. There are 3 issues with them: 1. Traditional data engineering solutions are optimized to handle structured data. RAG applications rely primarily on unstructured data. 2. The connector ecosystem to load data from unstructured data sources is very immature. 3. Traditional solutions do not offer any way to transform unstructured data into an optimized vector search index. The goal of a RAG Pipeline is to solve these problems. The number one objective is to create a reliable vector search index using factual knowledge and relevant context. This sounds easy, but it's one of the biggest challenges we face when building RAG applications. At a high level, there are four different stages in the architecture of a RAG pipeline: 1. Ingestion: Here is where the pipeline loads the information from the data source. 2. Extraction: Where the pipeline processes the input data and decides how to retrieve the text contained inside them. 3. Transform: Where the pipeline chunks the data and generates document embeddings. 4. Load: Where the pipeline creates a search index in a vector database and loads the document embeddings. There are different rabbit holes at each one of these stages. Here are three of them: 1. Ingesting data once is simple. The hard part is refreshing the vector database whenever the original data source changes. 2. Extracting the content of a plain text document is simple. The hard part is to extract content from complex documents containing tables, images, or cross-references. 3. A simple continual chunking strategy with an overlap is simple. The hard part is to find the optimal strategy for your specific knowledge base and the way you are planning to query it. In the attached video, I'll show you how you can build an enterprise-grade RAG Pipeline that solves every one of the above problems. I'll use Vectorize. They partnered with me on this post. You can use them to build RAG pipelines optimized for accurate context retrieval. If you have a few documents lying around, set up a free account and give it a try.

Santiago

40,441 Aufrufe • vor 1 Jahr

Your agents can't keep up with real-time data. Especially when it's scattered across dozens of sources. Most teams waste weeks building custom connectors for every database, API, and data warehouse. Then they build ETL pipelines to sync everything. By the time your agent retrieves the data, it's already outdated. Picture this: Your Postgres database updated 5 minutes ago. Your MongoDB collection changed 2 minutes ago. Your agent is still pulling from yesterday's snapshot. This is why most production RAG systems fail. There's a better approach: MindsDB is an open-source AI platform with a federated data engine that lets you query multiple data sources in real-time using SQL - without moving any data. Here's what makes it different: ↳ Your data stays in place. No ETL pipelines or data duplication ↳ Query Postgres, MongoDB, REST APIs, and more using consistent SQL ↳ JOIN across different sources in real-time with a unified interface ↳ Works with both structured and un-structured data And here's the best part: You don't even need to write SQL. Just describe what you want in plain English, and MindsDB converts it to SQL automatically. The system does all the heavy lifting. The breakthrough for AI agents is simple: When data updates at the source, your agent gets fresh results immediately. No sync delays. No stale embeddings. No custom code for each integration. You can literally write a SQL query that joins a Postgres table with a MongoDB collection and gets live results. This is what production AI applications need but rarely get. In this video, I give you a complete walkthrough of what we just discussed and how to actually do it. Make sure you watch this till the end. I've shared the link to MindsDB's GitHub repo in the next tweet!

Your agents can't keep up with real-time data. Especially when it's scattered across dozens of sources. Most teams waste weeks building custom connectors for every database, API, and data warehouse. Then they build ETL pipelines to sync everything. By the time your agent retrieves the data, it's already outdated. Picture this: Your Postgres database updated 5 minutes ago. Your MongoDB collection changed 2 minutes ago. Your agent is still pulling from yesterday's snapshot. This is why most production RAG systems fail. There's a better approach: MindsDB is an open-source AI platform with a federated data engine that lets you query multiple data sources in real-time using SQL - without moving any data. Here's what makes it different: ↳ Your data stays in place. No ETL pipelines or data duplication ↳ Query Postgres, MongoDB, REST APIs, and more using consistent SQL ↳ JOIN across different sources in real-time with a unified interface ↳ Works with both structured and un-structured data And here's the best part: You don't even need to write SQL. Just describe what you want in plain English, and MindsDB converts it to SQL automatically. The system does all the heavy lifting. The breakthrough for AI agents is simple: When data updates at the source, your agent gets fresh results immediately. No sync delays. No stale embeddings. No custom code for each integration. You can literally write a SQL query that joins a Postgres table with a MongoDB collection and gets live results. This is what production AI applications need but rarely get. In this video, I give you a complete walkthrough of what we just discussed and how to actually do it. Make sure you watch this till the end. I've shared the link to MindsDB's GitHub repo in the next tweet!

Akshay 🚀

65,672 Aufrufe • vor 7 Monaten

🌊TrellisInsights (YC W24) makes unstructured data SQL-ready. It extracts and transforms your unstructured data to SQL-compliant tables with schema you define with natural language. Congrats on the launch, Mac Klinkachorn and Jacky Lin!

🌊TrellisInsights (YC W24) makes unstructured data SQL-ready. It extracts and transforms your unstructured data to SQL-compliant tables with schema you define with natural language. Congrats on the launch, Mac Klinkachorn and Jacky Lin!

Y Combinator

14,213 Aufrufe • vor 2 Jahren

Function calling is a powerful way to extend the capabilities of LLMs and AI agents by letting them use external tools. Our new short course Function calling and Data Extraction with LLMs, created with @NexusflowX and taught by Jiantao Jiao and Venkat, demonstrates how to prompt LLMs to form calls to external functions. You'll work with NexusRavenV2-13B, a 13B parameter open-source model that excels in function calling tasks while still being small enough to host locally. Learn to use function calling to extract structured data from unstructured text and access web APIs, and build an end-to-end application that processes customer service transcripts. You'll learn how to build LLM-powered applications that can analyze feedback, automate data entry, and enhance search. Please get started here:

Function calling is a powerful way to extend the capabilities of LLMs and AI agents by letting them use external tools. Our new short course Function calling and Data Extraction with LLMs, created with @NexusflowX and taught by Jiantao Jiao and Venkat, demonstrates how to prompt LLMs to form calls to external functions. You'll work with NexusRavenV2-13B, a 13B parameter open-source model that excels in function calling tasks while still being small enough to host locally. Learn to use function calling to extract structured data from unstructured text and access web APIs, and build an end-to-end application that processes customer service transcripts. You'll learn how to build LLM-powered applications that can analyze feedback, automate data entry, and enhance search. Please get started here:

Andrew Ng

110,420 Aufrufe • vor 2 Jahren

Web scraping is a critical skill, and yet nobody talks about it. How do you think companies are training their Large Language Models? Where do you think the data comes from? But web scraping goes beyond all of that. Imagine giving an AI agent access to any public online data in real time! I like to call this "web-scrapping on demand", and I'm pretty sure it's going to unlock unlimited power for AI applications. I recorded a quick video to show you how you can do this using Apify. I've talked about them before, and they are collaborating with me on this post. They have one of the best open-source web scraping and browser automation libraries out there: But it gets much better than this! You can use MCP to connect your AI Agents and applications to the Apify platform and use any specialized actor on demand to scrape and process online data. In the video, I used Cursor to scrape LinkedIn posts with the words "Machine Learning" in real time. Worked like a charm with no code needed! Here is a link to the platform: Think about this: You can now feed your AI applications with any public data on demand! We aren't ready for what's coming.

Web scraping is a critical skill, and yet nobody talks about it. How do you think companies are training their Large Language Models? Where do you think the data comes from? But web scraping goes beyond all of that. Imagine giving an AI agent access to any public online data in real time! I like to call this "web-scrapping on demand", and I'm pretty sure it's going to unlock unlimited power for AI applications. I recorded a quick video to show you how you can do this using Apify. I've talked about them before, and they are collaborating with me on this post. They have one of the best open-source web scraping and browser automation libraries out there: But it gets much better than this! You can use MCP to connect your AI Agents and applications to the Apify platform and use any specialized actor on demand to scrape and process online data. In the video, I used Cursor to scrape LinkedIn posts with the words "Machine Learning" in real time. Worked like a charm with no code needed! Here is a link to the platform: Think about this: You can now feed your AI applications with any public data on demand! We aren't ready for what's coming.

Santiago

101,473 Aufrufe • vor 11 Monaten

we're bringing the context of the world wide web to your data. ask anything, get answers in Hex web search for the hex agent is live powered by Parallel Web Systems was so fun to work with Travers & the parallel team on this!

we're bringing the context of the world wide web to your data. ask anything, get answers in Hex web search for the hex agent is live powered by Parallel Web Systems was so fun to work with Travers & the parallel team on this!

Olivia Koshy

13,985 Aufrufe • vor 15 Tagen

Finally! A Text-to-SQL tool that actually works! Vanna is an open-source RAG framework for complex Text-to-SQL generation. It manages dynamic data and allows custom RAG model training for greater accuracy. 100% open-source.

Finally! A Text-to-SQL tool that actually works! Vanna is an open-source RAG framework for complex Text-to-SQL generation. It manages dynamic data and allows custom RAG model training for greater accuracy. 100% open-source.

Akshay 🚀

168,600 Aufrufe • vor 1 Jahr