
Open-sourcing Introspect: MIT-licensed Deep-Research for your internal data! Works with spreadsheets, databases, PDFs, and web search. Has a remarkably simple architecture – Sonnet agent armed with recursive tool calling and 3 default tools. Best for use-cases where you want to combine insights from SQL with unstructured data + data...

71,351 views • 1 year ago • via X (Twitter)

11 comments

Rishabh Srivastava • 1 year ago

GitHub: Live Demo (log in with username admin and password admin):

Dr Shad Katuu 🌐 • 2 years ago

#OpenAccess article "Soup du jour – existing and emerging trends in archives and records management standardization"

Walter Tay • 1 year ago

hey rishabh, i just took a quick look at the code. claude-3-7-sonnet-latest has a known issue with citations and you can actually improve it just by slightly modifying the prompt (anthropic has a note in their docs on this) looks amazing btw all the best!

Rishabh Srivastava • 1 year ago

Thank you! Will fix — appreciate this v much!

Breno Brito • 1 year ago

This should be great to use with @obsdmd

Aditi Kothari • 1 year ago

Cool stuff!!

Rodolfo Rosini ✨☕️ • 1 year ago

I have been looking for something like this for a while, and we would happily pay for it for commercial use

Rishabh Srivastava • 1 year ago

Super glad to hear that! We are launching a hosted version on Friday

ABC • 1 year ago

Awesome, will try this soon

Rishabh Srivastava • 1 year ago

Excited for your feedback! We might still have some rough edges in a few places, but should be able to fix those soon!

Vaibhav (VB) Srivastav • 1 year ago

🔥🔥🔥

Related videos

Google open-sourced MCP Toolbox for Databases. I gave it access to everything else.

For context, Google's MCP Toolbox for Databases is an open-source server that lets AI agents securely query structured databases like PostgreSQL and MySQL through the MCP protocol.

However, most enterprise knowledge doesn't actually live in databases. It's scattered across emails, Slack threads, GitHub repos, Salesforce records, customer reviews, and internal docs. So agents can't see any of it, which means they're working with a fraction of the context they need.

I fixed that using MindsDB. It acts as a universal SQL layer that sits on top of all your data sources: structured, semi-structured, and unstructured. This means you can query Salesforce, Gmail, GitHub, S3 files, Jira, and 200+ more sources using SQL syntax.

The clever part is how it connects to the MCP Toolbox. MindsDB exposes everything through MySQL, so from the agent's perspective, it's just running SQL and getting context back. It doesn't know or care that the data came from five different sources behind the scenes.

This setup unlocks some powerful capabilities:
→ One SQL interface for dozens of enterprise sources
→ Cross-datasource joins (combine GitHub and CRM data in a single query)
→ Built-in ML capabilities for working with unstructured data
→ Simple MCP tools that now have massively expanded reach

In the video below, the agent queries GitHub data and a customer review database in one SQL query. So what used to require ETL pipelines and weeks of engineering effort now happens instantly.

At the end of the day, AI agents are only as useful as the data they can access. This gives them a lot more to work with.

I have shared the GitHub repo in the replies, where you can find more details about this.

Akshay 🚀

39,331 views • 3 months ago
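
The cross-source join described in the post above can be illustrated with a minimal sketch. It is not MindsDB or the MCP Toolbox: it uses Python's built-in `sqlite3` with `ATTACH DATABASE` as a stand-in for a federated SQL layer, and the schema names, tables, columns, and rows (`github.issues`, `crm.reviews`) are invented for illustration.

```python
import sqlite3

# Stand-in for a federated SQL layer (assumption: NOT MindsDB itself).
# Two separately attached in-memory databases play the role of two
# independent data sources behind one SQL connection.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS github")
conn.execute("ATTACH DATABASE ':memory:' AS crm")

# Hypothetical tables and rows, made up for this example.
conn.execute("CREATE TABLE github.issues (id INTEGER, product TEXT, title TEXT)")
conn.execute("CREATE TABLE crm.reviews (product TEXT, rating INTEGER)")
conn.executemany(
    "INSERT INTO github.issues VALUES (?, ?, ?)",
    [
        (1, "sync", "Sync fails on large files"),
        (2, "sync", "Slow upload"),
        (3, "auth", "Token expires early"),
    ],
)
conn.executemany(
    "INSERT INTO crm.reviews VALUES (?, ?)",
    [("sync", 2), ("sync", 3), ("auth", 5)],
)


def issues_vs_ratings(connection):
    """Join both sources in one query: issue count and average rating per product."""
    query = """
        SELECT i.product, i.issue_count, r.avg_rating
        FROM (SELECT product, COUNT(*) AS issue_count
              FROM github.issues GROUP BY product) AS i
        JOIN (SELECT product, AVG(rating) AS avg_rating
              FROM crm.reviews GROUP BY product) AS r
          ON r.product = i.product
        ORDER BY i.product
    """
    return connection.execute(query).fetchall()


rows = issues_vs_ratings(conn)
for product, issue_count, avg_rating in rows:
    print(product, issue_count, avg_rating)
```

From the caller's side there is only one connection and one SQL statement; which source each table lives in is hidden behind the schema prefix, which is the property the post attributes to the SQL layer.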

Traditional data pipelines don't work for RAG applications. There are 3 issues with them:

1. Traditional data engineering solutions are optimized to handle structured data. RAG applications rely primarily on unstructured data.
2. The connector ecosystem to load data from unstructured data sources is very immature.
3. Traditional solutions do not offer any way to transform unstructured data into an optimized vector search index.

The goal of a RAG Pipeline is to solve these problems.

The number one objective is to create a reliable vector search index using factual knowledge and relevant context. This sounds easy, but it's one of the biggest challenges we face when building RAG applications.

At a high level, there are four different stages in the architecture of a RAG pipeline:

1. Ingestion: Here is where the pipeline loads the information from the data source.
2. Extraction: Where the pipeline processes the input data and decides how to retrieve the text contained inside it.
3. Transform: Where the pipeline chunks the data and generates document embeddings.
4. Load: Where the pipeline creates a search index in a vector database and loads the document embeddings.

There are different rabbit holes at each one of these stages. Here are three of them:

1. Ingesting data once is simple. The hard part is refreshing the vector database whenever the original data source changes.
2. Extracting the content of a plain text document is simple. The hard part is to extract content from complex documents containing tables, images, or cross-references.
3. A continuous chunking strategy with an overlap is simple. The hard part is to find the optimal strategy for your specific knowledge base and the way you are planning to query it.

In the attached video, I'll show you how you can build an enterprise-grade RAG Pipeline that solves every one of the above problems.

I'll use Vectorize. They partnered with me on this post. You can use them to build RAG pipelines optimized for accurate context retrieval.

If you have a few documents lying around, set up a free account and give it a try.

Santiago

40,441 views • 1 year ago
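
The Transform and Load stages listed in the post above, including the "chunking with an overlap" strategy, can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not Vectorize's implementation: documents are assumed to be already ingested and extracted into strings, `embed` is a word-count stand-in for a real embedding model, and the "vector database" is just a list. All function names are hypothetical.

```python
import math
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a real model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, chunk_size=40, overlap=10):
    """Transform + Load: chunk each document and store (chunk, vector) pairs."""
    index = []
    for doc in documents:
        for chunk in chunk_text(doc, chunk_size, overlap):
            index.append((chunk, embed(chunk)))
    return index


def search(index, query, k=1):
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


docs = [
    "invoices are stored in postgres",
    "meeting notes live in slack",
]
index = build_index(docs, chunk_size=200, overlap=20)
print(search(index, "where are invoices stored"))
```

Character-based chunks can cut words in half, which is one reason the post calls choosing a chunking strategy the hard part; splitting on sentences or tokens is a common alternative.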

Your agents can't keep up with real-time data. Especially when it's scattered across dozens of sources.

Most teams waste weeks building custom connectors for every database, API, and data warehouse. Then they build ETL pipelines to sync everything. By the time your agent retrieves the data, it's already outdated.

Picture this: Your Postgres database updated 5 minutes ago. Your MongoDB collection changed 2 minutes ago. Your agent is still pulling from yesterday's snapshot. This is why most production RAG systems fail.

There's a better approach: MindsDB is an open-source AI platform with a federated data engine that lets you query multiple data sources in real time using SQL, without moving any data.

Here's what makes it different:
↳ Your data stays in place. No ETL pipelines or data duplication
↳ Query Postgres, MongoDB, REST APIs, and more using consistent SQL
↳ JOIN across different sources in real time with a unified interface
↳ Works with both structured and unstructured data

And here's the best part: You don't even need to write SQL. Just describe what you want in plain English, and MindsDB converts it to SQL automatically. The system does all the heavy lifting.

The breakthrough for AI agents is simple: When data updates at the source, your agent gets fresh results immediately. No sync delays. No stale embeddings. No custom code for each integration.

You can literally write a SQL query that joins a Postgres table with a MongoDB collection and gets live results. This is what production AI applications need but rarely get.

In this video, I give you a complete walkthrough of what we just discussed and how to actually do it. Make sure you watch this till the end.

I've shared the link to MindsDB's GitHub repo in the next tweet!

Akshay 🚀

65,672 views • 7 months ago
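
The stale-snapshot problem from the post above can be shown with a small sketch. It does not use MindsDB: an in-memory `sqlite3` database stands in for the source system, and the `orders` table and both helper functions are hypothetical names chosen for the example.

```python
import sqlite3

# Stand-in for a source system (assumption: any database would do here).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
source.execute("INSERT INTO orders VALUES (1, 'pending')")


def take_snapshot(conn):
    """ETL-style copy: read the rows once and answer from the copy afterwards."""
    return conn.execute("SELECT id, status FROM orders").fetchall()


def query_live(conn):
    """Query-in-place read: run the query against the source on every request."""
    return conn.execute("SELECT id, status FROM orders").fetchall()


# The pipeline syncs once...
snapshot = take_snapshot(source)

# ...and then the source changes.
source.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")

stale = snapshot            # what an agent reading the copy would see
fresh = query_live(source)  # what an agent querying the source would see

print("snapshot:", stale)
print("live:    ", fresh)
```

The trade-off is that every live read hits the source system, so query-in-place shifts load and latency onto the underlying databases instead of onto a sync job.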

Data teams spend weeks on simple requests. (This AI answers them in minutes.)

Most data analysis is repetitive manual tasks. Data teams spend more time on setup than actual analysis.

The workflow usually looks like this:
→ Run some exploratory data analysis in a local Jupyter notebook or environment
→ Pull data from multiple disconnected sources
→ Write code from scratch for every analysis
→ Export static charts that stakeholders can't explore (or wrestle with legacy BI to create a dashboard)
→ Manually send updates via email or Slack when data changes
→ Start over for each new request

Most teams accept this as "how data analysis works." While business decisions wait for insights.

That's where Fabi changes the entire approach. It's a powerful, AI-native platform built for teams that want to boost productivity and supercharge their data workflows. Instead of working on separate tools and manual processes, you collaborate on analysis that automatically delivers insights where teams work.

Here's what makes Fabi different:

AI-Native Analysis Environment
↳ SQL and Python work together with AI assistance that handles coding and debugging automatically.

Smart Automation Workflows
↳ Automatically send AI-powered reports and summaries right where business works in Slack, email, and spreadsheets.

Universal Data Integration
↳ Analyze data from files, Google Sheets, Airtable, plus your data warehouse and databases in one place.

Collaborative Data Apps
↳ Create interactive dashboards that stakeholders can explore and ask follow-up questions directly.

What you can do with Fabi that legacy BI can't:
➟ Send AI-generated insights directly to Slack channels
➟ Automatically email data summaries to stakeholders
➟ Analyze uploaded files without complex ETL processes
➟ Collaborate on analysis like Google Docs for data
➟ Build workflows that push insights to spreadsheets

Perfect for teams that want to move beyond the constraints of legacy and increase their impact.

Teams using Fabi see immediate results:
✓ Insights delivered in minutes instead of days
✓ Reduced context switching between tools
✓ Stakeholders explore data independently
✓ Workflows automated to save hours of manual work

From analysis to automated delivery - all in one AI-native environment.

📌 Try Fabi today:
👉 Follow Fabi.ai and marc for Fabi updates.
🔄 Repost to help other teams streamline data analysis

#DataAnalysis #ModernBI #DataOps #InteractiveDashboards #FabiPartnership #SponsoredByFabi

Andrew Bolis

36,504 views • 9 months ago