Video wird geladen...

Video konnte nicht geladen werden

Beim Laden dieses Videos ist ein Problem aufgetreten. Dies könnte an einem vorübergehenden Netzwerkproblem liegen oder das Video ist möglicherweise nicht verfügbar.

Open-sourcing Introspect: MIT-licensed Deep-Research for your internal data! Works with spreadsheets, databases, PDFs, and web search. Has a remarkably simple architecture – Sonnet agent armed with recursive tool calling and 3 default tools. Best for use-cases where you want to combine insights from SQL with unstructured data + data... show more

Rishabh Srivastava

12,562 subscribers

71,549 Aufrufe • vor 1 Jahr •via X (Twitter)

Wissenschaft & Technologie Bildung

Anya Rossi• Live Now

Private livecam show

11 Kommentare

Profilbild von Rishabh Srivastava

Rishabh Srivastavavor 1 Jahr

Github: Live Demo (log in with username admin and password admin):

Profilbild von Dr Shad Katuu 🌐

Dr Shad Katuu 🌐vor 2 Jahren

#OpenAccess article "Soup du jour – existing and emerging trends in archives and records management standardization"

Profilbild von Walter Tay

Walter Tayvor 1 Jahr

hey rishabh, i just took a quick look at the code. claude-3-7-sonnet-latest has a known issue with citations and you can actually improve it just by slightly modifying the prompt (anthropic has a note in their docs on this) looks amazing btw all the best!

Profilbild von Rishabh Srivastava

Rishabh Srivastavavor 1 Jahr

Thank you! Will fix — appreciate this v much!

Profilbild von Breno Brito

Breno Britovor 1 Jahr

This should be great to use with @obsdmd

Profilbild von Aditi Kothari

Aditi Kotharivor 1 Jahr

Cool stuff!!

Profilbild von Rodolfo Rosini ✨☕️

Rodolfo Rosini ✨☕️vor 1 Jahr

I have been looking for something like this for a while, and we would be happily pay for it for commercial use

Profilbild von Rishabh Srivastava

Rishabh Srivastavavor 1 Jahr

Super glad to hear that! We are launching a hosted version on Friday

Profilbild von ABC

ABCvor 1 Jahr

Awesome , will try this soon

Profilbild von Rishabh Srivastava

Rishabh Srivastavavor 1 Jahr

Excited for your feedback! We might still have some rough edges in a few places, but should be able to fix those soon!

Profilbild von Vaibhav (VB) Srivastav

Vaibhav (VB) Srivastavvor 1 Jahr

🔥🔥🔥

Ähnliche Videos

We built Open Gamma An Open Source Presentation Generator with Composio Tool Router and AI SDK - Generate Presentations in Google Slides instantly - Can research on the topic with web search - Can connect and fetch data from multiple data sources like Hubspot, Slack, Notion and more Completely free and open source link to the code:

We built Open Gamma An Open Source Presentation Generator with Composio Tool Router and AI SDK - Generate Presentations in Google Slides instantly - Can research on the topic with web search - Can connect and fetch data from multiple data sources like Hubspot, Slack, Notion and more Completely free and open source link to the code:

Karan Vaidya

40,455 Aufrufe • vor 7 Monaten

We’re excited to share the beta release of our tool use API! Now developers can easily equip any LLM with hosted open-source tools for web search, web crawling, and maps data (+ more coming soon). Under the hood, it's powered by MCP — but that's just an implementation detail ↓

We’re excited to share the beta release of our tool use API! Now developers can easily equip any LLM with hosted open-source tools for web search, web crawling, and maps data (+ more coming soon). Under the hood, it's powered by MCP — but that's just an implementation detail ↓

OpenTools

125,965 Aufrufe • vor 1 Jahr

$Google open-sourced MCP Toolbox for Databases. I gave it access to everything else. For context, Google's MCP Toolbox for Databases is an open-source server that lets AI agents securely query structured databases like PostgreSQL and MySQL through the MCP protocol However, most enterprise knowledge doesn't actually live in databases. It's scattered across emails, Slack threads, GitHub repos, Salesforce records, customer reviews, and internal docs. So Agents can't see any of it, which means they're working with a fraction of the context they need. I fixed that using MindsDB. It acts as a universal SQL layer that sits on top of all your data sources: structured, semi-structured, and unstructured. This means you can query Salesforce, Gmail, GitHub, S3 files, Jira, and 200+ more sources using SQL syntax. The clever part is how it connects to the MCP Toolbox. MindsDB exposes everything through MySQL, so from the Agent's perspective, it's just running SQL and getting context back. It doesn't know or care that the data came from five different sources behind the scenes. This setup unlocks some powerful capabilities: → One SQL interface for dozens of enterprise sources → Cross-datasource joins (combine GitHub and CRM data in a single query) → Built-in ML capabilities for working with unstructured data → Simple MCP tools that now have massively expanded reach In the video below, the Agent queries GitHub data and a customer review database in one SQL query. So what used to require ETL pipelines and weeks of engineering effort now happens instantly. At the end of the day, AI agents are only as useful as the data they can access. This gives them a lot more to work with. I have shared the GitHub repo in the replies, where you can find more details about this.$

Google open-sourced MCP Toolbox for Databases. I gave it access to everything else. For context, Google's MCP Toolbox for Databases is an open-source server that lets AI agents securely query structured databases like PostgreSQL and MySQL through the MCP protocol However, most enterprise knowledge doesn't actually live in databases. It's scattered across emails, Slack threads, GitHub repos, Salesforce records, customer reviews, and internal docs. So Agents can't see any of it, which means they're working with a fraction of the context they need. I fixed that using MindsDB. It acts as a universal SQL layer that sits on top of all your data sources: structured, semi-structured, and unstructured. This means you can query Salesforce, Gmail, GitHub, S3 files, Jira, and 200+ more sources using SQL syntax. The clever part is how it connects to the MCP Toolbox. MindsDB exposes everything through MySQL, so from the Agent's perspective, it's just running SQL and getting context back. It doesn't know or care that the data came from five different sources behind the scenes. This setup unlocks some powerful capabilities: → One SQL interface for dozens of enterprise sources → Cross-datasource joins (combine GitHub and CRM data in a single query) → Built-in ML capabilities for working with unstructured data → Simple MCP tools that now have massively expanded reach In the video below, the Agent queries GitHub data and a customer review database in one SQL query. So what used to require ETL pipelines and weeks of engineering effort now happens instantly. At the end of the day, AI agents are only as useful as the data they can access. This gives them a lot more to work with. I have shared the GitHub repo in the replies, where you can find more details about this.

Akshay 🚀

39,331 Aufrufe • vor 5 Monaten

Sharing our latest short course: Building and Evaluating Data Agents, created in collaboration with Snowflake and taught by Anupam Datta (Anupam Datta) and Josh Reini (Josh Reini). A data agent extracts data from sources such as files or databases, analyzes it, and provides insights and visualizes its findings. But most data agents struggle with reliability or can't handle multi-step reasoning. In this course, you'll learn to build, trace, and evaluate a multi-agent workflow that plans tasks, pulls context from structured and unstructured data, performs web search, and summarizes or visualizes the final results. Learn more and enroll for free!

Sharing our latest short course: Building and Evaluating Data Agents, created in collaboration with Snowflake and taught by Anupam Datta (Anupam Datta) and Josh Reini (Josh Reini). A data agent extracts data from sources such as files or databases, analyzes it, and provides insights and visualizes its findings. But most data agents struggle with reliability or can't handle multi-step reasoning. In this course, you'll learn to build, trace, and evaluate a multi-agent workflow that plans tasks, pulls context from structured and unstructured data, performs web search, and summarizes or visualizes the final results. Learn more and enroll for free!

DeepLearning.AI

40,810 Aufrufe • vor 10 Monaten

Multi-Agent workflows are the future of AI. OpenAI released new Agent APIs today, and Box built an Agent that combines documents from Box and web search tools to generate answers. Enterprise devs can grab sample code from our GitHub repo to customize with their data.

Multi-Agent workflows are the future of AI. OpenAI released new Agent APIs today, and Box built an Agent that combines documents from Box and web search tools to generate answers. Enterprise devs can grab sample code from our GitHub repo to customize with their data.

Aaron Levie

102,062 Aufrufe • vor 1 Jahr

Thrilled to see Amazon Web Services making a major contribution to the open source AI community with the launch of the Strands Agents, an open source AI agents SDK! The core of Strands is the simple agentic loop that connects the model and tools together, like the two strands of DNA. This model-driven approach to agent building eliminates the need for complex agent orchestration by embracing the capabilities of state-of-the-art models to plan, chain thoughts, call tools, and reflect. Providing open source tools and interoperability with open source protocols is an important part of our strategy to enable an agentic future. Can't wait to see what you build with Strands!

Thrilled to see Amazon Web Services making a major contribution to the open source AI community with the launch of the Strands Agents, an open source AI agents SDK! The core of Strands is the simple agentic loop that connects the model and tools together, like the two strands of DNA. This model-driven approach to agent building eliminates the need for complex agent orchestration by embracing the capabilities of state-of-the-art models to plan, chain thoughts, call tools, and reflect. Providing open source tools and interoperability with open source protocols is an important part of our strategy to enable an agentic future. Can't wait to see what you build with Strands!

Swami Sivasubramanian

32,185 Aufrufe • vor 1 Jahr

New JavaScript short course: Build a full-stack web application that uses RAG in JavaScript RAG Web Apps with LlamaIndex, taught by Laurie Voss, VP of Developer Relations at LlamaIndex 🦙 and npm co-founder. - Build a RAG application for querying your own data - Develop tools to interact with multiple data sources using an agent that intelligently selects the right tool for your queries - Create a full-stack web app that can chat with your data - Dig further into production-ready techniques, like how to persist your data so you aren’t constantly reindexing, and try the create-llama command line tool from LlamaIndex You can sign up here:

New JavaScript short course: Build a full-stack web application that uses RAG in JavaScript RAG Web Apps with LlamaIndex, taught by Laurie Voss, VP of Developer Relations at LlamaIndex 🦙 and npm co-founder. - Build a RAG application for querying your own data - Develop tools to interact with multiple data sources using an agent that intelligently selects the right tool for your queries - Create a full-stack web app that can chat with your data - Dig further into production-ready techniques, like how to persist your data so you aren’t constantly reindexing, and try the create-llama command line tool from LlamaIndex You can sign up here:

Andrew Ng

218,284 Aufrufe • vor 2 Jahren

Reducto CLI is the best way for agents to interact with document data in any workflow, and I’m really excited to see the range of use cases that are possible with this. It uses our frontier models for anything from parsing to editing, and can help automate full workflows like building spreadsheets or reports with really ugly cases. Raunak’s demo is a great example of what this unlocks

Reducto CLI is the best way for agents to interact with document data in any workflow, and I’m really excited to see the range of use cases that are possible with this. It uses our frontier models for anything from parsing to editing, and can help automate full workflows like building spreadsheets or reports with really ugly cases. Raunak’s demo is a great example of what this unlocks

Adit

16,917 Aufrufe • vor 7 Monaten

Traditional data pipelines don't work for RAG applications. There are 3 issues with them: 1. Traditional data engineering solutions are optimized to handle structured data. RAG applications rely primarily on unstructured data. 2. The connector ecosystem to load data from unstructured data sources is very immature. 3. Traditional solutions do not offer any way to transform unstructured data into an optimized vector search index. The goal of a RAG Pipeline is to solve these problems. The number one objective is to create a reliable vector search index using factual knowledge and relevant context. This sounds easy, but it's one of the biggest challenges we face when building RAG applications. At a high level, there are four different stages in the architecture of a RAG pipeline: 1. Ingestion: Here is where the pipeline loads the information from the data source. 2. Extraction: Where the pipeline processes the input data and decides how to retrieve the text contained inside them. 3. Transform: Where the pipeline chunks the data and generates document embeddings. 4. Load: Where the pipeline creates a search index in a vector database and loads the document embeddings. There are different rabbit holes at each one of these stages. Here are three of them: 1. Ingesting data once is simple. The hard part is refreshing the vector database whenever the original data source changes. 2. Extracting the content of a plain text document is simple. The hard part is to extract content from complex documents containing tables, images, or cross-references. 3. A simple continual chunking strategy with an overlap is simple. The hard part is to find the optimal strategy for your specific knowledge base and the way you are planning to query it. In the attached video, I'll show you how you can build an enterprise-grade RAG Pipeline that solves every one of the above problems. I'll use Vectorize. They partnered with me on this post. You can use them to build RAG pipelines optimized for accurate context retrieval. If you have a few documents lying around, set up a free account and give it a try.

Traditional data pipelines don't work for RAG applications. There are 3 issues with them: 1. Traditional data engineering solutions are optimized to handle structured data. RAG applications rely primarily on unstructured data. 2. The connector ecosystem to load data from unstructured data sources is very immature. 3. Traditional solutions do not offer any way to transform unstructured data into an optimized vector search index. The goal of a RAG Pipeline is to solve these problems. The number one objective is to create a reliable vector search index using factual knowledge and relevant context. This sounds easy, but it's one of the biggest challenges we face when building RAG applications. At a high level, there are four different stages in the architecture of a RAG pipeline: 1. Ingestion: Here is where the pipeline loads the information from the data source. 2. Extraction: Where the pipeline processes the input data and decides how to retrieve the text contained inside them. 3. Transform: Where the pipeline chunks the data and generates document embeddings. 4. Load: Where the pipeline creates a search index in a vector database and loads the document embeddings. There are different rabbit holes at each one of these stages. Here are three of them: 1. Ingesting data once is simple. The hard part is refreshing the vector database whenever the original data source changes. 2. Extracting the content of a plain text document is simple. The hard part is to extract content from complex documents containing tables, images, or cross-references. 3. A simple continual chunking strategy with an overlap is simple. The hard part is to find the optimal strategy for your specific knowledge base and the way you are planning to query it. In the attached video, I'll show you how you can build an enterprise-grade RAG Pipeline that solves every one of the above problems. I'll use Vectorize. They partnered with me on this post. You can use them to build RAG pipelines optimized for accurate context retrieval. If you have a few documents lying around, set up a free account and give it a try.

Santiago

40,441 Aufrufe • vor 1 Jahr

Your agents can't keep up with real-time data. Especially when it's scattered across dozens of sources. Most teams waste weeks building custom connectors for every database, API, and data warehouse. Then they build ETL pipelines to sync everything. By the time your agent retrieves the data, it's already outdated. Picture this: Your Postgres database updated 5 minutes ago. Your MongoDB collection changed 2 minutes ago. Your agent is still pulling from yesterday's snapshot. This is why most production RAG systems fail. There's a better approach: MindsDB is an open-source AI platform with a federated data engine that lets you query multiple data sources in real-time using SQL - without moving any data. Here's what makes it different: ↳ Your data stays in place. No ETL pipelines or data duplication ↳ Query Postgres, MongoDB, REST APIs, and more using consistent SQL ↳ JOIN across different sources in real-time with a unified interface ↳ Works with both structured and un-structured data And here's the best part: You don't even need to write SQL. Just describe what you want in plain English, and MindsDB converts it to SQL automatically. The system does all the heavy lifting. The breakthrough for AI agents is simple: When data updates at the source, your agent gets fresh results immediately. No sync delays. No stale embeddings. No custom code for each integration. You can literally write a SQL query that joins a Postgres table with a MongoDB collection and gets live results. This is what production AI applications need but rarely get. In this video, I give you a complete walkthrough of what we just discussed and how to actually do it. Make sure you watch this till the end. I've shared the link to MindsDB's GitHub repo in the next tweet!

Your agents can't keep up with real-time data. Especially when it's scattered across dozens of sources. Most teams waste weeks building custom connectors for every database, API, and data warehouse. Then they build ETL pipelines to sync everything. By the time your agent retrieves the data, it's already outdated. Picture this: Your Postgres database updated 5 minutes ago. Your MongoDB collection changed 2 minutes ago. Your agent is still pulling from yesterday's snapshot. This is why most production RAG systems fail. There's a better approach: MindsDB is an open-source AI platform with a federated data engine that lets you query multiple data sources in real-time using SQL - without moving any data. Here's what makes it different: ↳ Your data stays in place. No ETL pipelines or data duplication ↳ Query Postgres, MongoDB, REST APIs, and more using consistent SQL ↳ JOIN across different sources in real-time with a unified interface ↳ Works with both structured and un-structured data And here's the best part: You don't even need to write SQL. Just describe what you want in plain English, and MindsDB converts it to SQL automatically. The system does all the heavy lifting. The breakthrough for AI agents is simple: When data updates at the source, your agent gets fresh results immediately. No sync delays. No stale embeddings. No custom code for each integration. You can literally write a SQL query that joins a Postgres table with a MongoDB collection and gets live results. This is what production AI applications need but rarely get. In this video, I give you a complete walkthrough of what we just discussed and how to actually do it. Make sure you watch this till the end. I've shared the link to MindsDB's GitHub repo in the next tweet!

Akshay 🚀

65,672 Aufrufe • vor 8 Monaten

Here is an open-source tool to generate a complete dataset. 1. Describe the data you want 2. An orchestrator agent searches the web 3. Sub-agents run in parallel to fetch the data 4. You get a structured dataset you can download For example, you can run Bigset with the query "all leica lenses being sold on amazon", or "leica stores in kyoto with their opening hours and ratings". Bigset uses TinyFish's free Search and Fetch APIs in the background. You can configure it to refresh the data on a schedule. You can self-host it with your own keys. Here is the GitHub repository: You can get free TinyFish API keys here: Thanks to the TinyFish team for partnering with me on this post.

Here is an open-source tool to generate a complete dataset. 1. Describe the data you want 2. An orchestrator agent searches the web 3. Sub-agents run in parallel to fetch the data 4. You get a structured dataset you can download For example, you can run Bigset with the query "all leica lenses being sold on amazon", or "leica stores in kyoto with their opening hours and ratings". Bigset uses TinyFish's free Search and Fetch APIs in the background. You can configure it to refresh the data on a schedule. You can self-host it with your own keys. Here is the GitHub repository: You can get free TinyFish API keys here: Thanks to the TinyFish team for partnering with me on this post.

Santiago

20,762 Aufrufe • vor 1 Monat

Function calling is a powerful way to extend the capabilities of LLMs and AI agents by letting them use external tools. Our new short course Function calling and Data Extraction with LLMs, created with @NexusflowX and taught by Jiantao Jiao and Venkat, demonstrates how to prompt LLMs to form calls to external functions. You'll work with NexusRavenV2-13B, a 13B parameter open-source model that excels in function calling tasks while still being small enough to host locally. Learn to use function calling to extract structured data from unstructured text and access web APIs, and build an end-to-end application that processes customer service transcripts. You'll learn how to build LLM-powered applications that can analyze feedback, automate data entry, and enhance search. Please get started here:

Function calling is a powerful way to extend the capabilities of LLMs and AI agents by letting them use external tools. Our new short course Function calling and Data Extraction with LLMs, created with @NexusflowX and taught by Jiantao Jiao and Venkat, demonstrates how to prompt LLMs to form calls to external functions. You'll work with NexusRavenV2-13B, a 13B parameter open-source model that excels in function calling tasks while still being small enough to host locally. Learn to use function calling to extract structured data from unstructured text and access web APIs, and build an end-to-end application that processes customer service transcripts. You'll learn how to build LLM-powered applications that can analyze feedback, automate data entry, and enhance search. Please get started here:

Andrew Ng

110,606 Aufrufe • vor 2 Jahren

Web scraping is a critical skill, and yet nobody talks about it. How do you think companies are training their Large Language Models? Where do you think the data comes from? But web scraping goes beyond all of that. Imagine giving an AI agent access to any public online data in real time! I like to call this "web-scrapping on demand", and I'm pretty sure it's going to unlock unlimited power for AI applications. I recorded a quick video to show you how you can do this using Apify. I've talked about them before, and they are collaborating with me on this post. They have one of the best open-source web scraping and browser automation libraries out there: But it gets much better than this! You can use MCP to connect your AI Agents and applications to the Apify platform and use any specialized actor on demand to scrape and process online data. In the video, I used Cursor to scrape LinkedIn posts with the words "Machine Learning" in real time. Worked like a charm with no code needed! Here is a link to the platform: Think about this: You can now feed your AI applications with any public data on demand! We aren't ready for what's coming.

Web scraping is a critical skill, and yet nobody talks about it. How do you think companies are training their Large Language Models? Where do you think the data comes from? But web scraping goes beyond all of that. Imagine giving an AI agent access to any public online data in real time! I like to call this "web-scrapping on demand", and I'm pretty sure it's going to unlock unlimited power for AI applications. I recorded a quick video to show you how you can do this using Apify. I've talked about them before, and they are collaborating with me on this post. They have one of the best open-source web scraping and browser automation libraries out there: But it gets much better than this! You can use MCP to connect your AI Agents and applications to the Apify platform and use any specialized actor on demand to scrape and process online data. In the video, I used Cursor to scrape LinkedIn posts with the words "Machine Learning" in real time. Worked like a charm with no code needed! Here is a link to the platform: Think about this: You can now feed your AI applications with any public data on demand! We aren't ready for what's coming.

Santiago

101,473 Aufrufe • vor 1 Jahr

Data teams spend weeks on simple requests. (This AI answers them in minutes.) Most data analysis is repetitive manual tasks. Data teams spend more time on setup than actual analysis. The workflow usually looks like this: → Run some exploratory data analysis in a local Jupyter notebook or environment → Pull data from multiple disconnected sources → Write code from scratch for every analysis → Export static charts that stakeholders can't explore (or wrestle with legacy BI to create a dashboard) → Manually send updates via email or Slack when data changes → Start over for each new request Most teams accept this as "how data analysis works." While business decisions wait for insights. That's where Fabi changes the entire approach. It's a powerful, AI-native platform built for teams that want to boost productivity and supercharge their data workflows. Instead of working on separate tools and manual processes, you collaborate on analysis that automatically delivers insights where teams work. Here's what makes Fabi different: AI-Native Analysis Environment ↳ SQL and Python work together with AI assistance that handles coding and debugging automatically. Smart Automation Workflows ↳ Automatically send AI-powered reports and summaries right where business works in Slack, email, and spreadsheets. Universal Data Integration ↳ Analyze data from files, Google Sheets, Airtable, plus your data warehouse and databases in one place. Collaborative Data Apps ↳ Create interactive dashboards that stakeholders can explore and ask follow-up questions directly. What you can do with Fabi that legacy BI can't: ➟ Send AI-generated insights directly to Slack channels ➟ Automatically email data summaries to stakeholders ➟ Analyze uploaded files without complex ETL processes ➟ Collaborate on analysis like Google Docs for data ➟ Build workflows that push insights to spreadsheets Perfect for teams that want to move beyond the constraints of legacy and increase their impact. Teams using Fabi see immediate results: ✓ Insights delivered in minutes instead of days ✓ Reduced context switching between tools ✓ Stakeholders explore data independently ✓ Workflows automated to save hours of manual work From analysis to automated delivery - all in one AI-native environment. 📌 Try Fabi today: 👉 Follow Fabi.ai and marc for Fabi updates. 🔄 Repost to help other teams streamline data analysis #DataAnalysis #ModernBI #DataOps #InteractiveDashboards #FabiPartnership #SponsoredByFabi

Data teams spend weeks on simple requests. (This AI answers them in minutes.) Most data analysis is repetitive manual tasks. Data teams spend more time on setup than actual analysis. The workflow usually looks like this: → Run some exploratory data analysis in a local Jupyter notebook or environment → Pull data from multiple disconnected sources → Write code from scratch for every analysis → Export static charts that stakeholders can't explore (or wrestle with legacy BI to create a dashboard) → Manually send updates via email or Slack when data changes → Start over for each new request Most teams accept this as "how data analysis works." While business decisions wait for insights. That's where Fabi changes the entire approach. It's a powerful, AI-native platform built for teams that want to boost productivity and supercharge their data workflows. Instead of working on separate tools and manual processes, you collaborate on analysis that automatically delivers insights where teams work. Here's what makes Fabi different: AI-Native Analysis Environment ↳ SQL and Python work together with AI assistance that handles coding and debugging automatically. Smart Automation Workflows ↳ Automatically send AI-powered reports and summaries right where business works in Slack, email, and spreadsheets. Universal Data Integration ↳ Analyze data from files, Google Sheets, Airtable, plus your data warehouse and databases in one place. Collaborative Data Apps ↳ Create interactive dashboards that stakeholders can explore and ask follow-up questions directly. What you can do with Fabi that legacy BI can't: ➟ Send AI-generated insights directly to Slack channels ➟ Automatically email data summaries to stakeholders ➟ Analyze uploaded files without complex ETL processes ➟ Collaborate on analysis like Google Docs for data ➟ Build workflows that push insights to spreadsheets Perfect for teams that want to move beyond the constraints of legacy and increase their impact. Teams using Fabi see immediate results: ✓ Insights delivered in minutes instead of days ✓ Reduced context switching between tools ✓ Stakeholders explore data independently ✓ Workflows automated to save hours of manual work From analysis to automated delivery - all in one AI-native environment. 📌 Try Fabi today: 👉 Follow Fabi.ai and marc for Fabi updates. 🔄 Repost to help other teams streamline data analysis #DataAnalysis #ModernBI #DataOps #InteractiveDashboards #FabiPartnership #SponsoredByFabi

Andrew Bolis

36,504 Aufrufe • vor 10 Monaten

🚀 Discover PublicAI Product Ecosystem – Your Gateway to contributes data to train AI and share revenues! 🚀 Unleash the power of PublicAI's Data Hunter Chrome Extension and transform your social media browsing into a treasure trove of valuable data. 📊 Upload your ChatGPT conversations with a simple click and earn rewards for your valuable contributions. 🔍 Use Data Hub to harness your intuition and refine data through community consensus. 🤖 Harness community-approved data to train AI models and craft your own AI Agent. Embark on the AI journey with PublicAI and redefine the future of AI interaction! 🌐🤖 #PublicAI

🚀 Discover PublicAI Product Ecosystem – Your Gateway to contributes data to train AI and share revenues! 🚀 Unleash the power of PublicAI's Data Hunter Chrome Extension and transform your social media browsing into a treasure trove of valuable data. 📊 Upload your ChatGPT conversations with a simple click and earn rewards for your valuable contributions. 🔍 Use Data Hub to harness your intuition and refine data through community consensus. 🤖 Harness community-approved data to train AI models and craft your own AI Agent. Embark on the AI journey with PublicAI and redefine the future of AI interaction! 🌐🤖 #PublicAI

PublicAI

19,805 Aufrufe • vor 2 Jahren

Get AI-powered insights, straight from your Node. Your nodes are being upgraded with built-in AI—helping you decode news, crypto & market trends in seconds. Powered by Nodepay’s real-time data retrieval infrastructure, this tool transforms live web content into clear, actionable insights. But here’s the real alpha… It’s just the beginning of a new intelligence layer—where bandwidth, data, and insights converge 🤫

Get AI-powered insights, straight from your Node. Your nodes are being upgraded with built-in AI—helping you decode news, crypto & market trends in seconds. Powered by Nodepay’s real-time data retrieval infrastructure, this tool transforms live web content into clear, actionable insights. But here’s the real alpha… It’s just the beginning of a new intelligence layer—where bandwidth, data, and insights converge 🤫

Nodepay

91,389 Aufrufe • vor 1 Jahr

Here's how I would learn data engineering basics in 2025: - Find a data source you care about (examples: gaming APIs, stock market, web scraping, etc) - Use Python to interact and ingest your source. Initially just write the data to a CSV. - Setup an account with Snowflake or Google BigQuery. - update your Python script to load a table in Snowflake/BigQuery - schedule your script with CRON in the cloud with a service like Heroku. - build aggregations and visualizations on top of your ingested data Only thing this misses is data quality and complex job orchestration which you can learn later! How would you learn data engineering nowadays?

Here's how I would learn data engineering basics in 2025: - Find a data source you care about (examples: gaming APIs, stock market, web scraping, etc) - Use Python to interact and ingest your source. Initially just write the data to a CSV. - Setup an account with Snowflake or Google BigQuery. - update your Python script to load a table in Snowflake/BigQuery - schedule your script with CRON in the cloud with a service like Heroku. - build aggregations and visualizations on top of your ingested data Only thing this misses is data quality and complex job orchestration which you can learn later! How would you learn data engineering nowadays?

Zach Wilson

20,368 Aufrufe • vor 1 Jahr

Enrollment is now open for the Data Engineering Professional Certificate! Data engineers are the architects of modern organizations, ensuring data is reliable, accessible, and ready for analytics and machine learning. This professional certificate is tailored to equip you with the critical skills, through frameworks and hands-on practice, to excel in this role. Taught by industry expert Joe Reis, co-author of the best-selling book "Fundamentals of Data Engineering," along with 17 guest instructors from the data field, you will gain expertise to start and further your career in the high-demand field of data engineering. Key focus areas: 🗂️ Data Engineering Lifecycle: Learn the important stages of building an efficient data pipeline that creates business value. 📥 Data Ingestion: Learn how to efficiently gather data from various sources. 💾 Data Storage: Master the techniques for storing data securely and cost-effectively. 🔄 Data Transformation: Understand how to clean, organize, and prepare data for analysis and machine learning. 🏗️ Data Architecture Design: Build robust architectures that support scalable, efficient data workflows. 📊 Serving Data: Ensure that data is available to stakeholders when and where they need it to drive business decisions. Enroll now!

Enrollment is now open for the Data Engineering Professional Certificate! Data engineers are the architects of modern organizations, ensuring data is reliable, accessible, and ready for analytics and machine learning. This professional certificate is tailored to equip you with the critical skills, through frameworks and hands-on practice, to excel in this role. Taught by industry expert Joe Reis, co-author of the best-selling book "Fundamentals of Data Engineering," along with 17 guest instructors from the data field, you will gain expertise to start and further your career in the high-demand field of data engineering. Key focus areas: 🗂️ Data Engineering Lifecycle: Learn the important stages of building an efficient data pipeline that creates business value. 📥 Data Ingestion: Learn how to efficiently gather data from various sources. 💾 Data Storage: Master the techniques for storing data securely and cost-effectively. 🔄 Data Transformation: Understand how to clean, organize, and prepare data for analysis and machine learning. 🏗️ Data Architecture Design: Build robust architectures that support scalable, efficient data workflows. 📊 Serving Data: Ensure that data is available to stakeholders when and where they need it to drive business decisions. Enroll now!

DeepLearning.AI

20,833 Aufrufe • vor 1 Jahr

Today, Box is announcing major new AI agent capabilities to let customers tap into the full value of their unstructured data. First, we’re announcing all new updates to the Box AI Studio to make it even easier to build AI agents that tap into your enterprise content for any job function, business process, or industry specific use case. We are also expanding our set of foundational agents that customers will be able to use to work with their enterprise content, including new features like search and research on unstructured data. Next, we’re announcing Box Extract to enable customers to use AI agents seamlessly for complex data extraction from any type of document or content. This makes it easier than ever to pull out data from contracts, invoices, research data, marketing assets, medical charts, and more. Finally, we’re introducing Box Automate, a new workflow automation solution within Box that lets you deploy AI agents across enterprise content-centric workflows. With Box Automate, you can design your business process in a simple drag and drop builder and then drop in AI agents at any step in the process. This ensures agents execute tasks at the right steps in a workflow every time. Best of all, our AI agents and workflow tools are designed to work across any system our customers work within, whether it’s leveraging pre-built integrations, Box APIs, or the new Box MCP Server. Ultimately, all of these capabilities come together to transform how companies can work with their enterprise content. Software has historically only been good at automating work that deals with structured data, which is why ERP, CRM, and HR systems have been mainstays of enterprise software for so long. The data in these systems fits neatly into a database, and the workflows are very ripe for automation. But it turns out most of the work in the world deals with unstructured data. It’s ideating through research documents, working with a client on contracts, reviewing details for a new product launch, looking at a patient’s healthcare record to make a diagnosis, working through due diligence documents for an M&A deal, and so on. For the first time ever, we can begin to bring all new insights and automation to this work with AI agents. At Box, we’re incredibly excited to be on this journey to help customers transform how they work with their most important data.

Today, Box is announcing major new AI agent capabilities to let customers tap into the full value of their unstructured data. First, we’re announcing all new updates to the Box AI Studio to make it even easier to build AI agents that tap into your enterprise content for any job function, business process, or industry specific use case. We are also expanding our set of foundational agents that customers will be able to use to work with their enterprise content, including new features like search and research on unstructured data. Next, we’re announcing Box Extract to enable customers to use AI agents seamlessly for complex data extraction from any type of document or content. This makes it easier than ever to pull out data from contracts, invoices, research data, marketing assets, medical charts, and more. Finally, we’re introducing Box Automate, a new workflow automation solution within Box that lets you deploy AI agents across enterprise content-centric workflows. With Box Automate, you can design your business process in a simple drag and drop builder and then drop in AI agents at any step in the process. This ensures agents execute tasks at the right steps in a workflow every time. Best of all, our AI agents and workflow tools are designed to work across any system our customers work within, whether it’s leveraging pre-built integrations, Box APIs, or the new Box MCP Server. Ultimately, all of these capabilities come together to transform how companies can work with their enterprise content. Software has historically only been good at automating work that deals with structured data, which is why ERP, CRM, and HR systems have been mainstays of enterprise software for so long. The data in these systems fits neatly into a database, and the workflows are very ripe for automation. But it turns out most of the work in the world deals with unstructured data. It’s ideating through research documents, working with a client on contracts, reviewing details for a new product launch, looking at a patient’s healthcare record to make a diagnosis, working through due diligence documents for an M&A deal, and so on. For the first time ever, we can begin to bring all new insights and automation to this work with AI agents. At Box, we’re incredibly excited to be on this journey to help customers transform how they work with their most important data.

Aaron Levie

91,863 Aufrufe • vor 10 Monaten

Today, we’re pushing a major update to Edison Analysis, our data analysis agent, which is tuned for scientific research and SOTA across data analysis benchmarks. In contrast to Kosmos, which runs for 6-12 hours and produces tens of thousands of lines of code, Edison Analysis runs for seconds to minutes and is best for specific, well-defined computational tasks. It is available both on our platform under the Analysis tab, and via API, and costs only one credit per run, so it is available to users on both free and paid tiers. Edison Analysis is a modified version of the data analysis agent Kosmos uses in its trajectories. Try it out! One of the most important improvements over our previous data analysis agents has been the addition of a specialized data retrieval tool. Edison Analysis can either use this tool to access data, or can pull data down directly via API. To evaluate this tool, we ranked the most commonly used public data repositories across recent papers from BioRxiv, and created a new benchmark that measures the ability of a language agent system to retrieve raw data from those sources. Edison Analysis gets 71% on this benchmark, and we’ll be working to increase this over time. You can read more about our benchmarks in the our blog post, link below. Some features worth highlighting: 1. Edison Analysis produces a report on the analysis it runs, along with a Jupyter notebook that you can download to reproduce the analysis yourself. Every figure it produces is linked back to the specific lines of code used to produce the figure, to make it easy to reproduce. 2. It works well with both Python and R. 3. One of the best uses for Edison Analysis is to use it to retrieve datasets that you can then analyze with Kosmos. We have a bunch of major improvements to Edison Analysis coming in the next few months that we’re excited to share. In the meantime, congratulations to the team, especially Ludovico Mitchener, Jon Laurent, Conor Igoe , Alex Andonian, and many more.

Today, we’re pushing a major update to Edison Analysis, our data analysis agent, which is tuned for scientific research and SOTA across data analysis benchmarks. In contrast to Kosmos, which runs for 6-12 hours and produces tens of thousands of lines of code, Edison Analysis runs for seconds to minutes and is best for specific, well-defined computational tasks. It is available both on our platform under the Analysis tab, and via API, and costs only one credit per run, so it is available to users on both free and paid tiers. Edison Analysis is a modified version of the data analysis agent Kosmos uses in its trajectories. Try it out! One of the most important improvements over our previous data analysis agents has been the addition of a specialized data retrieval tool. Edison Analysis can either use this tool to access data, or can pull data down directly via API. To evaluate this tool, we ranked the most commonly used public data repositories across recent papers from BioRxiv, and created a new benchmark that measures the ability of a language agent system to retrieve raw data from those sources. Edison Analysis gets 71% on this benchmark, and we’ll be working to increase this over time. You can read more about our benchmarks in the our blog post, link below. Some features worth highlighting: 1. Edison Analysis produces a report on the analysis it runs, along with a Jupyter notebook that you can download to reproduce the analysis yourself. Every figure it produces is linked back to the specific lines of code used to produce the figure, to make it easy to reproduce. 2. It works well with both Python and R. 3. One of the best uses for Edison Analysis is to use it to retrieve datasets that you can then analyze with Kosmos. We have a bunch of major improvements to Edison Analysis coming in the next few months that we’re excited to share. In the meantime, congratulations to the team, especially Ludovico Mitchener, Jon Laurent, Conor Igoe , Alex Andonian, and many more.

Sam Rodriques

61,895 Aufrufe • vor 8 Monaten