New video from Shreya Shankar on data processing with LLMs at scale, an underrated topic! Shreya starts with a real use case: public defenders analyzing case files for racial bias (4:08). Hundreds of pages per defendant. Court transcripts, police reports, news articles. Running GPT-5 on everything costs a fortune....

35,510 views • 6 months ago • via X (Twitter)

Related videos

How can you solve complex tasks using a Large Language Model? Here is a 2-minute introduction to everything you need to know to 10x the quality of your results.
Let's talk about three techniques, in order of complexity, starting with the easiest one:
• In-Context Learning
• Indexing + In-Context Learning
• Fine-tuning
In-Context Learning
The team that trained GPT-3 found something they couldn't explain: You can condition a model using examples of how you want it to behave. I included an example prompt in the attached video. You can "teach" the model how you want it to interpret questions, select the correct answers, and format the results by giving a few examples. You can also give specific knowledge to the model that will be helpful when formulating answers. We call this approach "grounding the model." There's another example in the video.
Indexing + In-Context Learning
Unfortunately, there is a limit to how much data you can include in a prompt. We call this the "context size." One version of GPT-4 supports a context of approximately 6,000 words, while the other supports 25,000 words. Although this sounds like a lot, many applications need more than that. Imagine you wrote a book and want to build an application to answer any questions about your story. What happens if your book is longer than the context? That's where Indexing comes in. Using a model, you can turn every book passage into an embedding. These are vectors, numbers that "encode" the passage's text. You can then store these embeddings in a particular database that supports fast retrieval of these vectors. You can then turn any question into an embedding and search the database for the list of passages that are similar to that query. Instead of using the entire book to ask the model, you can now use the relevant passages as in-context information, effectively working around the context size limitation.
Fine-tuning
Fine-tuning can give you an extra boost to get reliable outputs from your LLM. It is, however, the most complex approach on the list. There are different approaches to fine-tuning a model with your data. A popular technique is to process your data with your LLM and use the outputs to train a new classifier that solves your specific task. Notice that here you aren't modifying the LLM. Instead, you are chaining it with your trained classifier. Another approach is to modify the parameters of the LLM using your data. Think of this as "rewiring" the model in a way that solves your particular task. The results and costs will vary depending on how many layers you want to fine-tune from the original model.
Many companies think that fine-tuning is the solution to their problems. In my experience, many will benefit from exploring the other two approaches.
I love explaining Machine Learning and Artificial Intelligence ideas. If you enjoy in-depth content like this, follow me Santiago so you don't miss what comes next.

Santiago

384,482 views • 3 years ago
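
The "Indexing + In-Context Learning" idea in the post above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `embed` is a toy word-count stand-in for a real embedding model, the in-memory list `BOOK` stands in for a vector database, and every name here (`embed`, `cosine`, `top_passages`, `build_prompt`, `BOOK`) is hypothetical rather than taken from the post or the video.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: word counts, not a learned vector."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_passages(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the question."""
    q = embed(question)
    return sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]


def build_prompt(question: str, passages: list[str], k: int = 2) -> str:
    """Put only the retrieved passages into the prompt, not the whole book."""
    context = "\n".join(top_passages(question, passages, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical "book" split into passages.
BOOK = [
    "The dragon guarded the mountain pass for a hundred years.",
    "Mira opened a bakery in the harbor town.",
    "The harbor town flooded every spring.",
]

if __name__ == "__main__":
    print(build_prompt("Who guarded the mountain pass?", BOOK, k=1))
```

In a real system the word counts would be replaced by model-generated embeddings and the sorted list by a vector database query; the shape of the flow (embed, retrieve, then prompt) is the part this sketch is meant to show.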

Your agents can't keep up with real-time data. Especially when it's scattered across dozens of sources. Most teams waste weeks building custom connectors for every database, API, and data warehouse. Then they build ETL pipelines to sync everything. By the time your agent retrieves the data, it's already outdated. Picture this: Your Postgres database updated 5 minutes ago. Your MongoDB collection changed 2 minutes ago. Your agent is still pulling from yesterday's snapshot. This is why most production RAG systems fail. There's a better approach: MindsDB is an open-source AI platform with a federated data engine that lets you query multiple data sources in real-time using SQL - without moving any data. Here's what makes it different: ↳ Your data stays in place. No ETL pipelines or data duplication ↳ Query Postgres, MongoDB, REST APIs, and more using consistent SQL ↳ JOIN across different sources in real-time with a unified interface ↳ Works with both structured and unstructured data And here's the best part: You don't even need to write SQL. Just describe what you want in plain English, and MindsDB converts it to SQL automatically. The system does all the heavy lifting. The breakthrough for AI agents is simple: When data updates at the source, your agent gets fresh results immediately. No sync delays. No stale embeddings. No custom code for each integration. You can literally write a SQL query that joins a Postgres table with a MongoDB collection and gets live results. This is what production AI applications need but rarely get. In this video, I give you a complete walkthrough of what we just discussed and how to actually do it. Make sure you watch this till the end. I've shared the link to MindsDB's GitHub repo in the next tweet!

Akshay 🚀

65,672 views • 7 months ago

Google open-sourced MCP Toolbox for Databases. I gave it access to everything else. For context, Google's MCP Toolbox for Databases is an open-source server that lets AI agents securely query structured databases like PostgreSQL and MySQL through the MCP protocol. However, most enterprise knowledge doesn't actually live in databases. It's scattered across emails, Slack threads, GitHub repos, Salesforce records, customer reviews, and internal docs. So agents can't see any of it, which means they're working with a fraction of the context they need. I fixed that using MindsDB. It acts as a universal SQL layer that sits on top of all your data sources: structured, semi-structured, and unstructured. This means you can query Salesforce, Gmail, GitHub, S3 files, Jira, and 200+ more sources using SQL syntax. The clever part is how it connects to the MCP Toolbox. MindsDB exposes everything through MySQL, so from the agent's perspective, it's just running SQL and getting context back. It doesn't know or care that the data came from five different sources behind the scenes. This setup unlocks some powerful capabilities: → One SQL interface for dozens of enterprise sources → Cross-datasource joins (combine GitHub and CRM data in a single query) → Built-in ML capabilities for working with unstructured data → Simple MCP tools that now have massively expanded reach In the video below, the agent queries GitHub data and a customer review database in one SQL query. So what used to require ETL pipelines and weeks of engineering effort now happens instantly. At the end of the day, AI agents are only as useful as the data they can access. This gives them a lot more to work with. I have shared the GitHub repo in the replies, where you can find more details about this.

Akshay 🚀

39,331 views • 4 months ago

Traditional data pipelines don't work for RAG applications. There are 3 issues with them:
1. Traditional data engineering solutions are optimized to handle structured data. RAG applications rely primarily on unstructured data.
2. The connector ecosystem to load data from unstructured data sources is very immature.
3. Traditional solutions do not offer any way to transform unstructured data into an optimized vector search index.
The goal of a RAG Pipeline is to solve these problems.
The number one objective is to create a reliable vector search index using factual knowledge and relevant context. This sounds easy, but it's one of the biggest challenges we face when building RAG applications.
At a high level, there are four different stages in the architecture of a RAG pipeline:
1. Ingestion: Here is where the pipeline loads the information from the data source.
2. Extraction: Where the pipeline processes the input data and decides how to retrieve the text contained inside it.
3. Transform: Where the pipeline chunks the data and generates document embeddings.
4. Load: Where the pipeline creates a search index in a vector database and loads the document embeddings.
There are different rabbit holes at each one of these stages. Here are three of them:
1. Ingesting data once is simple. The hard part is refreshing the vector database whenever the original data source changes.
2. Extracting the content of a plain text document is simple. The hard part is to extract content from complex documents containing tables, images, or cross-references.
3. A continual chunking strategy with an overlap is simple. The hard part is to find the optimal strategy for your specific knowledge base and the way you are planning to query it.
In the attached video, I'll show you how you can build an enterprise-grade RAG Pipeline that solves every one of the above problems.
I'll use Vectorize. They partnered with me on this post. You can use them to build RAG pipelines optimized for accurate context retrieval.
If you have a few documents lying around, set up a free account and give it a try.

Santiago

40,441 views • 1 year ago
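
The "simple" baseline named in the Transform stage above, fixed-size chunking with an overlap, might look roughly like this. It is a minimal sketch, not how Vectorize or any particular product does it: the function name `chunk_words`, the word-based splitting, and the default sizes are all assumptions made for illustration.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk. Word-based splitting is an assumption; real
    pipelines often split on tokens, sentences, or document structure."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g h i j", size=4, overlap=1):
        print(chunk)
```

The overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk; choosing `size` and `overlap` for a specific knowledge base is the hard part the post refers to.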

This workflow combining Loom’s AI features + a custom ChatGPT GPT is saving me hours. Instead of creating onboarding Docs for new team members, I film a Loom → generate SOP → train a GPT to answer questions Game changer for businesses to delegate faster. Here's how to do it: First, I record a video of whatever task I want to delegate to the new team member with Loom. The more in-depth, the better, but I just used a 7-minute video. Then, I use Loom's new AI 'Write a document' feature. Upgrading to Loom AI from the standard Loom plan cost me $2. Loom AI can generate an entire SOP, PR description, step-by-step guide, QA, and more from a simple Loom video in <5 seconds. In the past, I’ve spent 2 hours+ hand-writing each one of these docs to onboard new team members, so Loom AI is already a massive timesaver. But it gets even better! Next, we can take that data from the SOP document, and we use it as 'Knowledge' to train a Custom GPT that can answer the new team members' questions. The more SOP docs/Knowledge you feed the GPT, the better. But one is fine if that's all you have because the GPT will pull any unknown answers from the web or its training data. Here are the prompt Instructions you want to put into the Custom GPT (copy and paste this): You are an expert CEO, specialized in onboarding and training new team members. Using the Knowledge provided, you will help new team members with any questions or stipulations they may have about their new role. Stick as true to the data provided as possible, but if they ask any questions that the Knowledge base does not have a specific answer for, you are permitted to use your pre-trained data and/or web browsing capabilities. That's it! It can't replace you entirely, but it'll save you 90% of the time you would've wasted on writing an SOP doc and answering questions. Simple AI workflows here and there really add up. There's also a workflow to help with the job screening process, but I'll save that for another day :^)

Rowan Cheung

129,424 views • 2 years ago

Major program launch: Data Analytics Professional Certificate! This large, five-course sequence takes you all the way to being job-ready as a data analyst, and shows how to use Generative AI as a thought partner to enhance your work in this role. Offered by on Coursera, this is taught by Sean Barnes, Ph.D., a Data Science & Engineering Leader at Netflix. Analyzing data remains one of the most important skills in where the world is going with AI. This comprehensive certificate takes you all the way to being job-ready. Each course comes with practical projects demonstrated in real-world contexts, such as analyzing sales data for a Korean bakery, video game sales trends across different regions, or identifying factors impacting customer retention for a communications company. You'll also work on estimating fire distribution for forest fire prevention, analyzing how a diamond's properties affect its market value, and developing predictive models for retail sales analysis, carbon emissions, and coral reef conservation. Here's some of what you'll learn: - How to define data and categorize it into its many types such as discrete & continuous numerical, structured & unstructured, time series, categorical, and know what insights can be derived from the different types of data categories. - How to differentiate between data-related job roles and their responsibilities, and how data flows through an organization from the moment of capture to decision-making. - How to perform data processing functions and apply conditional formatting in spreadsheets to extract business value from your data using statistical calculations and best practices for visualizing and interpreting data. - How to use LLMs for stakeholder analysis, data exploration, and data visualization. 
- Best practices for using LLMs as a thought partner in data analysis work. By the end of this professional certificate program, you will have learned core statistical concepts, analysis techniques, and visualization methodologies that will serve as the foundation for working as a data analyst. The world needs more data analysts, especially ones who know how to use modern generative AI. With data science roles projected to grow 36% by 2033, the skills taught in this program create new professional opportunities in data. Sign up here!

Andrew Ng

84,686 views • 1 year ago

RLM is the most important foundation of my Pi Harness (other than Pi of course). It's seeded with late interaction retrieval results (thanks to @lightonai for pylate). The Agent initiates it with a query, then... 𝐒𝐞𝐭𝐮𝐩 A Python REPL is created and seeded with: 1. Late interaction search to pre-filter. Instead of doing top 3/5/10, it's top hundreds of documents. This is set into a `context` variable. 2. Python functions are loaded in to do more searches if the `context` variable isn't enough. And to make LLM calls with cheaper models in parallel batches. 𝐈𝐭𝐞𝐫𝐚𝐭𝐢𝐨𝐧 𝐋𝐨𝐨𝐩 From there, an LLM iterates in the REPL based on the query. It's just like exploring in a Jupyter notebook. The LLM writes prose (like a markdown cell) and code to be run in the REPL each turn. This allows the LLM to sort, filter, and synthesize information. It can fan out and ask smaller models to summarize, combine, contrast, or do anything else to documents to help it understand the data. After several turns the LLM responds with the final answer. Either because it found the answer, or hit the budget limit. Context as a Python variable, LLM as the programmer, REPL as the runtime. 𝐖𝐡𝐲 𝐃𝐨𝐞𝐬 𝐓𝐡𝐢𝐬 𝐖𝐨𝐫𝐤 1. Richer Shell. Agents (and subagents) work by intermixing code and prose/thinking. But they use static scripts or bash that run and exit and start over each tool call. That's not ideal for exploration and synthesis of data. For that, state is useful to continue building and exploring the data as you learn more. There's a reason Jupyter notebooks have been popular with data scientists. 2. Keeps main agent context clean. The better context you have the better the agent will perform (duh!). This means three things: better human input, fewer missing search results, and fewer incorrect search results. Letting the agent iterate allows it to synthesize just what is needed and nothing else. All bad paths or peeks at something that turns out to be irrelevant stay out of main agent context. 3. Stack the good ideas! 
People often compare late interaction search vs RLM. Or static vs dynamic languages. Or agentic search vs semantic search. But...You can just use them all together for what they're each good at. Use them all for the area they're really great for. Read the full post which has more detail about how and why.

Isaac Flath

40,212 views • 1 month ago
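
The loop described above ("context as a Python variable, LLM as the programmer, REPL as the runtime") can be sketched with a persistent namespace and a budget. This is not the author's harness: `run_repl_loop`, `scripted_llm`, and the `answer` convention are hypothetical, and the "LLM" here is a scripted stand-in that returns fixed code cells so the example runs without any model. Running model-written code through `exec` would need sandboxing in practice.

```python
from typing import Callable, Optional


def run_repl_loop(
    context: list[str],
    next_cell: Callable[[dict, int], str],
    max_turns: int = 5,
) -> Optional[str]:
    """Execute one code cell per turn in a namespace that persists across
    turns, stopping when a cell sets `answer` or the turn budget runs out."""
    namespace = {"context": context, "answer": None}
    for turn in range(max_turns):
        code = next_cell(namespace, turn)  # a real harness would call an LLM here
        exec(code, namespace)              # state survives into the next turn
        if namespace["answer"] is not None:
            return namespace["answer"]
    return None  # budget exhausted without an answer


def scripted_llm(namespace: dict, turn: int) -> str:
    """Stand-in for the model: first filter the context, then synthesize."""
    cells = [
        "hits = [d for d in context if 'refund' in d.lower()]",
        "answer = f'{len(hits)} of {len(context)} documents mention refunds'",
    ]
    return cells[turn]


# Hypothetical pre-filtered search results seeded into `context`.
DOCS = ["Refund policy: 30 days", "Shipping times", "How to request a refund"]

if __name__ == "__main__":
    print(run_repl_loop(DOCS, scripted_llm))
```

The second cell can use `hits` from the first because the namespace is shared across turns, which is the statefulness the post contrasts with scripts that run, exit, and start over on every tool call.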

The most interesting part for me is where Andrej Karpathy describes why LLMs aren't able to learn like humans. As you would expect, he comes up with a wonderfully evocative phrase to describe RL: “sucking supervision bits through a straw.” A single end reward gets broadcast across every token in a successful trajectory, upweighting even wrong or irrelevant turns that lead to the right answer. > “Humans don't use reinforcement learning, as I've said before. I think they do something different. Reinforcement learning is a lot worse than the average person thinks. Reinforcement learning is terrible. It just so happens that everything that we had before is much worse.” So what do humans do instead? > “The book I’m reading is a set of prompts for me to do synthetic data generation. It's by manipulating that information that you actually gain that knowledge. We have no equivalent of that with LLMs; they don't really do that.” > “I'd love to see during pretraining some kind of a stage where the model thinks through the material and tries to reconcile it with what it already knows. There's no equivalent of any of this. This is all research.” Why can’t we just add this training to LLMs today? > “There are very subtle, hard to understand reasons why it's not trivial. If I just give synthetic generation of the model thinking about a book, you look at it and you're like, 'This looks great. Why can't I train on it?' You could try, but the model will actually get much worse if you continue trying.” > “Say we have a chapter of a book and I ask an LLM to think about it. It will give you something that looks very reasonable. But if I ask it 10 times, you'll notice that all of them are the same.” > “You're not getting the richness and the diversity and the entropy from these models as you would get from humans. How do you get synthetic data generation to work despite the collapse and while maintaining the entropy? It is a research problem.” How do humans get around model collapse? 
> “These analogies are surprisingly good. Humans collapse during the course of their lives. Children haven't overfit yet. They will say stuff that will shock you. Because they're not yet collapsed. But we [adults] are collapsed. We end up revisiting the same thoughts, we end up saying more and more of the same stuff, the learning rates go down, the collapse continues to get worse, and then everything deteriorates.” In fact, there’s an interesting paper arguing that dreaming evolved to assist generalization, and resist overfitting to daily learning - look up The Overfitted Brain by Erik Hoel. I asked Karpathy: Isn’t it interesting that humans learn best at a part of their lives (childhood) whose actual details they completely forget, adults still learn really well but have terrible memory about the particulars of the things they read or watch, and LLMs can memorize arbitrary details about text that no human could but are currently pretty bad at generalization? > “[Fallible human memory] is a feature, not a bug, because it forces you to only learn the generalizable components. LLMs are distracted by all the memory that they have of the pre-trained documents. That's why when I talk about the cognitive core, I actually want to remove the memory. I'd love to have them have less memory so that they have to look things up and they only maintain the algorithms for thought, and the idea of an experiment, and all this cognitive glue for acting.”

Dwarkesh Patel

1,050,107 views • 8 months ago

New short course: Evaluating AI Agents! Evals are important for driving AI system improvements, and in this course you'll learn to systematically assess and improve an AI agent’s performance. This is built in partnership with Arize AI and taught by John Gilhuly, Head of Developer Relations, and , Director of Product. I've often found evals to be a critical tool in the agent development process - they can be the difference between picking the right thing to work on vs. wasting weeks of effort. Whether you’re building a shopping assistant, coding agent, or research assistant, having a structured evaluation process helps you refine its performance systematically, rather than relying on random trial and error. This course shows you how to structure your evals to assess the performance of each component of an agent and its end-to-end performance. For each component, you select the appropriate evaluators, test examples, and performance metrics. This helps you identify areas for improvement both during development and in production. (If you're familiar with error analysis in supervised learning, think of this as adapting those ideas to agentic workflows.) In this course, you'll build an AI agent, and add observability to visualize and debug its steps. You’ll learn about code-based evals, in which you write code explicitly to test a certain step, as well as LLM-as-a-Judge evals, in which you prompt an LLM to efficiently come up with ways to evaluate more open-ended outputs. In detail, you’ll: - Understand key differences between evaluating LLM-based systems and traditional software testing. - Add observability to an agent by collecting traces of the steps taken by the agent and visualizing them - Choose the appropriate evaluator - code-based, LLM-as-a-Judge, human-annotation based - for each component. - Compute a convergence score to evaluate if your agent can respond to a query in an efficient number of steps. 
- Run structured experiments to improve the agent’s performance by exploring changes to the prompt, LLM model, or the agent’s logic. - Understand how to deploy these evaluation techniques to monitor the agent’s performance in production. By the end of this course, you’ll know how to trace AI agents, systematically evaluate them, and improve their performance. Please sign up here:

Andrew Ng

126,355 views • 1 year ago
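
The course description above mentions a convergence score for judging whether an agent answers in an efficient number of steps, without giving a formula. One plausible way to compute such a score, stated here purely as an assumption and not as the course's definition, is to divide the shortest observed run by the average run length for the same query. The function name `convergence_score` is hypothetical.

```python
def convergence_score(step_counts: list[int]) -> float:
    """Ratio of the shortest observed path to the average path length.

    Assumed definition for illustration: 1.0 means every run took the
    shortest observed number of steps; lower values mean the agent often
    wanders before answering."""
    if not step_counts or min(step_counts) <= 0:
        raise ValueError("need at least one run, each with a positive step count")
    shortest = min(step_counts)
    average = sum(step_counts) / len(step_counts)
    return shortest / average


if __name__ == "__main__":
    # Three runs of the same query taking 3, 3, and 6 steps.
    print(convergence_score([3, 3, 6]))
```

A code-based eval like this needs only the traced step counts per run, which is one reason collecting traces comes before choosing evaluators.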

I hear so often from the Dommes I work with that they struggle with people online fetishizing them and simply seeing them for how sexy and beautiful they are. They project their fantasies and their desires onto you. That stops immediately once you move the attention from you to them. From 'look at me' to 'I see you'. What does that look like? When you create content, think of them and what this scene or that narrative is evoking. What will they learn from you? What they want is not to passively watch how sexy you are, but for you to train them, to give them instructions, to teach them, to guide them, to be in charge, to command them. This is not being an object but the main subject. The Authority figure. How is your content already doing that? The sexy photos can still be there, they are important to capture attention. But what you do with that attention once you have it, is where the power dynamic is established. Positioning yourself as more than a stunning Goddess, but actually a woman who has a voice, opinions, perspective, a philosophy, a way of doing things, teaching them what you like, how you like it, why you like it, already makes them want to be that for you. You hold the attention, you hold the power, so you direct it. And for that, you want them to know you get them and you know what lives within them... that creates the desire for you to be the one exposing it. You instantly build trust. Not because you demanded it, but because you earned it: you showed them you know what you are doing. You have experience, you understand them. They are not told to come see you, they are seduced into it. They desire it. And they will work for it. This will attract better clients (real subs) and instead of you trying to get their attention, they will work to earn yours. 
If you want to learn more about power dynamics, building a brand as a Pro or the psychology behind BDSM, you can now access all my trainings and classes in one place for a fraction of the cost of The Dominatrix Academy. And you can reinvest the total amount towards the Program. Message me [SECRET] for the details. This offer is not available on my website.

Ms. Malissia

14,297 views • 1 month ago

“You didn’t read it”.. Burry’s favorite response to any criticism of his $PLTR article Okay, how about we address a direct paragraph from you about Palantir’s ontology: “What Palantir calls its ontology is essentially this data retrieval for use through a common platform. But what if, as the paper points out, LLMs can still confabulate around a piece of data, misinterpret it, ignore it, etc.? In that case, Palantir's ontology cannot overcome the core hallucination problem in the underlying LLMs AIP uses. Hallucinations and overconfidence are fatal to tasks such as legal reasoning, scientific reasoning, medical decision support, military targeting, and other truly mission critical tasks requiring 100% precision and confidence grounded in real data. The paper notes no current mitigation - including the RAG architecture central to AIP - can reliably solve this problem.” What if this. What if that. If my aunt had a dick, she’d be my uncle. We can do “what if” for eternity. How about we look at the impact? - You question the validity of Palantir’s software helping find Bin Laden. Okay, how about the confirmation here of Palantir’s software being used during the Venezuela operation? Thoughts? They could’ve swapped Claude for Grok. Doesn’t matter. They could not have swapped Palantir’s software for something else, though. The LLMs that you point out Palantir does not have, are a commodity. - How about this other example of the 18th Airborne Corps' artillery brigade reducing its targeting process from 724 minutes to 20 minutes with Palantir’s Maven Smart System? Is the ontology hallucinating there? - How about the Navy’s ShipOS, built on Palantir, decreasing schedule planning from 160 hours to 10 minutes? - What about the director of NATO’s Task Force Maven citing how critical the ontology is here? “To capitalize on AI applications, an ontology and lineage for data is needed. AI applications don't understand context or meaning; they understand structure. 
A data ontology provides the machine with a common language and framework for defining concepts, their attributes and their relationships (for example, classifying a 'jet' as an 'air platform' with specific 'weapon systems'). Without this shared structure, an AI model trained on one system's terminology wouldn't be able to integrate data from another. MSS establishes this common schema across NATO systems, unlocking the possibility of interoperable AI applications. This interoperability and trust are paramount in a warfighting context.” Which is more credible? Your Stanford paper, or a director at NATO who actually uses the software? You wrote 10,000 words, so it’s impossible to dissect the whole thing. If one tries to do so, you will just say they didn’t read it. So let’s take a look at this ontology paragraph specifically. Do you have any proof of the ontology failing? Or just this “what if” from a Stanford paper? There’s infinite examples of it being transformational.. that’s for sure.

Jack Prescott

24,154 views • 4 months ago