New short course on Building Applications with Vector Databases, taught by Pinecone’s Tim Tully! At the heart of a vector database is the ability to store a collection of vectors and then query against that, meaning input a new vector and find similar ones. This is useful for many...

137,034 views • 2 years ago • via X (Twitter)
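The core operation described in the post (store a collection of vectors, then find the ones most similar to a new query vector) can be sketched in a few lines of plain Python. This is a minimal brute-force illustration, not Pinecone's API: the `store` dictionary, the 3-dimensional vectors, and the helper names `cosine_similarity` and `top_k` are all assumptions made for the example. Real vector databases typically use approximate indexes rather than scanning every vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical toy vectors; real embeddings have hundreds of dimensions.
store = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

results = top_k([1.0, 0.0, 0.0], store, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

The query vector points along the first axis, so the two entries that also lean on that axis rank ahead of the one that does not.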

10 comments

goutam • 2 years ago

@pinecone @timt need your support

Colin Campbell • 2 years ago

@pinecone @timt This looks really cool! Learning about vectors and semantic caching blew my mind. I put together a little UMAP visualization of a semantic cache which I think is relevant here.

Amer Amayreh • 2 years ago

@pinecone @timt Thanks

Cyril Coste #DigitalTransformation • 2 years ago

@pinecone @timt Great idea! Vector databases are perfect for AI-driven recommendations, image recognition, and more!

Bruce • 2 years ago

@pinecone @timt All full courses from DeepLearningAI are available for learning on Coursnap, along with access to comprehensive learning materials: outline, summary, highlights and selected shorts.

gregorylent • 2 years ago

@pinecone @timt wonder how this will work in mandarin ..

Abhinav Elimineti 𝕏 • 2 years ago

@pinecone @timt Amazing

Engr Samson • 2 years ago

@pinecone @timt In August last year we were informed that 2024 is the year of AI. Expecting more such projects that explore the usefulness of AI.

laoda • 2 years ago

@pinecone @timt the trouble is that none of the similarity search packages is satisfactory

Muhammad Ahmod • 2 years ago

@pinecone @timt Hi, can semantic search or any other feature of vector databases be used for sanction data? I find that sanction data stores are still based on CSV and XML, and the data schemas are crazy. Especially when it comes to multiple aliases and multiple names and…

Related videos

Announcing a new Coursera course: Retrieval Augmented Generation (RAG)! You'll learn to build high-performance, production-ready RAG systems in this hands-on, in-depth course created and taught by Zain, an experienced AI and ML engineer, researcher, and educator.

RAG is a critical component today of many LLM-based applications in customer support, internal company Q&A systems, even many of the leading chatbots that use web search to answer your questions. This course teaches you in-depth how to make RAG work well.

LLMs can produce generic or outdated responses, especially when asked specialized questions not covered in their training data. RAG is the most widely used technique for addressing this. It brings in data from new data sources, such as internal documents or recent news, to give the LLM the relevant context on private, recent, or specialized information. This lets it generate more grounded and accurate responses.

In this course, you’ll learn to design and implement every part of a RAG system, from retrievers to vector databases to generation to evals. You’ll learn about the fundamental principles behind RAG and how to optimize it at both the component and whole-system levels.

As AI evolves, RAG is evolving too. New models can handle longer context windows, reason more effectively, and can be parts of complex agentic workflows. One exciting growth area is Agentic RAG, in which an AI agent at runtime (rather than it being hardcoded at development time) autonomously decides what data to retrieve, and when/how to go deeper. Even with this evolution, access to high-quality data at runtime is essential, which is why RAG is a key part of so many applications.

You'll learn via hands-on experiences to:
- Build a RAG system with retrieval and prompt augmentation
- Compare retrieval methods like BM25, semantic search, and Reciprocal Rank Fusion
- Chunk, index, and retrieve documents using a Weaviate vector database and a news dataset
- Develop a chatbot, using open-source LLMs hosted by Together AI, for a fictional store that answers product and FAQ questions
- Use evals to drive improvements in reliability, and incorporate multi-modal data

RAG is an important foundational technique. Become good at it through this course! Please sign up here:

Andrew Ng

124,314 views • 10 months ago
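Of the retrieval methods named in the post above, Reciprocal Rank Fusion is the easiest to show in isolation: it merges several ranked lists (for example one from BM25 and one from semantic search) by giving each document a score of 1 / (k + rank) per list and summing. The sketch below is an illustration only, not course material. The document ids and both rankings are made up, and `k = 60` is a commonly used default rather than a value taken from the post.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results from two retrievers for the same query.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
semantic_ranking = ["doc_b", "doc_c", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
print(fused)
```

Because the fusion only looks at ranks, it needs no calibration between BM25 scores and embedding similarities, which live on different scales.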

Build and customize complex AI applications with a flexible framework in this new short course, Building AI Applications with Haystack. Created in collaboration with deepset, makers of Haystack, and taught by Tuana, who is the developer relations lead for Haystack at deepset.

Generative AI technology is changing rapidly and it can be challenging to integrate APIs from different LLMs, vector databases, and various tools such as web search. In this course, you will learn how to use the Haystack framework to make your development process more modular, allowing you to manage complexity and focus more on building your application.

In detail, you’ll:
- Build a RAG pipeline using Haystack’s main building blocks – components, pipelines, and document stores.
- Create custom components in your pipeline by building a Hacker News summarizer that extends your app’s ability to access APIs.
- Use conditional routing to create a branching pipeline with a fallback to web search mechanism when the LLM does not have the necessary context to respond to the user's query.
- Build a self-reflecting agent for named entity recognition that loops using an output validator custom component.
- Create a chat agent using OpenAI's function-calling capabilities which allow you to provide Haystack pipelines as tools to the LLM, enhancing that agent's capabilities.

By the end of this course, you will learn a high-level orchestration framework that can help make your applications flexible, extendible, and maintainable, even as the technology stack changes, new user needs arise, and you add new features to your application. Please sign up here:

Andrew Ng

53,779 views • 1 year ago

Tokenization -- turning text into a sequence of integers -- is a key part of generative AI, and most API providers charge per million tokens. How does tokenization work? Learn the details of tokenization and RAG optimization in Retrieval Optimization: From Tokenization to Vector Quantization, created in collaboration with Qdrant and taught by its Developer Relations Lead, Kacper Łukawski.

This course focuses on Retrieval augmented generation (RAG), which has two steps: First, a retriever finds relevant information; then, the generator uses what’s retrieved as context to produce a response. You’ll learn to optimize the first step (the retriever) by understanding how tokenization works and how it impacts the relevance of your search. In addition, you will also learn to measure and improve retrieval quality, speed, and memory.

In detail, you’ll:
- Learn about the internal workings of the embedding models and how your text turns into vectors.
- Understand how several tokenizers, such as Byte-Pair Encoding, WordPiece, Unigram, and SentencePiece work.
- Explore common challenges with tokenizers, such as unknown tokens, domain-specific identifiers, and numerical values, that can negatively affect your vector search.
- Understand how to measure the quality of your search across relevance, ranking, and score-related metrics.
- Understand how the main parameters in "HNSW", a graph-based algorithm, affect the relevance and speed of vector search, and how to tune its parameters.
- Experiment with the three major quantization methods – product, scalar, and binary – and learn how they impact memory requirements, search quality, and speed.

By the end of this course, you’ll have a solid understanding of how tokenization functions and how to optimize vector search in your RAG systems. Please sign up here!

Andrew Ng

146,200 views • 1 year ago
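Two of the quantization methods mentioned in the post above, scalar and binary, are simple enough to sketch directly. The code below is a toy illustration and not Qdrant's implementation: the value range of [-1, 1], the example vector, and the function names are assumptions. Scalar quantization here stores one byte per dimension instead of a 4-byte float, and binary quantization keeps one bit per dimension, which is where the memory savings come from; the lost precision is what affects search quality.

```python
def scalar_quantize(vec, lo=-1.0, hi=1.0):
    """Map each float (clipped to [lo, hi]) to an integer in 0..255, one byte per dimension."""
    scale = 255.0 / (hi - lo)
    return [round((min(max(x, lo), hi) - lo) * scale) for x in vec]

def binary_quantize(vec):
    """Keep only the sign of each dimension: 1 for positive values, 0 otherwise."""
    return [1 if x > 0 else 0 for x in vec]

def hamming_distance(a, b):
    """Number of positions where two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical 4-dimensional embeddings with components in [-1, 1].
vec_a = [-1.0, -0.2, 0.5, 1.0]
vec_b = [-0.9, 0.3, 0.4, 0.8]

bytes_a = scalar_quantize(vec_a)
bits_a = binary_quantize(vec_a)
bits_b = binary_quantize(vec_b)
print(bytes_a, bits_a, hamming_distance(bits_a, bits_b))
```

Binary vectors are compared with Hamming distance, which is very cheap to compute, so a common pattern is to search on the compressed form first and rescore a shortlist with the original vectors.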

Traditional data pipelines don't work for RAG applications. There are 3 issues with them:

1. Traditional data engineering solutions are optimized to handle structured data. RAG applications rely primarily on unstructured data.
2. The connector ecosystem to load data from unstructured data sources is very immature.
3. Traditional solutions do not offer any way to transform unstructured data into an optimized vector search index.

The goal of a RAG Pipeline is to solve these problems.

The number one objective is to create a reliable vector search index using factual knowledge and relevant context. This sounds easy, but it's one of the biggest challenges we face when building RAG applications.

At a high level, there are four different stages in the architecture of a RAG pipeline:

1. Ingestion: Here is where the pipeline loads the information from the data source.
2. Extraction: Where the pipeline processes the input data and decides how to retrieve the text contained inside it.
3. Transform: Where the pipeline chunks the data and generates document embeddings.
4. Load: Where the pipeline creates a search index in a vector database and loads the document embeddings.

There are different rabbit holes at each one of these stages. Here are three of them:

1. Ingesting data once is simple. The hard part is refreshing the vector database whenever the original data source changes.
2. Extracting the content of a plain text document is simple. The hard part is to extract content from complex documents containing tables, images, or cross-references.
3. A continuous chunking strategy with an overlap is simple. The hard part is to find the optimal strategy for your specific knowledge base and the way you are planning to query it.

In the attached video, I'll show you how you can build an enterprise-grade RAG Pipeline that solves every one of the above problems.

I'll use Vectorize. They partnered with me on this post. You can use them to build RAG pipelines optimized for accurate context retrieval.

If you have a few documents lying around, set up a free account and give it a try.

Santiago

40,441 views • 1 year ago
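The "simple" baseline the post above refers to, fixed-size chunking with an overlap, can be sketched as follows. This is an assumption-laden illustration and not Vectorize's implementation: it splits on characters rather than tokens or sentences, and the function name `chunk_text` and its default sizes are invented for the example.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows; neighbours share `overlap` characters."""
    if chunk_size <= 0 or overlap < 0 or overlap >= chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

# Tiny example so the overlap is easy to see.
chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(chunks)
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks. Choosing the sizes well for a given knowledge base and query style is the hard part the post describes, and this sketch does not attempt it.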

New short course Multimodal RAG: Chat with Videos, developed with Intel and taught by vasudevlal!

In this course, you’ll work with LLaVA (Large Language and Vision Assistant), a Large Vision Language Model (LVLM) that can process both images and text. For example, given an image of a person doing a handstand on a skateboard at the beach, LLaVA doesn't just caption the scene, it’s able to predict possible outcomes, like the person losing balance or falling off. By understanding not just what's in a video frame, but what might happen next, your application can provide more insightful answers to questions about video.

You'll build a full multimodal RAG pipeline that can chat about video content:
- Use the BridgeTower model to create joint text-image embeddings in a 512-dimensional multimodal semantic space.
- Learn video processing techniques to extract keyframes, generate transcripts using Whisper, and create captions.
- Use the LanceDB vector database to store and retrieve high-dimensional multimodal embeddings.
- Integrate the LLaVA model, combining CLIP's (Contrastive Language Image Pretraining) vision transformer with Llama, for advanced visual-textual reasoning.

Your final system will ingest video data, generate embeddings for frames and text, perform similarity searches for relevant content, and use the retrieved multimodal context to inform LVLM-based response generation. The result is a system capable of answering nuanced questions about video content, effectively chatting about the video it has processed. Please sign up here!

Andrew Ng

107,548 views • 1 year ago