How do computers understand data? With semantic search! Instead of just matching keywords, it understands context using vector embeddings. Here’s how:

1) Convert data (text, images, etc.) into vectors (embeddings)
2) Store these vectors in a vector database
3) Search by meaning, not just the keywords

Semantic search makes...

Femke Plantinga

12,901 subscribers

23,911 views • 1 year ago • via X (Twitter)
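The three steps in the post above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any particular vector database works: the 3-dimensional vectors are hand-written stand-ins for what an embedding model would produce, and `STORE`, `cosine`, and `search` are hypothetical names.

```python
import math

# Steps 1 and 2: data converted to vectors and stored.
# Hypothetical, hand-written 3-d "embeddings" standing in for a real model's output.
# The dimensions loosely mean (animals, vehicles, food).
STORE = {
    "A puppy chasing a ball": [0.9, 0.1, 0.0],
    "A sedan parked outside": [0.0, 0.95, 0.05],
    "Fresh pasta with basil": [0.05, 0.0, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def search(query_vector, store, k=1):
    """Step 3: return the k stored texts whose vectors are closest to the query."""
    ranked = sorted(store, key=lambda text: cosine(query_vector, store[text]), reverse=True)
    return ranked[:k]


# Assumed embedding for the query "dog": no keyword overlap with any stored text.
dog_query = [0.85, 0.05, 0.1]
print(search(dog_query, STORE))
```

The query matches the puppy sentence only because its assumed vector points in a similar direction, not because any word is shared.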

Comments: 8

shyamik 📊♻️ • 1 year ago

Great work 👏

Femke Plantinga • 1 year ago

Thanks! 😄

Uche • 1 year ago

Great presentation. I enjoyed it

Sdal • 1 year ago

Understand data? Really. Or pattern matching?

Aklının yönetim kurulu başkanı • 1 year ago

@femke_plantinga you are so beautiful. I am afraid of being in love with you. 🙈

mariodeleon • 1 year ago

@memdotai mem it #Ai

Dav • 1 year ago

I didn't understand anything

bruno maggi • 1 year ago

👏👏👏

Related videos

Traditional (SQL) databases rely primarily on keyword-based searches to retrieve information. These searches match the exact words or phrases in your query to the text stored in the database. While effective for many applications, this method has limitations when it comes to understanding context or finding relevant information that doesn’t include the exact keywords. Hybrid search combines the strengths of traditional keyword-based BM25 search with the advanced capabilities of semantic search. To effectively implement a hybrid search, a vector database is essential. Vector databases go beyond just words; they understand the meaning behind the data. They transform data such as text, images, or audio into numerical representations called vectors. These vector embeddings enable the database to find similar items, even if they don't share exact keywords. When you integrate hybrid search with Retrieval-Augmented Generation (RAG) systems, you can achieve higher accuracy in retrieved context and better output in generated responses. Learn more about RAG systems in this video with Victoria Slocum:

Femke Plantinga

140,618 views • 2 years ago
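One way to combine a keyword ranking with a vector ranking, as the hybrid search post above describes, is reciprocal rank fusion (RRF). The post does not say which fusion method it has in mind, so RRF is a swapped-in choice here. The document IDs and both input rankings are made up, and `k=60` is a commonly used default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    documents ranked well by both searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of the two searches for the same query.
keyword_ranking = ["doc_a", "doc_b", "doc_c"]  # e.g. from BM25
vector_ranking = ["doc_b", "doc_d", "doc_a"]   # e.g. from embedding similarity

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF uses only rank positions, so the keyword and vector scores never need to be put on a common scale.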

New short course on Building Applications with Vector Databases, taught by Pinecone’s Tim Tully! At the heart of a vector database is the ability to store a collection of vectors and then query against that, meaning input a new vector and find similar ones. This is useful for many AI applications. In this course, you'll learn how to use vector databases to build:

(i) Semantic Search: Create a text search tool that goes beyond keyword matching, and instead focuses on the meaning of content.
(ii) RAG (retrieval augmented generation): Enhance your LLM output by incorporating context from sources the model wasn't trained on.
(iii) Recommender System: Combine semantic search and RAG to recommend topics, and demonstrate it with a news article recommender.
(iv) Hybrid Search: Build an application that finds items using both images and descriptive text -- by combining both sparse and dense vector representations of the data -- using an eCommerce dataset as an example.
(v) Image Similarity: Use image vector embeddings to create an app to compare facial features, using a database of public figures to determine the likeness between them.
(vi) Anomaly Detection: Build an anomaly detection app that identifies unusual patterns in network communication logs.

I hope you’ll enjoy learning how to build all these types of applications! Please sign up here:

Andrew Ng

137,091 views • 2 years ago

To prove how scalable Upstash Vector is, we indexed the entire Wikipedia in 11 languages (144m vectors) in a single DB. ◆ Over 700GB of data ◆ Fast semantic search ◆ Chat with Wikipedia We got your back for a fast app that scales🫡 Quick demo 👇

Upstash

31,069 views • 1 year ago

This is, by far, one of the best uses of modern AI. If you don't use embeddings when querying your database, you are definitely leaving a lot on the table. In this video, I'll show you how to run semantic searches using OpenAI and PostgreSQL. It's all thanks to Pgai, an open-source PostgreSQL extension:

Here's what will happen:

1. We'll create a simple table with news articles
2. We'll generate embeddings for those articles
3. We'll run queries on top of those embeddings

For this video, I generated the embeddings using a simple query, but pgai Vectorizer would do the same automatically as new information makes it into the database. This is awesome! If you have a PostgreSQL database with data you are searching over, you should start experimenting with semantic searches immediately. For most use cases, a combination of full-text search + semantic search is the best approach. If you don't have a PostgreSQL database around, you can try free for 30 days using Timescale:

Thanks to the Timescale (now TigerData) team for partnering with me on this post!

Santiago

109,517 views • 1 year ago

I don’t think people realize the scale of data behind the JBP Search Engine. Every single episode since Episode 1 has been transcribed and converted into embeddings for semantic search. This isn’t keyword matching. It understands the context of your query and returns the exact moment you’re looking for…not just a list of episodes where something was mentioned. 1,415 episodes. ~3 hours each. Over 600GB of locally stored audio + additional metadata powering search quality. Could you do it? Maybe, but just know, I’ve personally spent thousands of dollars in compute to make this possible. There’s no way around it when dealing with this much data.

burner account

130,795 views • 2 months ago

🚀 Introducing Salesforce's game-changer: Data Cloud Vector Database + Einstein Copilot Semantic Search! ✨ Harness both structured & unstructured data for smarter AI responses. No more costly training! Real-time data-driven insights with RAG, complete with citations to curb AI hallucinations. Trust in enterprise AI just got a major upgrade, thanks to Salesforce! ❤️

Marc Benioff

113,100 views • 2 years ago

Today we announced the new Einstein Copilot Search and Salesforce Vector Database in Data Cloud to power semantic search and retrieval augmented generation. This will enhance Einstein Copilot's ability to understand, generate outputs, and automate actions across a wide variety of use cases, contexts, and data/content types. LLMs can't be relied on to recall specific data they were trained on, so the way to make them work in the enterprise, where accuracy is paramount, is to feed them with the right data using hybrid keyword and vector search-powered RAG. By bringing all this seamlessly into our apps and Einstein Copilot UX as well as Hyperforce cloud infrastructure, we can offer high performance, low latency, addressing data privacy, security, and residency requirements🌟

Clara Shih

49,620 views • 2 years ago

Learn to optimize RAG for cost and performance in our new short course, Prompt Compression and Query Optimization, created with MongoDB and taught by Richmond Alake. This course teaches you to combine traditional database capabilities with vector search using MongoDB for RAG. You'll learn these techniques:

- Vector search: For semantic matching of user queries
- Filtering using metadata: Pre- and post-filtering to narrow search results
- Projections: Selecting only necessary fields to minimize data returned
- Boosting: Reranking results to improve relevance
- Prompt compression: Using a small LLM to compress context, significantly reducing token count and processing costs

These methods address scaling, performance, and security challenges in large-scale RAG applications. You can sign up here:

Andrew Ng

71,710 views • 2 years ago
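The difference between pre-filtering, post-filtering, and projection mentioned in the course description above can be shown with a small in-memory sketch. This is not MongoDB's API: `DOCS`, `top_k`, and `project` are hypothetical names, the vectors are made up, and a plain dot product stands in for whatever similarity measure a real index uses.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical documents with metadata ("year") and a toy 2-d embedding ("vec").
DOCS = [
    {"id": 1, "year": 2023, "title": "Q1 report", "vec": [1.0, 0.0]},
    {"id": 2, "year": 2024, "title": "Q2 report", "vec": [0.9, 0.1]},
    {"id": 3, "year": 2024, "title": "Roadmap", "vec": [0.2, 0.8]},
    {"id": 4, "year": 2023, "title": "Budget", "vec": [0.8, 0.2]},
]


def top_k(docs, query, k):
    """Vector search: the k documents most similar to the query."""
    return sorted(docs, key=lambda d: dot(query, d["vec"]), reverse=True)[:k]


def project(docs, fields):
    """Projection: keep only the requested fields of each document."""
    return [{f: d[f] for f in fields} for d in docs]


def is_2024(doc):
    return doc["year"] == 2024


query = [1.0, 0.0]

# Pre-filtering: narrow by metadata first, then take the top k of what is left.
pre = top_k([d for d in DOCS if is_2024(d)], query, k=2)

# Post-filtering: take the top k first, then drop results that fail the filter.
post = [d for d in top_k(DOCS, query, k=2) if is_2024(d)]

print(project(pre, ["id", "title"]))   # two results
print(project(post, ["id", "title"]))  # only one result survives the filter
```

Post-filtering can return fewer than `k` results, because documents that fail the filter have already used up slots in the top `k`.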

New short course: "Building Multimodal Search and RAG", by Weaviate AI Database's Sebastian Witalec. Contrastive learning is used to train models to map vectors into an embedding space by pulling similar concepts closer together and pushing dissimilar concepts away from each other. This technique is also used to train multimodal embedding models that capture semantic similarity across different modalities like text, images, and audio. These multimodal embeddings can be used to build multimodal search and RAG systems. In this course, you'll learn how contrastive learning works, and how to add multimodality to RAG – so your models can draw on diverse, relevant context to answer questions. For example, a query about a financial report might synthesize information from text snippets, graphs, tables, and slides. You will also learn how visual instruction tuning lets you integrate image understanding into language models, and build a multi-vector recommender system using Weaviate’s open-source vector database. Please sign up here:

Andrew Ng

104,371 views • 2 years ago
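The "pull similar together, push dissimilar apart" idea in the post above can be expressed as a loss function. The sketch below uses an InfoNCE-style loss, which is one common formulation and not necessarily the one the course teaches. The vectors are made up, and `temperature=0.1` is an arbitrary illustrative value.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss for a single anchor.

    The loss is small when the anchor is much more similar to its positive
    than to any negative, and large otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


anchor = [1.0, 0.0]      # e.g. an image embedding (made up)
matching = [0.9, 0.1]    # e.g. the embedding of its caption (made up)
unrelated = [0.0, 1.0]   # e.g. the embedding of some other caption (made up)

aligned = info_nce(anchor, matching, [unrelated])
misaligned = info_nce(anchor, unrelated, [matching])
print(aligned < misaligned)  # True
```

Training minimizes this loss over many such triples, which is what moves matching pairs closer together in the embedding space.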

Traditional data pipelines don't work for RAG applications. There are 3 issues with them:

1. Traditional data engineering solutions are optimized to handle structured data. RAG applications rely primarily on unstructured data.
2. The connector ecosystem to load data from unstructured data sources is very immature.
3. Traditional solutions do not offer any way to transform unstructured data into an optimized vector search index.

The goal of a RAG Pipeline is to solve these problems. The number one objective is to create a reliable vector search index using factual knowledge and relevant context. This sounds easy, but it's one of the biggest challenges we face when building RAG applications.

At a high level, there are four different stages in the architecture of a RAG pipeline:

1. Ingestion: Here is where the pipeline loads the information from the data source.
2. Extraction: Where the pipeline processes the input data and decides how to retrieve the text contained inside it.
3. Transform: Where the pipeline chunks the data and generates document embeddings.
4. Load: Where the pipeline creates a search index in a vector database and loads the document embeddings.

There are different rabbit holes at each one of these stages. Here are three of them:

1. Ingesting data once is simple. The hard part is refreshing the vector database whenever the original data source changes.
2. Extracting the content of a plain text document is simple. The hard part is to extract content from complex documents containing tables, images, or cross-references.
3. A continual chunking strategy with an overlap is simple. The hard part is to find the optimal strategy for your specific knowledge base and the way you are planning to query it.

In the attached video, I'll show you how you can build an enterprise-grade RAG Pipeline that solves every one of the above problems. I'll use Vectorize. They partnered with me on this post. You can use them to build RAG pipelines optimized for accurate context retrieval. If you have a few documents lying around, set up a free account and give it a try.

Santiago

40,441 views • 1 year ago
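The "chunking strategy with an overlap" that the post above calls the simple case might look like the sketch below. It assumes chunking by word count, and the function name and parameters are hypothetical; real pipelines often chunk by tokens or characters instead.

```python
def chunk_with_overlap(words, size, overlap):
    """Split a list of words into chunks of `size` words.

    Consecutive chunks share `overlap` words, so text near a chunk
    boundary appears in both neighbours.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not words:
        return []
    step = size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


words = "a b c d e f g h i j".split()
for chunk in chunk_with_overlap(words, size=4, overlap=1):
    print(" ".join(chunk))
# a b c d
# d e f g
# g h i j
```

The hard part the post points to is choosing `size` and `overlap` for a specific knowledge base and query style; the code itself is the easy part.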

What to expect this October? We’re building OptimAI Search Engine - the next leap in our network. Not just search. Not just keywords. But contextual search & analysis across the open web and social platforms (X, LinkedIn, Telegram, and more). Powered by the OptimAI community-driven Data Network, it cuts through expensive APIs and unlocks: ✨ Access to the freshest, high-quality, real-time data ✨ Context-aware insights that adapt to your needs ✨ Affordable, open infrastructure for everyone - not just the few This is how we build the foundation for the Agentic AI Systems of tomorrow: intelligent, accessible, and powered by people, not corporations. The future of search is decentralized. And it’s coming soon, powered by the OptimAI Network. Together we make impact.

OptimAI Network

43,650 views • 10 months ago

Tokenization -- turning text into a sequence of integers -- is a key part of generative AI, and most API providers charge per million tokens. How does tokenization work? Learn the details of tokenization and RAG optimization in Retrieval Optimization: From Tokenization to Vector Quantization, created in collaboration with Qdrant and taught by its Developer Relations Lead, Kacper Łukawski.

This course focuses on Retrieval augmented generation (RAG), which has two steps: First, a retriever finds relevant information; then, the generator uses what’s retrieved as context to produce a response. You’ll learn to optimize the first step (the retriever) by understanding how tokenization works and how it impacts the relevance of your search. In addition, you will also learn to measure and improve retrieval quality, speed, and memory. In detail, you’ll:

- Learn about the internal workings of the embedding models and how your text turns into vectors.
- Understand how several tokenizers, such as Byte-Pair Encoding, WordPiece, Unigram, and SentencePiece work.
- Explore common challenges with tokenizers, such as unknown tokens, domain-specific identifiers, and numerical values, that can negatively affect your vector search.
- Understand how to measure the quality of your search across relevance, ranking, and score-related metrics.
- Understand how the main parameters in "HNSW", a graph-based algorithm, affect the relevance and speed of vector search, and how to tune its parameters.
- Experiment with the three major quantization methods – product, scalar, and binary – and learn how they impact memory requirements, search quality, and speed.

By the end of this course, you’ll have a solid understanding of how tokenization functions and how to optimize vector search in your RAG systems. Please sign up here!

Andrew Ng

146,313 views • 1 year ago
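Two of the quantization methods named in the course description above, scalar and binary, are simple enough to sketch directly. This is an illustration under stated assumptions, not Qdrant's implementation: the value range `[-1, 1]`, the function names, and the example vectors are all hypothetical.

```python
def scalar_quantize(vec, lo=-1.0, hi=1.0):
    """Scalar quantization: map each float in [lo, hi] to an integer 0..255.

    One byte per dimension instead of four for a 32-bit float.
    Values outside the range are clamped.
    """
    scale = 255 / (hi - lo)
    return [round((min(max(x, lo), hi) - lo) * scale) for x in vec]


def binary_quantize(vec):
    """Binary quantization: keep only whether each dimension is positive.

    One bit per dimension, at the cost of discarding magnitudes.
    """
    return [1 if x > 0 else 0 for x in vec]


def hamming(a, b):
    """Number of differing bits; a cheap distance between binary codes."""
    return sum(x != y for x, y in zip(a, b))


print(scalar_quantize([-1.0, 0.5, 1.0]))  # [0, 191, 255]

a = binary_quantize([-0.3, 0.7, 0.1, -0.9])
b = binary_quantize([0.2, 0.6, -0.4, -0.1])
print(a, b, hamming(a, b))  # [0, 1, 1, 0] [1, 1, 0, 0] 2
```

Both methods trade search quality for memory and speed, which is the trade-off the course says it measures. Product quantization, the third method, is more involved and is not shown here.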

🚨🚨🚨 We have released a database with the most searched keywords on the App Store, free for everyone! Just yesterday, we sent a message to all Astro ASO Tool users informing them that we had completed the integration of a database with one million keywords into our backend. This database allows us to access increasingly reliable data on popularity. Today, we want to take a further step forward by making this database with 1 million keywords with a popularity value > 5 publicly available to everyone on our website! In this video, Alice Ercolani explains how to use this tool to discover new keywords to track in Astro! Link to the tool in the next post!

Matteo Spada

53,383 views • 9 months ago

Give Claude Code a semantic filesystem 🗃️🛠️ Giving Claude Code access to the right CLI tools over your filesystem turns it into a general agent capable of automating far more knowledge work beyond code - it can do dynamic financial/legal/medical/technical/backoffice analysis over any subset of documents. With our latest release of semtools 💫, you can now manually or *agentically* create a persistent workspace over any subset of files. This gives Claude Code the ability to get blazing-fast, local semantic search over any data, while still allowing it to chain with commands like grep/cat/etc. so that it can load in dynamic context instead of naive top-k vector search. The coding agent can dynamically index data and use those indexes, instead of having to rebuild it every time. So you get the benefits of fast search along with agentic reasoning over CLI tools mentioned above. Come check it out!

Jerry Liu

77,373 views • 10 months ago

Filesystems vs. vector search is the new MCP vs. CLI. Claude uses agentic search, and Dens Sumesh at Mintlify likes filesystems too, but retrieval is still used by Notion, Cursor, and others. We hit the cafes of SF to see what people want.

- The debate is hot. Filesystems won by 1 point.
- Filesystems feel very simple and intuitive.
- RAG requires embedding, vector search, and other stuff.

Filesystems won.

Introducing SMFS, the Supermemory Filesystem. We brought the best of these worlds into one single product: it's a filesystem, but the agent can also do semantic search using grep. Live today. Try it! It works with any sandbox: you mount it and sync with the cloud, it has a sync engine built in, and all file types are supported. Even images and videos can be grepped. So, what do you choose: filesystems or vector search? We have both!

Dhravya Shah

153,119 views • 2 months ago

How do professional RAG applications chunk their text? Let’s cover some Advanced Chunking Techniques. In our latest video, we cover simple chunking methods like splitting documents into sentences or sections. But these methods often fail to ensure that each chunk has independent meaning. Semantic chunking solves exactly this! By measuring the semantic similarity between sentences using vector embeddings, we can combine similar sentences into meaningful chunks. With LLM-based chunking, large language models help break down text effectively, although it can be slow and costly. And what about the newest technique, Late Chunking, which keeps context intact across chunks? More on that soon. 👀 In this video, we cover these advanced techniques in detail. Watch it to learn more. A big shoutout to Daniel Williams for helping create this video! 💚

Femke Plantinga

29,660 views • 1 year ago
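
The semantic chunking idea in the post above (merge neighbouring sentences while their embeddings stay similar) can be sketched roughly as follows. This is a minimal illustration, not the method from the video: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and the `threshold` of 0.3 is an arbitrary assumption chosen for the sample sentences.

```python
import math
from collections import Counter


def toy_embed(sentence):
    # Hypothetical stand-in for a real embedding model:
    # a sparse bag-of-words count vector.
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return Counter(w for w in words if w)


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_chunks(sentences, threshold=0.3, embed=toy_embed):
    # Append a sentence to the current chunk when it is similar enough
    # to the previous sentence; otherwise start a new chunk.
    chunks = []
    prev_vec = None
    for sentence in sentences:
        vec = embed(sentence)
        if chunks and cosine(prev_vec, vec) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
        prev_vec = vec
    return [" ".join(chunk) for chunk in chunks]


sentences = [
    "Vector databases store embeddings.",
    "Embeddings are stored as vectors in databases.",
    "Bananas are yellow.",
    "Yellow bananas taste sweet.",
]
chunks = semantic_chunks(sentences)
for chunk in chunks:
    print(chunk)
```

With a real embedding model passed in as `embed` (and `cosine` adapted to dense vectors), the threshold would need tuning on your own data.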

Matryoshka dolls 🪆 = the key to AI efficiency. Gemini Embedding 2 leverages Matryoshka Representation Learning (MRL) so you can: 🔹 Dynamically truncate vectors for high-speed candidate matching without losing precision 🔹 Slash database costs by choosing a smaller storage footprint without re-indexing 🔹 Adapt to any latency budget or accuracy requirement 📈💸 Learn how this helps build more efficient RAG, semantic search, and more:

Google for Developers

10,746 views • 2 months ago
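
The truncation idea in the post above can be illustrated with a small sketch: keep only the leading dimensions of each embedding for a fast first pass, then rescore the shortlist with the full vectors. This is an illustration under stated assumptions, not the Gemini API: the vectors are made-up numbers, the helper names are hypothetical, and re-normalizing after truncation is a common convention assumed here. In practice, shorter prefixes usually trade some accuracy for speed and storage.

```python
import math


def truncate_and_normalize(vec, dims):
    # Keep the first `dims` components and rescale to unit length.
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        return list(head)
    return [x / norm for x in head]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, docs, k, dims):
    # Rank documents by cosine similarity using only the first `dims` dimensions.
    q = truncate_and_normalize(query, dims)
    scored = [
        (dot(q, truncate_and_normalize(doc, dims)), index)
        for index, doc in enumerate(docs)
    ]
    scored.sort(reverse=True)
    return [index for _, index in scored[:k]]


# Made-up 4-dimensional "embeddings" for illustration only.
query = [1.0, 0.0, 0.0, 0.0]
docs = [
    [0.9, 0.1, 0.0, 0.4],
    [0.0, 1.0, 0.0, 0.0],
    [0.8, 0.2, 0.5, 0.0],
]

# Stage 1: cheap shortlist on the 2-dimensional prefix.
shortlist = top_k(query, docs, k=2, dims=2)

# Stage 2: rescore only the shortlisted documents with the full vectors.
candidates = [docs[i] for i in shortlist]
order = top_k(query, candidates, k=2, dims=4)
reranked = [shortlist[i] for i in order]
print(shortlist, reranked)
```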

I can't believe this guy just made a permanent solution to context bloat and open-sourced it all! When we tested this tool (Context+) on an issue in the OpenCode repository, the agent using it used ~6.5k fewer tokens, and found and fixed the code in half the time. The results were surprising: 6 to 10k tokens saved per prompt, and the task completed in ~2 minutes, while the agent running without the tool took ~4 minutes for the same task and got stuck in loops. Bro built an entire beast using all the modern tools we could think of: undo trees, semantic search by meaning (by haskellforall), advanced refactoring, blast radius, advanced file context trees, restore points... I could keep going. Semantic code search and context trees are the future of agentic coding, and this tool proves it. The feature I loved the most is semantic search and how it gets things done 2x faster with the fewest possible tokens. It makes an agent that actually knows what it's doing instead of just guessing; it derives meaning from your code, similar to RAG. If you aren't optimizing your context, you are just burning money. The developer says this tool is still under development, can behave unexpectedly, and the docs need updates, but the video shows how fast it can be. github: get here:

forloop

226,054 views • 5 months ago

We just released "Large Language Models with Semantic Search", built with Cohere, and taught by Jay Alammar and Luis Serrano. Search is a key part of many applications. Say you need to retrieve documents or products in response to a user query; how can LLMs help? You'll learn about (i) embeddings, to retrieve a collection of documents loosely related to a query, and (ii) LLM-assisted re-ranking, to rank them precisely according to relevance. You'll also go through code showing how to tie all this together to build a complete search system for retrieving relevant Wikipedia articles. Please check it out!

Andrew Ng

596,792 views • 3 years ago

Here's how I would learn data engineering in 2025: 1. The basics: - learn SQL: SELECT, FROM, WHERE, GROUP BY, JOIN, HAVING, etc. - learn Python: data structures (objects, arrays, tuples, namedtuples) and algorithms (recursion, loops) 2. Intermediate: - learn distributed compute: pick up PySpark, Snowflake, or BigQuery - learn data lake architecture: pick up Iceberg or Delta Lake - learn job orchestration: pick up Airflow or Mage - learn data quality: pick up Great Expectations 3. Advanced: - learn the data modeling techniques: one big table vs. Kimball vs. Inmon vs. data vault - learn machine learning features and vector databases: pick up Pinecone and how to fine-tune LLMs with high-quality data. My newsletter has a deeper roadmap here:

Zach Wilson

29,420 views • 1 year ago