Loading video...

Video Failed to Load

There was a problem loading this video. This could be due to a temporary network issue or the video might be unavailable.

LangChain: Chat with Your Data, a new free short course created with Harrison Chase, is now available! In this 1 hour course, you’ll learn how to build one of the most requested LLM-based applications: Answering questions using information from a document or collection of documents (often called Retrieval Augmented... show more

Andrew Ng

1,663,636 subscribers

384,282 views • 3 years ago •via X (Twitter)

Education Science & Technology News & Politics

Anya Rossi• Live Now

Private livecam show

9 Comments

Vlad Romanov3 years ago

@hwchase17 That's interesting; I know firsthand that many manufacturing companies struggle to filter through all the data they're collecting.

Dr Shrivastava A(Generative AI Consultant)3 years ago

@hwchase17 Please further add something to try your function for open source LLM like falcon other than OPENAI. @hwchase17

🧩DemoGPT3 years ago

@hwchase17 Great tutorial. @LangChainAI is the easiest way to build code base for LLM. DemoGPT is the easiest way to build LangChain applications. You can easily try it and give star ⭐⭐⭐ on GitHub:

rohit3 years ago

@hwchase17 who learns from courses now? shit like these are dead 💀

Austin Black3 years ago

@hwchase17 Sick, signing up now, thanks @AndrewYNg and @hwchase17 really helpful for me at this stage.

Don't know who3 years ago

@hwchase17 I don't think a course is required for this, I have been doing this for the last 6 months now... Easy to do !!

Michael McLeod 🐙 🐦3 years ago

@hwchase17 LangchainAI making another great move! congrats on the collab guys!

saba | building open-source AI3 years ago

@hwchase17 Love the concept! At Khoj, we're building an app for people to chat with their documents on their own computer, ideally without any technical knowledge. Here's a demo instance where you can chat with @logseq documentation --

Daniel Gallagher3 years ago

@hwchase17 Great course, thanks for putting it together!

Related Videos

Vector databases are a key part of many LLM applications that need search or data retrieval, for example with Retrieval Augmented Generation (RAG). Learn how they work + how to use them in our new short course, taught by Weaviate AI Database's Sebastia(N_) Witalec ✊🏽✊🏾✊🏿!

Vector databases are a key part of many LLM applications that need search or data retrieval, for example with Retrieval Augmented Generation (RAG). Learn how they work + how to use them in our new short course, taught by Weaviate AI Database's Sebastia(N_) Witalec ✊🏽✊🏾✊🏿!

Andrew Ng

420,683 views • 2 years ago

New AI Agentic course! Learn to use LangGraph to build single and multi-agent LLM applications in AI Agents in LangGraph. This short course, taught by LangChain LangChain founder Harrison Chase Harrison Chase and Tavily founder @weiss_rotem, shows how to integrate agentic search to enhance an agent's knowledge with query-focused answers in predictable formats. Also learn to implement agentic memory to save state for reasoning and debugging, and see how human-in-the-loop input can guide agents at key junctures. You'll build an agent from scratch, then reconstruct it with LangGraph to thoroughly understand the framework. Finally, you'll build a sophisticated essay-writing agent that incorporates all the learnings from the course. Sign up here!

New AI Agentic course! Learn to use LangGraph to build single and multi-agent LLM applications in AI Agents in LangGraph. This short course, taught by LangChain LangChain founder Harrison Chase Harrison Chase and Tavily founder @weiss_rotem, shows how to integrate agentic search to enhance an agent's knowledge with query-focused answers in predictable formats. Also learn to implement agentic memory to save state for reasoning and debugging, and see how human-in-the-loop input can guide agents at key junctures. You'll build an agent from scratch, then reconstruct it with LangGraph to thoroughly understand the framework. Finally, you'll build a sophisticated essay-writing agent that incorporates all the learnings from the course. Sign up here!

Andrew Ng

151,484 views • 2 years ago

New short course on advanced retrieval for RAG (retrieval augmented generation)! RAG fetches relevant documents to give context to an LLM. In Advanced Retrieval for AI with Chroma, taught by Chroma founder anton 🇺🇸, you’ll learn: (i) Query expansion using an LLM to rewrite and improve a query, by either generating either additional relevant queries or a hypothetical answer to the query. (ii) Reranking using a cross-encoder - a model trained to measure similarity between two inputs presented simultaneously. Reranking reorders retrieved documents based on the cross-encoder similarity measure. (iii) Constructing and training an Embedding Adaptor, which is a model that adapts the embedding values to be more relevant to your use case. Each of these techniques can help you build much better RAG systems. Please sign up for the course here:

New short course on advanced retrieval for RAG (retrieval augmented generation)! RAG fetches relevant documents to give context to an LLM. In Advanced Retrieval for AI with Chroma, taught by Chroma founder anton 🇺🇸, you’ll learn: (i) Query expansion using an LLM to rewrite and improve a query, by either generating either additional relevant queries or a hypothetical answer to the query. (ii) Reranking using a cross-encoder - a model trained to measure similarity between two inputs presented simultaneously. Reranking reorders retrieved documents based on the cross-encoder similarity measure. (iii) Constructing and training an Embedding Adaptor, which is a model that adapts the embedding values to be more relevant to your use case. Each of these techniques can help you build much better RAG systems. Please sign up for the course here:

Andrew Ng

191,219 views • 2 years ago

Announcing a new Coursera course: Retrieval Augmented Generation (RAG) You'll learn to build high performance, production-ready RAG systems in this hands-on, in-depth course created by and taught by Zain, experienced AI and ML engineer, researcher, and educator. RAG is a critical component today of many LLM-based applications in customer support, internal company Q&A systems, even many of the leading chatbots that use web search to answer your questions. This course teaches you in-depth how to make RAG work well. LLMs can produce generic or outdated responses, especially when asked specialized questions not covered in its training data. RAG is the most widely used technique for addressing this. It brings in data from new data sources, such as internal documents or recent news, to give the LLM the relevant context to private, recent, or specialized information. This lets it generate more grounded and accurate responses. In this course, you’ll learn to design and implement every part of a RAG system, from retrievers to vector databases to generation to evals. You’ll learn about the fundamental principles behind RAG and how to optimize it at both the component and whole-system levels. As AI evolves, RAG is evolving too. New models can handle longer context windows, reason more effectively, and can be parts of complex agentic workflows. One exciting growth area is Agentic RAG, in which an AI agent at runtime (rather than it being hardcoded at development time) autonomously decides what data to retrieve, and when/how to go deeper. Even with this evolution, access to high-quality data at runtime is essential, which is why RAG is a key part of so many applications. You'll learn via hands-on experiences to: - Build a RAG system with retrieval and prompt augmentation - Compare retrieval methods like BM25, semantic search, and Reciprocal Rank Fusion - Chunk, index, and retrieve documents using a Weaviate vector database and a news dataset - Develop a chatbot, using open-source LLMs hosted by Together AI, for a fictional store that answers product and FAQ questions - Use evals to drive improving reliability, and incorporate multi-modal data RAG is an important foundational technique. Become good at it through this course! Please sign up here:

Announcing a new Coursera course: Retrieval Augmented Generation (RAG) You'll learn to build high performance, production-ready RAG systems in this hands-on, in-depth course created by and taught by Zain, experienced AI and ML engineer, researcher, and educator. RAG is a critical component today of many LLM-based applications in customer support, internal company Q&A systems, even many of the leading chatbots that use web search to answer your questions. This course teaches you in-depth how to make RAG work well. LLMs can produce generic or outdated responses, especially when asked specialized questions not covered in its training data. RAG is the most widely used technique for addressing this. It brings in data from new data sources, such as internal documents or recent news, to give the LLM the relevant context to private, recent, or specialized information. This lets it generate more grounded and accurate responses. In this course, you’ll learn to design and implement every part of a RAG system, from retrievers to vector databases to generation to evals. You’ll learn about the fundamental principles behind RAG and how to optimize it at both the component and whole-system levels. As AI evolves, RAG is evolving too. New models can handle longer context windows, reason more effectively, and can be parts of complex agentic workflows. One exciting growth area is Agentic RAG, in which an AI agent at runtime (rather than it being hardcoded at development time) autonomously decides what data to retrieve, and when/how to go deeper. Even with this evolution, access to high-quality data at runtime is essential, which is why RAG is a key part of so many applications. You'll learn via hands-on experiences to: - Build a RAG system with retrieval and prompt augmentation - Compare retrieval methods like BM25, semantic search, and Reciprocal Rank Fusion - Chunk, index, and retrieve documents using a Weaviate vector database and a news dataset - Develop a chatbot, using open-source LLMs hosted by Together AI, for a fictional store that answers product and FAQ questions - Use evals to drive improving reliability, and incorporate multi-modal data RAG is an important foundational technique. Become good at it through this course! Please sign up here:

Andrew Ng

124,458 views • 11 months ago

We just released "Large Language Models with Semantic Search”, built with Cohere, and taught by Jay Alammar and Luis Serrano. Search is a key part of many applications. Say, you need to retrieve documents or products in response to a user query; how can LLMs help? You’ll learn about (i) Embeddings, to retrieve a collection of documents loosely related to a query, and (ii) LLM assisted re-ranking, to rank them precisely according to relevance. You’ll also go through code showing how to tie all this together to build a complete search system for retrieving relevant Wikipedia articles. Please check it out!

We just released "Large Language Models with Semantic Search”, built with Cohere, and taught by Jay Alammar and Luis Serrano. Search is a key part of many applications. Say, you need to retrieve documents or products in response to a user query; how can LLMs help? You’ll learn about (i) Embeddings, to retrieve a collection of documents loosely related to a query, and (ii) LLM assisted re-ranking, to rank them precisely according to relevance. You’ll also go through code showing how to tie all this together to build a complete search system for retrieving relevant Wikipedia articles. Please check it out!

Andrew Ng

595,837 views • 2 years ago

New short course on Building Applications with Vector Databases, taught by Pinecone’s Tim Tully! At the heart of a vector database is the ability to store a collection of vectors and then query against that, meaning input a new vector and find similar ones. This is useful for many AI applications. In this course, you'll learn how to use vector databases to build: (i) Semantic Search: Create a text search tool that goes beyond keyword matching, and instead focuses on the meaning of content. (ii) RAG (retrieval augmented generation): Enhance your LLM output by incorporating context from sources the model wasn't trained on. (iii) Recommender System: Combine semantic search and RAG to recommend topics, and demonstrate it with a news article recommender. (iv) Hybrid Search: Build an application that finds items using both images and descriptive text -- by combining both sparse and dense vector representations of the data -- using an eCommerce dataset as an example. (v) Image Similarity: Use image vector embeddings to create an app to compare facial features, using a database of public figures to determine the likeness between them. (vi) Anomaly Detection: Build an anomaly detection app that identifies unusual patterns in network communication logs. I hope you’ll enjoy learning how to build all these types of applications! Please sign up here:

New short course on Building Applications with Vector Databases, taught by Pinecone’s Tim Tully! At the heart of a vector database is the ability to store a collection of vectors and then query against that, meaning input a new vector and find similar ones. This is useful for many AI applications. In this course, you'll learn how to use vector databases to build: (i) Semantic Search: Create a text search tool that goes beyond keyword matching, and instead focuses on the meaning of content. (ii) RAG (retrieval augmented generation): Enhance your LLM output by incorporating context from sources the model wasn't trained on. (iii) Recommender System: Combine semantic search and RAG to recommend topics, and demonstrate it with a news article recommender. (iv) Hybrid Search: Build an application that finds items using both images and descriptive text -- by combining both sparse and dense vector representations of the data -- using an eCommerce dataset as an example. (v) Image Similarity: Use image vector embeddings to create an app to compare facial features, using a database of public figures to determine the likeness between them. (vi) Anomaly Detection: Build an anomaly detection app that identifies unusual patterns in network communication logs. I hope you’ll enjoy learning how to build all these types of applications! Please sign up here:

Andrew Ng

137,073 views • 2 years ago

Our new short course, Knowledge Graphs for RAG, is now available! Knowledge graphs are a data structure that is great at capturing complex relationships between data of multiple types. By enabling more sophisticated retrieval of text than similarity search alone, knowledge graphs can improve the context you pass to the LLM and the performance of your RAG applications. In this course, taught by Andreas Kollegger of Neo4j, you’ll - Explore how knowledge graphs work by building a graph of public financial documents from scratch - Learn to write queries that retrieve text and data from the graph and use it to enhance the context you pass to an LLM chatbot - Combine a knowledge graph with a question-answer chain to build better RAG-powered chat systems Sign up here!

Our new short course, Knowledge Graphs for RAG, is now available! Knowledge graphs are a data structure that is great at capturing complex relationships between data of multiple types. By enabling more sophisticated retrieval of text than similarity search alone, knowledge graphs can improve the context you pass to the LLM and the performance of your RAG applications. In this course, taught by Andreas Kollegger of Neo4j, you’ll - Explore how knowledge graphs work by building a graph of public financial documents from scratch - Learn to write queries that retrieve text and data from the graph and use it to enhance the context you pass to an LLM chatbot - Combine a knowledge graph with a question-answer chain to build better RAG-powered chat systems Sign up here!

Andrew Ng

244,266 views • 2 years ago

🔗 New LangChain Academy Course: Introduction to LangChain (Python) 🔗 Learn how to build with LangChain – our open source framework that makes it easy to start building agents with any model provider. In this course, you’ll create agents that can reason, use tools, and take action, and learn how to debug their behavior with LangSmith. Along the way, you’ll: - Build an agent with the `create_agent` abstraction - Use LangChain’s core building blocks: Models, Messages, Memory, and Tools - Customize your agent with middleware - Debug your agent with LangSmith Observability & Studio By the end of the course, you’ll have assembled a full team of personal assistants. Enroll for free ➡️

🔗 New LangChain Academy Course: Introduction to LangChain (Python) 🔗 Learn how to build with LangChain – our open source framework that makes it easy to start building agents with any model provider. In this course, you’ll create agents that can reason, use tools, and take action, and learn how to debug their behavior with LangSmith. Along the way, you’ll: - Build an agent with the `create_agent` abstraction - Use LangChain’s core building blocks: Models, Messages, Memory, and Tools - Customize your agent with middleware - Debug your agent with LangSmith Observability & Studio By the end of the course, you’ll have assembled a full team of personal assistants. Enroll for free ➡️

LangChain

41,016 views • 6 months ago

Our first Generative AI short course in JavaScript! GitHub recently reported that JavaScript is again the world’s most popular programming language. To support web developers exploring and developing with generative AI, we just launched a new short course in JavaScript taught by Jacob Lee, founding engineer at . In Build LLM Apps with LangChain.js you’ll learn elements common in AI development, including: (i) Using data loaders to pull data from common sources such as PDFs, websites, and databases (ii) Prompts, which are used to provide the LLM context (iii) Modules to support RAG such as text splitters and integrations with vector stores (iv) Working with different models to write applications that are not vendor-specific (v) Parsers, which extract and format the output for your downstream code to process You’ll also build with the LangChain Expression Language, which lets you easily compose sequences (also called chains) of modules to perform complex tasks using LLMs. Putting all this together, you’ll also work on a conversational question-answering LLM application capable of using external data as context. Please sign up here:

Our first Generative AI short course in JavaScript! GitHub recently reported that JavaScript is again the world’s most popular programming language. To support web developers exploring and developing with generative AI, we just launched a new short course in JavaScript taught by Jacob Lee, founding engineer at . In Build LLM Apps with LangChain.js you’ll learn elements common in AI development, including: (i) Using data loaders to pull data from common sources such as PDFs, websites, and databases (ii) Prompts, which are used to provide the LLM context (iii) Modules to support RAG such as text splitters and integrations with vector stores (iv) Working with different models to write applications that are not vendor-specific (v) Parsers, which extract and format the output for your downstream code to process You’ll also build with the LangChain Expression Language, which lets you easily compose sequences (also called chains) of modules to perform complex tasks using LLMs. Putting all this together, you’ll also work on a conversational question-answering LLM application capable of using external data as context. Please sign up here:

Andrew Ng

284,275 views • 2 years ago

LangChain Academy is live! Our first course — Introduction to LangGraph — teaches you the in-and-outs of building a reliable AI agent. In this course, you’ll learn how to: 🛠️ Build agents with LangGraph's graph-based workflows 🔄 Use memory + human-in-the-loop for smarter, self-corrective agents 📚 Create your own AI assistant that can perform knowledge tasks Enroll now for free ➡️ Bring LangChain Academy to your company ➡️

LangChain Academy is live! Our first course — Introduction to LangGraph — teaches you the in-and-outs of building a reliable AI agent. In this course, you’ll learn how to: 🛠️ Build agents with LangGraph's graph-based workflows 🔄 Use memory + human-in-the-loop for smarter, self-corrective agents 📚 Create your own AI assistant that can perform knowledge tasks Enroll now for free ➡️ Bring LangChain Academy to your company ➡️

LangChain

105,434 views • 1 year ago

New short course! Quality and Safety for LLM Applications, created with WhyLabs (an AI Fund portfolio company) and taught by Bernease Herman, shows how you can mitigate hallucinations, data leakage, and jailbreaks. Come learn more in the course, available now!

New short course! Quality and Safety for LLM Applications, created with WhyLabs (an AI Fund portfolio company) and taught by Bernease Herman, shows how you can mitigate hallucinations, data leakage, and jailbreaks. Come learn more in the course, available now!

Andrew Ng

105,877 views • 2 years ago

I’m excited to announce a new course with DeepLearning.AI - Building Agentic RAG 💫 In this course, you’ll learn how to build a research assistant that can reason over multiple documents and answer complex questions. You’ll also learn how to step through the execution of the agent and steer it with human feedback. This represents a big step beyond any standard RAG pipeline, which is mostly good for simple questions over a small set of documents. Learn the layers first and then put them together: ✅ Routing ✅ Tool Use ✅ Multi-step reasoning with Memory ✅ Tool retrieval ✅ Debugging + user input Check it out!

I’m excited to announce a new course with DeepLearning.AI - Building Agentic RAG 💫 In this course, you’ll learn how to build a research assistant that can reason over multiple documents and answer complex questions. You’ll also learn how to step through the execution of the agent and steer it with human feedback. This represents a big step beyond any standard RAG pipeline, which is mostly good for simple questions over a small set of documents. Learn the layers first and then put them together: ✅ Routing ✅ Tool Use ✅ Multi-step reasoning with Memory ✅ Tool retrieval ✅ Debugging + user input Check it out!

Jerry Liu

76,280 views • 2 years ago

Build and customize complex AI applications with a flexible framework in this new short course, Building AI Applications with Haystack. Created in collaboration with deepset, makers of Haystack, and taught by Tuana, who is the developer relations lead for Haystack at deepset. Generative AI technology is changing rapidly and it can be challenging to integrate APIs from different LLMs, vector databases, and various tools such as web search. In this course, you will learn how to use the Haystack framework to make your development process more modular, allowing you to manage complexity and focus more on building your application. In detail, you’ll: - Build a RAG pipeline using Haystack’s main building blocks – components, pipelines, and document stores. - Create custom components in your pipeline by building a Hacker News summarizer that extends your app’s ability to access APIs. - Use conditional routing to create a branching pipeline with a fallback to web search mechanism when the LLM does not have the necessary context to respond to the user's query. - Build a self-reflecting agent for named entity recognition that loops using an output validator custom component. - Create a chat agent using OpenAI's function-calling capabilities which allow you to provide Haystack pipelines as tools to the LLM, enhancing that agent's capabilities. By the end of this course, you will learn a high-level orchestration framework that can help make your applications flexible, extendible, and maintainable, even as the technology stack changes, new user needs arise, and you add new features to your application. Please sign up here:

Build and customize complex AI applications with a flexible framework in this new short course, Building AI Applications with Haystack. Created in collaboration with deepset, makers of Haystack, and taught by Tuana, who is the developer relations lead for Haystack at deepset. Generative AI technology is changing rapidly and it can be challenging to integrate APIs from different LLMs, vector databases, and various tools such as web search. In this course, you will learn how to use the Haystack framework to make your development process more modular, allowing you to manage complexity and focus more on building your application. In detail, you’ll: - Build a RAG pipeline using Haystack’s main building blocks – components, pipelines, and document stores. - Create custom components in your pipeline by building a Hacker News summarizer that extends your app’s ability to access APIs. - Use conditional routing to create a branching pipeline with a fallback to web search mechanism when the LLM does not have the necessary context to respond to the user's query. - Build a self-reflecting agent for named entity recognition that loops using an output validator custom component. - Create a chat agent using OpenAI's function-calling capabilities which allow you to provide Haystack pipelines as tools to the LLM, enhancing that agent's capabilities. By the end of this course, you will learn a high-level orchestration framework that can help make your applications flexible, extendible, and maintainable, even as the technology stack changes, new user needs arise, and you add new features to your application. Please sign up here:

Andrew Ng

53,788 views • 1 year ago

Building a reliable RAG system doesn’t stop at retrieval and generation, you need observability too. In the Retrieval Augmented Generation course, you'll explore how LLM observability platforms can help you: - Trace prompts through each step of the pipeline - Log and evaluate component behavior - Run experiments and monitor system performance over time The lesson uses Phoenix (by Arize) as an open-source example, but the techniques apply broadly, and you'll also learn where traditional tools like Grafana or Datadog fit in. 🧠 Learn how to monitor and improve your RAG systems in the full course:

Building a reliable RAG system doesn’t stop at retrieval and generation, you need observability too. In the Retrieval Augmented Generation course, you'll explore how LLM observability platforms can help you: - Trace prompts through each step of the pipeline - Log and evaluate component behavior - Run experiments and monitor system performance over time The lesson uses Phoenix (by Arize) as an open-source example, but the techniques apply broadly, and you'll also learn where traditional tools like Grafana or Datadog fit in. 🧠 Learn how to monitor and improve your RAG systems in the full course:

DeepLearning.AI

14,702 views • 10 months ago

New short course: Building Your Own Database Agent, created with Microsoft Microsoft Azure's Adrián González Sánchez (He/Him/His). You'll learn to build an AI assistant that translates natural language questions into SQL queries. Querying with natural language empowers everyone in your organization, from business leaders to developers, to access data insights directly, using plain English. You’ll use the Azure OpenAI Service and LangChain to implement retrieval augmented generation and function calling, and leverage the Assistants API. You'll gain hands-on experience building an agent that can reason over both CSV files and SQL databases. Please sign up here!

New short course: Building Your Own Database Agent, created with Microsoft Microsoft Azure's Adrián González Sánchez (He/Him/His). You'll learn to build an AI assistant that translates natural language questions into SQL queries. Querying with natural language empowers everyone in your organization, from business leaders to developers, to access data insights directly, using plain English. You’ll use the Azure OpenAI Service and LangChain to implement retrieval augmented generation and function calling, and leverage the Assistants API. You'll gain hands-on experience building an agent that can reason over both CSV files and SQL databases. Please sign up here!

Andrew Ng

109,879 views • 2 years ago

New course! Generative AI with Large Language Models, created with Amazon Web Services and hosted on Coursera. This course goes deep into the technical foundations of LLMs and how to use them. You can sign up here: You’ll work through the full life-cycle of a generative AI project, and learn specific techniques like RLHF; zero-shot, one-shot, and few-shot learning with LLMs; advanced prompting frameworks like ReAct; even fine-tuning LLMs, and gain hands-on practice with all of these techniques. Instructors Antje Barth Chris Fregly Shelbee Eigenbrode and Mike G Chambers all do incredible Generative AI work at AWS, and have supported many companies to build creative LLM applications. They bring tremendous practical LLM expertise to this course. I'm confident you’ll finish this course with a deeper understanding of how LLMs work, and how to use them. I hope you enjoy the course!

New course! Generative AI with Large Language Models, created with Amazon Web Services and hosted on Coursera. This course goes deep into the technical foundations of LLMs and how to use them. You can sign up here: You’ll work through the full life-cycle of a generative AI project, and learn specific techniques like RLHF; zero-shot, one-shot, and few-shot learning with LLMs; advanced prompting frameworks like ReAct; even fine-tuning LLMs, and gain hands-on practice with all of these techniques. Instructors Antje Barth Chris Fregly Shelbee Eigenbrode and Mike G Chambers all do incredible Generative AI work at AWS, and have supported many companies to build creative LLM applications. They bring tremendous practical LLM expertise to this course. I'm confident you’ll finish this course with a deeper understanding of how LLMs work, and how to use them. I hope you enjoy the course!

Andrew Ng

467,882 views • 3 years ago

New Short Course: Getting Structured LLM Output! Learn how to get structured outputs from your LLM applications in this course, built in partnership with .txt, and taught by Will Kurt, a Founding Engineer, and , Developer Relations Engineer. It's challenging for software to automatically parse through an LLM's freeform text outputs. Structured outputs—like JSON—solve this by converting natural language into consistent, clear, data that a machine can read and process. This course teaches you how to generate structured outputs while building several use cases, including a social media analysis agent. You’ll learn about structured outputs and efficient ways to generate outputs in your defined schema or format. You’ll begin by using structured output APIs, then use re-prompting libraries like “instructor” to generate structured output. Finally, you’ll learn how constrained decoding works; this is a very clever technique in which constraints are applied on each subsequent token generated, blocking any tokens that don’t fit your defined schema. In detail, you’ll: - Learn why structured outputs are important, how they allow for scalable software development, and the different approaches to generate them, including vendor-provided APIs, re-prompting libraries, and structured generation. - Build a simple social media agent using OpenAI’s structured output API, learn how to define a model's desired structured output using Pydantic, and perform basic programming with your outputs, such as importing structured data into a data frame using pandas. - Learn how to use the open-source library "instructor," which checks the structured output of the model and re-prompts the model until it validates the desired output, and explore the limitations of this approach. - Understand how structured generation by the “outlines” library works by modifying LLM logits, on a per-generated-token basis based on the desired format, to give a particular output structure. - Learn how regular expressions, which outlines works with, are represented as finite-state machines, and how they can be used to develop a range of structured outputs beyond JSON. By the end of this course, you’ll have broadened your knowledge of the approaches you can use to get structured outputs from your LLM applications. Please sign up here:

New Short Course: Getting Structured LLM Output! Learn how to get structured outputs from your LLM applications in this course, built in partnership with .txt, and taught by Will Kurt, a Founding Engineer, and , Developer Relations Engineer. It's challenging for software to automatically parse through an LLM's freeform text outputs. Structured outputs—like JSON—solve this by converting natural language into consistent, clear, data that a machine can read and process. This course teaches you how to generate structured outputs while building several use cases, including a social media analysis agent. You’ll learn about structured outputs and efficient ways to generate outputs in your defined schema or format. You’ll begin by using structured output APIs, then use re-prompting libraries like “instructor” to generate structured output. Finally, you’ll learn how constrained decoding works; this is a very clever technique in which constraints are applied on each subsequent token generated, blocking any tokens that don’t fit your defined schema. In detail, you’ll: - Learn why structured outputs are important, how they allow for scalable software development, and the different approaches to generate them, including vendor-provided APIs, re-prompting libraries, and structured generation. - Build a simple social media agent using OpenAI’s structured output API, learn how to define a model's desired structured output using Pydantic, and perform basic programming with your outputs, such as importing structured data into a data frame using pandas. - Learn how to use the open-source library "instructor," which checks the structured output of the model and re-prompts the model until it validates the desired output, and explore the limitations of this approach. - Understand how structured generation by the “outlines” library works by modifying LLM logits, on a per-generated-token basis based on the desired format, to give a particular output structure. - Learn how regular expressions, which outlines works with, are represented as finite-state machines, and how they can be used to develop a range of structured outputs beyond JSON. By the end of this course, you’ll have broadened your knowledge of the approaches you can use to get structured outputs from your LLM applications. Please sign up here:

Andrew Ng

89,703 views • 1 year ago

Tokenization -- turning text into a sequence of integers -- is a key part of generative AI, and most API providers charge per million tokens. How does tokenization work? Learn the details of tokenization and RAG optimization in Retrieval Optimization: From Tokenization to Vector Quantization, created in collaboration with Qdrant and taught by its Developer Relations Lead, Kacper Łukawski. This course focuses on Retrieval augmented generation (RAG), which has two steps: First, a retriever finds relevant information; then, the generator uses what’s retrieved as context to produce a response. You’ll learn to optimize the first step (the retriever) by understanding how tokenization works and how it impacts the relevance of your search. In addition, you will also learn to measure and improve retrieval quality, speed, and memory. In detail, you’ll: - Learn about the internal workings of the embedding models and how your text turns into vectors. - Understand how several tokenizers, such as Byte-Pair Encoding, WordPiece, Unigram, and SentencePiece work. - Explore common challenges with tokenizers, such as unknown tokens, domain-specific identifiers, and numerical values, that can negatively affect your vector search. - Understand how to measure the quality of your search across relevance, ranking, and score-related metrics. - Understand how the main parameters in "HNSW", a graph-based algorithm, affect the relevance and speed of vector search, and how to tune its parameters. - Experiment with the three major quantization methods – product, scalar, and binary – and learn how they impact memory requirements, search quality, and speed. By the end of this course, you’ll have a solid understanding of how tokenization functions and how to optimize vector search in your RAG systems. Please sign up here!

Tokenization -- turning text into a sequence of integers -- is a key part of generative AI, and most API providers charge per million tokens. How does tokenization work? Learn the details of tokenization and RAG optimization in Retrieval Optimization: From Tokenization to Vector Quantization, created in collaboration with Qdrant and taught by its Developer Relations Lead, Kacper Łukawski. This course focuses on Retrieval augmented generation (RAG), which has two steps: First, a retriever finds relevant information; then, the generator uses what’s retrieved as context to produce a response. You’ll learn to optimize the first step (the retriever) by understanding how tokenization works and how it impacts the relevance of your search. In addition, you will also learn to measure and improve retrieval quality, speed, and memory. In detail, you’ll: - Learn about the internal workings of the embedding models and how your text turns into vectors. - Understand how several tokenizers, such as Byte-Pair Encoding, WordPiece, Unigram, and SentencePiece work. - Explore common challenges with tokenizers, such as unknown tokens, domain-specific identifiers, and numerical values, that can negatively affect your vector search. - Understand how to measure the quality of your search across relevance, ranking, and score-related metrics. - Understand how the main parameters in "HNSW", a graph-based algorithm, affect the relevance and speed of vector search, and how to tune its parameters. - Experiment with the three major quantization methods – product, scalar, and binary – and learn how they impact memory requirements, search quality, and speed. By the end of this course, you’ll have a solid understanding of how tokenization functions and how to optimize vector search in your RAG systems. Please sign up here!

Andrew Ng

146,313 views • 1 year ago