正在加载视频...

视频加载失败

加载此视频时出现问题。这可能是由于临时网络问题，或视频可能不可用。

New course! Generative AI with Large Language Models, created with Amazon Web Services and hosted on Coursera. This course goes deep into the technical foundations of LLMs and how to use them. You can sign up here: You’ll work through the full life-cycle of a generative AI project, and... learn specific techniques like RLHF; zero-shot, one-shot, and few-shot learning with LLMs; advanced prompting frameworks like ReAct; even fine-tuning LLMs, and gain hands-on practice with all of these techniques. Instructors Antje Barth Chris Fregly Shelbee Eigenbrode and Mike G Chambers all do incredible Generative AI work at AWS, and have supported many companies to build creative LLM applications. They bring tremendous practical LLM expertise to this course. I'm confident you’ll finish this course with a deeper understanding of how LLMs work, and how to use them. I hope you enjoy the course!show more

Andrew Ng

1,717,949 subscribers

467,912 次观看 • 3 年前 •via X (Twitter)

科学技术教育健康养生

Anya Rossi• Live Now

Private livecam show

10 条评论

Amazon Web Services 的头像

Amazon Web Services3 年前

@DeepLearningAI_ @AWS @coursera We’re thrilled to see this course go live, @AndrewYNg!

Multimodal 的头像

Multimodal3 年前

@AWS @coursera Amazing course. Can see this course helping bridge the gap between novice and expert. The hands on experience with real world projects and excellent coaching from mentors will truly bring a great experience! Great work, @AndrewYNg

Coursera 的头像

Coursera3 年前

@AWS We're so excited about this new course! 😄

Jealous Stranger 的头像

Jealous Stranger3 年前

@AWS @coursera Ground breaking research -> companies using it -> Coursera course in few months, nice!

Techholic 的头像

Techholic3 年前

@AWS @coursera Hi @AndrewYNg is this course suits people with no idea about AI ?To understand the basics and take their first steps into AI world?

Jose M. Seoane 的头像

Jose M. Seoane3 年前

@juantomas @AWS @coursera Looks like we got summer homework! Thank you @AndrewYNg

Uday Teki 🇺🇸🇮🇳🏴󠁧󠁢󠁥󠁮󠁧󠁿 的头像

Uday Teki 🇺🇸🇮🇳🏴󠁧󠁢󠁥󠁮󠁧󠁿3 年前

@AWS @coursera This is awesome Andrew 👏🏽🫱🏼‍🫲🏾🙏🏾

Reppi.ai (A.I. Powered Speech To Text Notes) 的头像

Reppi.ai (A.I. Powered Speech To Text Notes)3 年前

@AWS @coursera Putting this at the top of our research plan! We will be looking to implement into our products.

Rufus 的头像

Rufus3 年前

@AWS @coursera Excited about this, Congratulations. I will definitely partake.

Min Choi 的头像

Min Choi3 年前

@DeepLearningAI_ @AWS @coursera Awesome! Checking out the course now 🔥

相关视频

New short course on Serverless LLM apps with Amazon Bedrock, taught by Amazon Web Services' Mike G Chambers! A serverless architecture enables you to quickly deploy your applications without needing to set up and manage compute servers to run your applications on, the maintenance of which can be another full-time job. In this course, you’ll learn how to do this by using an event-driven architecture to build complex AI workflows. Mike illustrate these concepts by building a cool application that automatically detects incoming customer inquiries, transcribes them with ASR (automatic speech recognition), summarizes them with an LLM using Bedrock, and deploys serverless with AWS Lambda. I hope this course makes it much easier for you to build and deploy LLM applications requiring multi-step AI workflows. Please sign up here:

New short course on Serverless LLM apps with Amazon Bedrock, taught by Amazon Web Services' Mike G Chambers! A serverless architecture enables you to quickly deploy your applications without needing to set up and manage compute servers to run your applications on, the maintenance of which can be another full-time job. In this course, you’ll learn how to do this by using an event-driven architecture to build complex AI workflows. Mike illustrate these concepts by building a cool application that automatically detects incoming customer inquiries, transcribes them with ASR (automatic speech recognition), summarizes them with an LLM using Bedrock, and deploys serverless with AWS Lambda. I hope this course makes it much easier for you to build and deploy LLM applications requiring multi-step AI workflows. Please sign up here:

Andrew Ng

108,171 次观看 • 2 年前

New short course on Fine-tuning LLMs! Many developers are moving beyond only prompting, to also fine-tuning LLMs - that is, taking a pre-trained model and training it further on your own data, which can deliver superior results inexpensively. In this course, Sharon Zhou, CEO of Lamini (disclosure: I’m a minor shareholder) shows you how to recognize when fine-tuning can be help, and how to train an open-source LLM on your own data. I hope you enjoy the course!

New short course on Fine-tuning LLMs! Many developers are moving beyond only prompting, to also fine-tuning LLMs - that is, taking a pre-trained model and training it further on your own data, which can deliver superior results inexpensively. In this course, Sharon Zhou, CEO of Lamini (disclosure: I’m a minor shareholder) shows you how to recognize when fine-tuning can be help, and how to train an open-source LLM on your own data. I hope you enjoy the course!

Andrew Ng

502,821 次观看 • 2 年前

Learn how to build an optimized LLM inference system from the ground up in our new short course, Efficiently Serving LLMs, built in collaboration with Predibase by Rubrik and taught by Travis Addair. Whether you're serving your own LLM or using a model hosting service, this course will give you a deep understanding of the optimizations required to efficiently serve many users at once. - Learn how LLMs generate text one token at a time, and how techniques like KV caching, continuous batching, and quantization speed things up and optimize memory usage for serving multiple users. - Benchmark the performance of these LLM optimizations to explore the trade-offs between quickly responding to an individual user’s request vs. serving many users at once. - Use techniques like low-rank adaptation (LoRA) to efficiently serve hundreds of unique, custom fine-tuned models on a single device, without sacrificing throughput. - Use Predibase's LoRAX framework to see optimization techniques in action on a real LLM server. Sign up here:

Learn how to build an optimized LLM inference system from the ground up in our new short course, Efficiently Serving LLMs, built in collaboration with Predibase by Rubrik and taught by Travis Addair. Whether you're serving your own LLM or using a model hosting service, this course will give you a deep understanding of the optimizations required to efficiently serve many users at once. - Learn how LLMs generate text one token at a time, and how techniques like KV caching, continuous batching, and quantization speed things up and optimize memory usage for serving multiple users. - Benchmark the performance of these LLM optimizations to explore the trade-offs between quickly responding to an individual user’s request vs. serving many users at once. - Use techniques like low-rank adaptation (LoRA) to efficiently serve hundreds of unique, custom fine-tuned models on a single device, without sacrificing throughput. - Use Predibase's LoRAX framework to see optimization techniques in action on a real LLM server. Sign up here:

Andrew Ng

104,727 次观看 • 2 年前

New course: Transformers in Practice. You'll get a practical view of how transformer-based LLMs work, so you can reason about their behavior, diagnose problems like slow inference, and make smarter decisions about deployment. This course is built in partnership with AMD and taught by Sharon Zhou. You'll see how transformers generate text one token at a time, how the model decides which earlier words matter most when predicting the next one, and how techniques like quantization speed up inference on GPUs. This is not a video-only course; interactive visualizations throughout let you play with these concepts and build intuition that sticks. Skills you'll gain: - Understand why LLMs hallucinate, and RAG and chain-of-thought shape what they generate - Look inside the model to see how attention and layers combine to predict the next token - Diagnose inference bottlenecks and learn the techniques that speed up transformers on GPUs Join and understand what's really happening inside your LLMs:

New course: Transformers in Practice. You'll get a practical view of how transformer-based LLMs work, so you can reason about their behavior, diagnose problems like slow inference, and make smarter decisions about deployment. This course is built in partnership with AMD and taught by Sharon Zhou. You'll see how transformers generate text one token at a time, how the model decides which earlier words matter most when predicting the next one, and how techniques like quantization speed up inference on GPUs. This is not a video-only course; interactive visualizations throughout let you play with these concepts and build intuition that sticks. Skills you'll gain: - Understand why LLMs hallucinate, and RAG and chain-of-thought shape what they generate - Look inside the model to see how attention and layers combine to predict the next token - Diagnose inference bottlenecks and learn the techniques that speed up transformers on GPUs Join and understand what's really happening inside your LLMs:

Andrew Ng

119,783 次观看 • 2 个月前

Our first Generative AI short course in JavaScript! GitHub recently reported that JavaScript is again the world’s most popular programming language. To support web developers exploring and developing with generative AI, we just launched a new short course in JavaScript taught by Jacob Lee, founding engineer at . In Build LLM Apps with LangChain.js you’ll learn elements common in AI development, including: (i) Using data loaders to pull data from common sources such as PDFs, websites, and databases (ii) Prompts, which are used to provide the LLM context (iii) Modules to support RAG such as text splitters and integrations with vector stores (iv) Working with different models to write applications that are not vendor-specific (v) Parsers, which extract and format the output for your downstream code to process You’ll also build with the LangChain Expression Language, which lets you easily compose sequences (also called chains) of modules to perform complex tasks using LLMs. Putting all this together, you’ll also work on a conversational question-answering LLM application capable of using external data as context. Please sign up here:

Our first Generative AI short course in JavaScript! GitHub recently reported that JavaScript is again the world’s most popular programming language. To support web developers exploring and developing with generative AI, we just launched a new short course in JavaScript taught by Jacob Lee, founding engineer at . In Build LLM Apps with LangChain.js you’ll learn elements common in AI development, including: (i) Using data loaders to pull data from common sources such as PDFs, websites, and databases (ii) Prompts, which are used to provide the LLM context (iii) Modules to support RAG such as text splitters and integrations with vector stores (iv) Working with different models to write applications that are not vendor-specific (v) Parsers, which extract and format the output for your downstream code to process You’ll also build with the LangChain Expression Language, which lets you easily compose sequences (also called chains) of modules to perform complex tasks using LLMs. Putting all this together, you’ll also work on a conversational question-answering LLM application capable of using external data as context. Please sign up here:

Andrew Ng

284,320 次观看 • 2 年前

Announcing How Transformer LLMs Work, created with Jay Alammar and Maarten Grootendorst, co-authors of the beautifully illustrated book, “Hands-On Large Language Models.” This course offers a deep dive into the inner workings of the transformer architecture that powers large language models (LLMs). The transformer architecture revolutionized generative AI; in fact, the "GPT" in ChatGPT stands for "Generative Pre-Trained Transformer." Originally introduced in the Google Brain team's groundbreaking 2017 paper "Attention Is All You Need," by Vaswani and others, transformers were a highly scalable model for machine translation tasks. Variants of this architecture now power today’s LLMs such as those from OpenAI, Google, Meta, Cohere, Anthropic and DeepSeek. In this course, you’ll learn in detail how LLMs process text. You'll also work through code examples that illustrate that transformer's individual components. In details, you’ll learn: - How the representation of language has evolved, from Bag-of-Words to Word2Vec embeddings to the transformer architecture that captures a word's meanings taking into account the context of other words in the input. - How inputs are broken down into tokens before they are sent to the language model. - The details of a transformer's main stages: Tokenization and embedding, the stack of transformer blocks, and the language model head. - The inner workings of the transformer block, including attention, which calculates relevance scores, and the feedforward layer, which incorporates stored information learned in training. - How cached calculations make transformers faster. - Some of the most recent ideas in the latest models such as Mixture-of-Experts (MoE) which uses multiple sub-models and a router on each layer to improve the quality of LLMs. By the end of this course, you’ll have a deep understanding of how LLMs actually process text and be able to read through papers describing the latest models and understand the details. Gaining this intuition will improve your approach to building LLM applications. Please sign up here:

Announcing How Transformer LLMs Work, created with Jay Alammar and Maarten Grootendorst, co-authors of the beautifully illustrated book, “Hands-On Large Language Models.” This course offers a deep dive into the inner workings of the transformer architecture that powers large language models (LLMs). The transformer architecture revolutionized generative AI; in fact, the "GPT" in ChatGPT stands for "Generative Pre-Trained Transformer." Originally introduced in the Google Brain team's groundbreaking 2017 paper "Attention Is All You Need," by Vaswani and others, transformers were a highly scalable model for machine translation tasks. Variants of this architecture now power today’s LLMs such as those from OpenAI, Google, Meta, Cohere, Anthropic and DeepSeek. In this course, you’ll learn in detail how LLMs process text. You'll also work through code examples that illustrate that transformer's individual components. In details, you’ll learn: - How the representation of language has evolved, from Bag-of-Words to Word2Vec embeddings to the transformer architecture that captures a word's meanings taking into account the context of other words in the input. - How inputs are broken down into tokens before they are sent to the language model. - The details of a transformer's main stages: Tokenization and embedding, the stack of transformer blocks, and the language model head. - The inner workings of the transformer block, including attention, which calculates relevance scores, and the feedforward layer, which incorporates stored information learned in training. - How cached calculations make transformers faster. - Some of the most recent ideas in the latest models such as Mixture-of-Experts (MoE) which uses multiple sub-models and a router on each layer to improve the quality of LLMs. By the end of this course, you’ll have a deep understanding of how LLMs actually process text and be able to read through papers describing the latest models and understand the details. Gaining this intuition will improve your approach to building LLM applications. Please sign up here:

Andrew Ng

259,421 次观看 • 1 年前

Learn to train an LLM with distributed data while ensuring privacy using federated learning in a new two-part short course, Intro to Federated Learning and Federated Fine-tuning of LLMs with Private Data, created with Flower and taught by Daniel J. Beutel and nic lane. Federated learning allows a single model to be trained across multiple devices, such as phones, or multiple organizations, such as hospitals, without the need to share data to a central server. This two-part course gives you an introduction to federated learning, and then teaches you how to fine-tune your large language model with distributed data using Flower Lab’s open source federated learning framework. You’ll learn: - How to use federated learning to train a variety of models, ranging from speech and vision models to LLMs, across distributed data while offering data privacy options to users and organizations. - Privacy Enhancing Technologies like differential privacy (DP), which obscures individual data by adding calibrated noise to query results. - Two variants of differential privacy - Central and Local - and how to choose depending on your use case. - How to measure and decrease bandwidth usage to make federated learning more practical and efficient with techniques like using pre-trained models and Parameter-Efficient Fine-Tuning - How federated LLM fine-tuning reduces the risk of leaking training data. Sign up here!

Learn to train an LLM with distributed data while ensuring privacy using federated learning in a new two-part short course, Intro to Federated Learning and Federated Fine-tuning of LLMs with Private Data, created with Flower and taught by Daniel J. Beutel and nic lane. Federated learning allows a single model to be trained across multiple devices, such as phones, or multiple organizations, such as hospitals, without the need to share data to a central server. This two-part course gives you an introduction to federated learning, and then teaches you how to fine-tune your large language model with distributed data using Flower Lab’s open source federated learning framework. You’ll learn: - How to use federated learning to train a variety of models, ranging from speech and vision models to LLMs, across distributed data while offering data privacy options to users and organizations. - Privacy Enhancing Technologies like differential privacy (DP), which obscures individual data by adding calibrated noise to query results. - Two variants of differential privacy - Central and Local - and how to choose depending on your use case. - How to measure and decrease bandwidth usage to make federated learning more practical and efficient with techniques like using pre-trained models and Parameter-Efficient Fine-Tuning - How federated LLM fine-tuning reduces the risk of leaking training data. Sign up here!

Andrew Ng

64,558 次观看 • 2 年前

New short course on Reinforcement Learning from Human Feedback! RLHF is one of the key techniques that led to the rise of modern LLMs. It is used to align LLMs with human preferences, to make them more honest, helpful and harmless, by (i) learning a reward function that mimics human preferences, as expressed in human-provided labels, then, (ii) tuning an LLM to generate outputs that receive a high reward. In this course, taught by Nikita Namjoshi, Developer Advocate for GenAI at Google Cloud, you'll learn the details of how RLHF works, including how to apply it to tune an LLM for your own applications. You'll also use an open source library to tune a base LLM to align with human preferences expressed in a training set, and evaluate the tuned model by comparing its responses before and after RLHF-tuning. Please sign up here!

New short course on Reinforcement Learning from Human Feedback! RLHF is one of the key techniques that led to the rise of modern LLMs. It is used to align LLMs with human preferences, to make them more honest, helpful and harmless, by (i) learning a reward function that mimics human preferences, as expressed in human-provided labels, then, (ii) tuning an LLM to generate outputs that receive a high reward. In this course, taught by Nikita Namjoshi, Developer Advocate for GenAI at Google Cloud, you'll learn the details of how RLHF works, including how to apply it to tune an LLM for your own applications. You'll also use an open source library to tune a base LLM to align with human preferences expressed in a training set, and evaluate the tuned model by comparing its responses before and after RLHF-tuning. Please sign up here!

Andrew Ng

205,542 次观看 • 2 年前

An exciting new course: Fine-tuning and Reinforcement Learning for LLMs: Intro to Post-training, taught by Sharon Zhou, VP of AI at AMD. Available now at Post-training is the key technique used by frontier labs to turn a base LLM--a model trained on massive unlabeled text to predict the next word/token--into a helpful, reliable assistant that can follow instructions. I've also seen many applications where post-training is what turns a demo application that works only 80% of the time into a reliable system that consistently performs. This course will teach you the most important post-training techniques! In this 5 module course, Sharon walks you through the complete post-training pipeline: supervised fine-tuning, reward modeling, RLHF, and techniques like PPO and GRPO. You'll also learn to use LoRA for efficient training, and to design evals that catch problems before and after deployment. Skills you'll gain: - Apply supervised fine-tuning and reinforcement learning (RLHF, PPO, GRPO) to align models to desired behaviors - Use LoRA for efficient fine-tuning without retraining entire models - Prepare datasets and generate synthetic data for post-training - Understand how to operate LLM production pipelines, with go/no-go decision points and feedback loops These advanced methods aren’t limited to frontier AI labs anymore, and you can now use them in your own applications. Learn here:

An exciting new course: Fine-tuning and Reinforcement Learning for LLMs: Intro to Post-training, taught by Sharon Zhou, VP of AI at AMD. Available now at Post-training is the key technique used by frontier labs to turn a base LLM--a model trained on massive unlabeled text to predict the next word/token--into a helpful, reliable assistant that can follow instructions. I've also seen many applications where post-training is what turns a demo application that works only 80% of the time into a reliable system that consistently performs. This course will teach you the most important post-training techniques! In this 5 module course, Sharon walks you through the complete post-training pipeline: supervised fine-tuning, reward modeling, RLHF, and techniques like PPO and GRPO. You'll also learn to use LoRA for efficient training, and to design evals that catch problems before and after deployment. Skills you'll gain: - Apply supervised fine-tuning and reinforcement learning (RLHF, PPO, GRPO) to align models to desired behaviors - Use LoRA for efficient fine-tuning without retraining entire models - Prepare datasets and generate synthetic data for post-training - Understand how to operate LLM production pipelines, with go/no-go decision points and feedback loops These advanced methods aren’t limited to frontier AI labs anymore, and you can now use them in your own applications. Learn here:

Andrew Ng

132,304 次观看 • 8 个月前

Many AI-savvy programmers are now coding very differently than before, by using LLMs to help with their work. You’ll learn these emerging best practices in “Pair Programming with a Large Language Model,” by Google's Laurence Moroney 🇺🇸🇮🇪 🏴󠁧󠁢󠁷󠁬󠁳󠁿. This short course covers using LLMs to simplify and improve your code, assist with debugging, and minimize technical debt by having AI document your code. The use of LLMs as a programming companion is an important shift that's well worth every developer staying on top of. Please check this out!

Many AI-savvy programmers are now coding very differently than before, by using LLMs to help with their work. You’ll learn these emerging best practices in “Pair Programming with a Large Language Model,” by Google's Laurence Moroney 🇺🇸🇮🇪 🏴󠁧󠁢󠁷󠁬󠁳󠁿. This short course covers using LLMs to simplify and improve your code, assist with debugging, and minimize technical debt by having AI document your code. The use of LLMs as a programming companion is an important shift that's well worth every developer staying on top of. Please check this out!

Andrew Ng

373,486 次观看 • 2 年前

New Course: Reinforcement Fine-Tuning LLMs with GRPO! Learn to use reinforcement learning to improve your LLM performance in this short course, built in collaboration with Predibase by Rubrik, and taught by Travis Addair, its Co-Founder and CTO, and Arnav Garg, its Senior Engineer and Machine Learning Lead. Reasoning models have been one of the most important developments in LLMs. Reinforcement Fine-Tuning (RFT) uses rewards to encourage LLMs to find solutions to multi-step reasoning tasks such as solving math problems and debugging code - without needing pre-existing training examples like in traditional supervised fine-tuning. Group Relative Policy Optimization (GRPO) is a reinforcement fine-tuning algorithm gaining rapid adoption. Developed by the DeepSeek team and used to train the R1 reasoning model, GRPO uses reward functions that you can write in Python to assign rewards to model responses. It’s beneficial for tasks with verifiable outcomes and can work well even with fewer than 100 training examples. It can also significantly improve the reasoning ability of smaller LLMs, making applications faster and more cost effective. In this course, you’ll take a technical deep dive into RFT with GRPO. You’ll learn to build reward functions that you can use in the GRPO training process to guide an LLM toward better performance on multi-step reasoning tasks. In detail, you’ll: - Learn when reinforcement fine-tuning is a better fit than supervised fine-tuning, especially for tasks involving multi-step reasoning or limited labeled data. - Understand how GRPO uses programmable reward functions as a more scalable alternative to the human feedback required for other reinforcement learning algorithms, such as RLHF and DPO. - Frame the Wordle game as a reinforcement fine-tuning problem and see how an LLM can learn to plan, analyze feedback, and improve its strategy over time. - Design reward functions that power the reinforcement fine-tuning process. - Learn techniques for evaluating more subjective tasks, such as rating the quality of a text summary, using an LLM as a judge. - Understand why reward hacking happens and how to avoid it by adding penalty functions to discourage undesirable behaviors. - Learn the four key components of the loss calculation in the GRPO algorithm: token probability distribution ratios, advantages, clipping, and KL-divergence. - Launch reinforcement fine-tuning jobs using Predibase’s hosted training services. By the end of this course, you’ll be able to build and fine-tune LLMs using reinforcement learning to improve reasoning without relying on large labeled datasets or subjective human feedback. Please sign up here:

New Course: Reinforcement Fine-Tuning LLMs with GRPO! Learn to use reinforcement learning to improve your LLM performance in this short course, built in collaboration with Predibase by Rubrik, and taught by Travis Addair, its Co-Founder and CTO, and Arnav Garg, its Senior Engineer and Machine Learning Lead. Reasoning models have been one of the most important developments in LLMs. Reinforcement Fine-Tuning (RFT) uses rewards to encourage LLMs to find solutions to multi-step reasoning tasks such as solving math problems and debugging code - without needing pre-existing training examples like in traditional supervised fine-tuning. Group Relative Policy Optimization (GRPO) is a reinforcement fine-tuning algorithm gaining rapid adoption. Developed by the DeepSeek team and used to train the R1 reasoning model, GRPO uses reward functions that you can write in Python to assign rewards to model responses. It’s beneficial for tasks with verifiable outcomes and can work well even with fewer than 100 training examples. It can also significantly improve the reasoning ability of smaller LLMs, making applications faster and more cost effective. In this course, you’ll take a technical deep dive into RFT with GRPO. You’ll learn to build reward functions that you can use in the GRPO training process to guide an LLM toward better performance on multi-step reasoning tasks. In detail, you’ll: - Learn when reinforcement fine-tuning is a better fit than supervised fine-tuning, especially for tasks involving multi-step reasoning or limited labeled data. - Understand how GRPO uses programmable reward functions as a more scalable alternative to the human feedback required for other reinforcement learning algorithms, such as RLHF and DPO. - Frame the Wordle game as a reinforcement fine-tuning problem and see how an LLM can learn to plan, analyze feedback, and improve its strategy over time. - Design reward functions that power the reinforcement fine-tuning process. - Learn techniques for evaluating more subjective tasks, such as rating the quality of a text summary, using an LLM as a judge. - Understand why reward hacking happens and how to avoid it by adding penalty functions to discourage undesirable behaviors. - Learn the four key components of the loss calculation in the GRPO algorithm: token probability distribution ratios, advantages, clipping, and KL-divergence. - Launch reinforcement fine-tuning jobs using Predibase’s hosted training services. By the end of this course, you’ll be able to build and fine-tune LLMs using reinforcement learning to improve reasoning without relying on large labeled datasets or subjective human feedback. Please sign up here:

Andrew Ng

86,457 次观看 • 1 年前

New short course: Attention in Transformers: Concepts and Code in PyTorch. Last week we released a course on how LLM transformers work. This week, go deeper and learn about the technical ideas behind the attention mechanism, and see how to code it in PyTorch. This course is built with Joshua Starmer, Founder and CEO of StatQuest. The attention mechanism was a breakthrough that led to transformers, the architecture powering large language models like ChatGPT. Transformers, introduced in the 2017 paper: "Attention is All You Need" by Viswani and others, took off because of its highly scalable design. In this course, you’ll learn how the attention mechanism, a key element of transformer-based LLMs, works and implement it in PyTorch. You'll develop deep intuition about building reliable, functional, and scalable AI applications. What you will do: - Understand the evolution of the attention mechanism, a key breakthrough that led to transformers. - Learn the relationships between word embeddings, positional embeddings, and attention. - Learn about the Query, Key, and Value matrices, and how to produce and use them in attention. - Walk through the math required to calculate self-attention and masked self-attention to learn why and how they work. - Understand the difference between self-attention and masked self-attention and how one is used in the encoder to build context-aware embeddings and the other is used in the decoder for generative outputs. - Learn the details of the encoder-decoder architecture, cross-attention, and multi-head attention and how they are all incorporated into a transformer. - Use PyTorch to code a class that implements self-attention, masked self-attention, and multi-head attention. There're lots of exciting technical details in this course. Please sign up here:

New short course: Attention in Transformers: Concepts and Code in PyTorch. Last week we released a course on how LLM transformers work. This week, go deeper and learn about the technical ideas behind the attention mechanism, and see how to code it in PyTorch. This course is built with Joshua Starmer, Founder and CEO of StatQuest. The attention mechanism was a breakthrough that led to transformers, the architecture powering large language models like ChatGPT. Transformers, introduced in the 2017 paper: "Attention is All You Need" by Viswani and others, took off because of its highly scalable design. In this course, you’ll learn how the attention mechanism, a key element of transformer-based LLMs, works and implement it in PyTorch. You'll develop deep intuition about building reliable, functional, and scalable AI applications. What you will do: - Understand the evolution of the attention mechanism, a key breakthrough that led to transformers. - Learn the relationships between word embeddings, positional embeddings, and attention. - Learn about the Query, Key, and Value matrices, and how to produce and use them in attention. - Walk through the math required to calculate self-attention and masked self-attention to learn why and how they work. - Understand the difference between self-attention and masked self-attention and how one is used in the encoder to build context-aware embeddings and the other is used in the decoder for generative outputs. - Learn the details of the encoder-decoder architecture, cross-attention, and multi-head attention and how they are all incorporated into a transformer. - Use PyTorch to code a class that implements self-attention, masked self-attention, and multi-head attention. There're lots of exciting technical details in this course. Please sign up here:

Andrew Ng

132,220 次观看 • 1 年前

New short course: Prompt Engineering with Llama 2, built in collaboration with Meta AI at Meta, and taught by Amit Sangani! Meta's Llama 2 has been game-changing for AI. Building with open source lets you control your own data, scrutinize errors, update (or not) the models as you please, and work alongside the global community advancing open models. Llama isn't a single model, it's a collection of models. In this course, you'll: - Learn the differences between different Llama 2 flavors, and when to use each. - Prompt the Llama chat models -- you'll also see how Llama's instruction tags work -- so they can help you with day-to-day tasks, like writing or summarization. - Use advanced prompting, like few-shot prompting for classification, and chain-of-thought prompting for solving logic problems. - Use specialized models in the Llama collection for specific tasks, like Code Llama to help you write, analyze, and improve code, and Llama Guard, which checks prompts and model responses for harmful content. The course also touches on how to run Llama 2 locally on your own computer. I hope you’ll take this course and try out these powerful, open models!

New short course: Prompt Engineering with Llama 2, built in collaboration with Meta AI at Meta, and taught by Amit Sangani! Meta's Llama 2 has been game-changing for AI. Building with open source lets you control your own data, scrutinize errors, update (or not) the models as you please, and work alongside the global community advancing open models. Llama isn't a single model, it's a collection of models. In this course, you'll: - Learn the differences between different Llama 2 flavors, and when to use each. - Prompt the Llama chat models -- you'll also see how Llama's instruction tags work -- so they can help you with day-to-day tasks, like writing or summarization. - Use advanced prompting, like few-shot prompting for classification, and chain-of-thought prompting for solving logic problems. - Use specialized models in the Llama collection for specific tasks, like Code Llama to help you write, analyze, and improve code, and Llama Guard, which checks prompts and model responses for harmful content. The course also touches on how to run Llama 2 locally on your own computer. I hope you’ll take this course and try out these powerful, open models!

Andrew Ng

162,842 次观看 • 2 年前

Reasoning LLMs Guide [Full Video - Unedited] 1 hr talk on reasoning LLMs and how to best use them for different applications. Share with your devs & students. I discuss lots of fun ideas like meta-prompting, LLM-as-a-Judge, use cases, prompting tips, and much more.

Reasoning LLMs Guide [Full Video - Unedited] 1 hr talk on reasoning LLMs and how to best use them for different applications. Share with your devs & students. I discuss lots of fun ideas like meta-prompting, LLM-as-a-Judge, use cases, prompting tips, and much more.

elvis

61,360 次观看 • 1 年前

Function calling is a powerful way to extend the capabilities of LLMs and AI agents by letting them use external tools. Our new short course Function calling and Data Extraction with LLMs, created with @NexusflowX and taught by Jiantao Jiao and Venkat, demonstrates how to prompt LLMs to form calls to external functions. You'll work with NexusRavenV2-13B, a 13B parameter open-source model that excels in function calling tasks while still being small enough to host locally. Learn to use function calling to extract structured data from unstructured text and access web APIs, and build an end-to-end application that processes customer service transcripts. You'll learn how to build LLM-powered applications that can analyze feedback, automate data entry, and enhance search. Please get started here:

Function calling is a powerful way to extend the capabilities of LLMs and AI agents by letting them use external tools. Our new short course Function calling and Data Extraction with LLMs, created with @NexusflowX and taught by Jiantao Jiao and Venkat, demonstrates how to prompt LLMs to form calls to external functions. You'll work with NexusRavenV2-13B, a 13B parameter open-source model that excels in function calling tasks while still being small enough to host locally. Learn to use function calling to extract structured data from unstructured text and access web APIs, and build an end-to-end application that processes customer service transcripts. You'll learn how to build LLM-powered applications that can analyze feedback, automate data entry, and enhance search. Please get started here:

Andrew Ng

110,606 次观看 • 2 年前

LangChain: Chat with Your Data, a new free short course created with Harrison Chase, is now available! In this 1 hour course, you’ll learn how to build one of the most requested LLM-based applications: Answering questions using information from a document or collection of documents (often called Retrieval Augmented Generation). You'll also learn how to use vector stores and embeddings to retrieve document chunks relevant to a query. I hope you enjoy the course!

LangChain: Chat with Your Data, a new free short course created with Harrison Chase, is now available! In this 1 hour course, you’ll learn how to build one of the most requested LLM-based applications: Answering questions using information from a document or collection of documents (often called Retrieval Augmented Generation). You'll also learn how to use vector stores and embeddings to retrieve document chunks relevant to a query. I hope you enjoy the course!

Andrew Ng

384,282 次观看 • 3 年前

LLMs can make sense of retrieved context because of how transformers work. In one of the lessons from the Retrieval Augmented Generation (RAG) course, we unpack how LLMs process augmented prompts using token embeddings, positional vectors, and multi-head attention. Understanding these internals helps you design more reliable and efficient RAG systems. Watch the breakdown and keep learning how to build production-ready RAG systems in this course, taught by Zain:

LLMs can make sense of retrieved context because of how transformers work. In one of the lessons from the Retrieval Augmented Generation (RAG) course, we unpack how LLMs process augmented prompts using token embeddings, positional vectors, and multi-head attention. Understanding these internals helps you design more reliable and efficient RAG systems. Watch the breakdown and keep learning how to build production-ready RAG systems in this course, taught by Zain:

DeepLearning.AI

11,500 次观看 • 11 个月前

The future of AI is open-source. And ollama is the easiest way to build AI applications with open-source LLMs. Here's how to build a free, private RAG app using open-source tools. We'll use: - Ollama for LLMs and embedding models - PostgreSQL for data storage and retrieval - pgai Vectorizer for embedding creation and sync (I use Nomic for embeddings and tinnyllama as my LLM but you can substitute them for any models on Ollama)

The future of AI is open-source. And ollama is the easiest way to build AI applications with open-source LLMs. Here's how to build a free, private RAG app using open-source tools. We'll use: - Ollama for LLMs and embedding models - PostgreSQL for data storage and retrieval - pgai Vectorizer for embedding creation and sync (I use Nomic for embeddings and tinnyllama as my LLM but you can substitute them for any models on Ollama)

Avthar

34,261 次观看 • 1 年前

Our course recommendation of the day is “Post-training of LLMs, ” where you’ll learn how to customize pre-trained language models using Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (RL). You'll learn when to use each method, how to curate training data, and implement them in code to shape model behavior effectively. Enroll here:

Our course recommendation of the day is “Post-training of LLMs, ” where you’ll learn how to customize pre-trained language models using Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (RL). You'll learn when to use each method, how to curate training data, and implement them in code to shape model behavior effectively. Enroll here:

DeepLearning.AI

29,369 次观看 • 9 个月前