Loading video...

Video Failed to Load

There was a problem loading this video. This could be due to a temporary network issue or the video might be unavailable.

Learn a development pattern to systematically improve the accuracy and reliability of LLM applications in our new short course, Improving Accuracy of LLM Applications, built in partnership with Lamini and Meta, and taught by Lamini’s CEO Sharon Zhou, and Meta’s Senior Director of Partner Engineering, Amit Sangani. (Disclosure: I... am an investor in Lamini.) The path to tuning an LLM application can be complex. In this course, you'll learn a systematic sequence of steps for improving accuracy by reducing hallucinations: - Create an evaluation dataset to measure model accuracy - Add prompt engineering and self-reflection - Fine-tune your model including "memory-tuning" which is a new method of embedding facts in an LLM Using the Llama 3-8B parameter model, you will: - Build a text-to-SQL agent with a custom schema and simulate situations where it hallucinates - Understand the difference between instruction fine-tuning, which gives pre-trained LLMs instructions to follow, and memory fine-tuning - See how Performance-Efficient Fine-tuning (PEFT) techniques like Low-Rank Adaptation (LoRA) reduce training time by 100x and Mixture of Memory Experts (MoME) reduces it even further I appreciate Meta releasing the Llama's family of open models -- this course gives an example of the unique type of work that developers can do with such models. Please sign up here:show more

Andrew Ng

1,706,342 subscribers

66,435 views • 1 year ago •via X (Twitter)

Health & Wellness Education Science & Technology

Anya Rossi• Live Now

Private livecam show

10 Comments

Ashish Raj1 year ago

@LaminiAI @Meta @realSharonZhou @AndrewYNg sir please launch a specialization on generative ai

Muhammad Junaid1 year ago

@LaminiAI @Meta @realSharonZhou just in time - best resource for next Open Source Project!

Ravi | ML Engineer1 year ago

@LaminiAI @Meta @realSharonZhou nice, can't go wrong with a course like this!

PrivateClient🟢AI1 year ago

@LaminiAI @Meta @realSharonZhou Thank you for your rapid educational developments. We are taking due notes for hard to reach places. From lab to vulnerable people accelerated. #opencourseware

Yasen362 🍥1 year ago

@LaminiAI @Meta @realSharonZhou Hi Andrew, Love your course!

Superintelligent: Useful AI Tutorials1 year ago

@LaminiAI @Meta @realSharonZhou This course sounds like a great opportunity to analyze improving LLM applications! The systematic approach and unique techniques being taught are worth exploring.

Thorsten Linz1 year ago

@LaminiAI @Meta @realSharonZhou @AndrewYNg Intriguing course on enhancing LLM reliability. Any insights shared?

GPT.Biz1 year ago

@LaminiAI @Meta @realSharonZhou This course sounds like a must if you're serious about enhancing LLM application accuracy. Highly recommend it!

AgenticApp.com1 year ago

@LaminiAI @Meta @realSharonZhou Excited for the new course on LLM accuracy! Learn from @realSharonZhou and @asangani7 about evaluation, prompt engineering, and fine-tuning to enhance your model.

naila huq1 year ago

@LaminiAI @Meta @realSharonZhou Good sql model by lama3. On point. And good peft model. I enjoyed.

Related Videos

New short course on Fine-tuning LLMs! Many developers are moving beyond only prompting, to also fine-tuning LLMs - that is, taking a pre-trained model and training it further on your own data, which can deliver superior results inexpensively. In this course, Sharon Zhou, CEO of Lamini (disclosure: I’m a minor shareholder) shows you how to recognize when fine-tuning can be help, and how to train an open-source LLM on your own data. I hope you enjoy the course!

New short course on Fine-tuning LLMs! Many developers are moving beyond only prompting, to also fine-tuning LLMs - that is, taking a pre-trained model and training it further on your own data, which can deliver superior results inexpensively. In this course, Sharon Zhou, CEO of Lamini (disclosure: I’m a minor shareholder) shows you how to recognize when fine-tuning can be help, and how to train an open-source LLM on your own data. I hope you enjoy the course!

Andrew Ng

502,821 views • 2 years ago

New short course on LLMOps! LLMOps (large language model operations) is a rapidly developing field that takes ideas from MLOps (machine learning operations) and specializes them to building and deploying LLM-based applications. In this course, taught by Google Cloud's Erwin Huizenga, you'll learn to use automation to make building, tuning and deploying an LLM-based application less manual and more efficient. You'll learn how to: - Apply supervised fine-tuning to tune an LLM to a specific task - Automate and orchestrate LLM-tuning and deployment by customizing a pre-built tuning pipeline - Apply best practices for preparing training data for supervised fine-tuning of an LLM - Create an LLMOps workflow you can adapt to other LLM-tuning jobs This course doesn't assume any prior MLOps or LLMOps experience. Sign up here to learn about this emerging field!

New short course on LLMOps! LLMOps (large language model operations) is a rapidly developing field that takes ideas from MLOps (machine learning operations) and specializes them to building and deploying LLM-based applications. In this course, taught by Google Cloud's Erwin Huizenga, you'll learn to use automation to make building, tuning and deploying an LLM-based application less manual and more efficient. You'll learn how to: - Apply supervised fine-tuning to tune an LLM to a specific task - Automate and orchestrate LLM-tuning and deployment by customizing a pre-built tuning pipeline - Apply best practices for preparing training data for supervised fine-tuning of an LLM - Create an LLMOps workflow you can adapt to other LLM-tuning jobs This course doesn't assume any prior MLOps or LLMOps experience. Sign up here to learn about this emerging field!

Andrew Ng

221,787 views • 2 years ago

An exciting new course: Fine-tuning and Reinforcement Learning for LLMs: Intro to Post-training, taught by Sharon Zhou, VP of AI at AMD. Available now at Post-training is the key technique used by frontier labs to turn a base LLM--a model trained on massive unlabeled text to predict the next word/token--into a helpful, reliable assistant that can follow instructions. I've also seen many applications where post-training is what turns a demo application that works only 80% of the time into a reliable system that consistently performs. This course will teach you the most important post-training techniques! In this 5 module course, Sharon walks you through the complete post-training pipeline: supervised fine-tuning, reward modeling, RLHF, and techniques like PPO and GRPO. You'll also learn to use LoRA for efficient training, and to design evals that catch problems before and after deployment. Skills you'll gain: - Apply supervised fine-tuning and reinforcement learning (RLHF, PPO, GRPO) to align models to desired behaviors - Use LoRA for efficient fine-tuning without retraining entire models - Prepare datasets and generate synthetic data for post-training - Understand how to operate LLM production pipelines, with go/no-go decision points and feedback loops These advanced methods aren’t limited to frontier AI labs anymore, and you can now use them in your own applications. Learn here:

An exciting new course: Fine-tuning and Reinforcement Learning for LLMs: Intro to Post-training, taught by Sharon Zhou, VP of AI at AMD. Available now at Post-training is the key technique used by frontier labs to turn a base LLM--a model trained on massive unlabeled text to predict the next word/token--into a helpful, reliable assistant that can follow instructions. I've also seen many applications where post-training is what turns a demo application that works only 80% of the time into a reliable system that consistently performs. This course will teach you the most important post-training techniques! In this 5 module course, Sharon walks you through the complete post-training pipeline: supervised fine-tuning, reward modeling, RLHF, and techniques like PPO and GRPO. You'll also learn to use LoRA for efficient training, and to design evals that catch problems before and after deployment. Skills you'll gain: - Apply supervised fine-tuning and reinforcement learning (RLHF, PPO, GRPO) to align models to desired behaviors - Use LoRA for efficient fine-tuning without retraining entire models - Prepare datasets and generate synthetic data for post-training - Understand how to operate LLM production pipelines, with go/no-go decision points and feedback loops These advanced methods aren’t limited to frontier AI labs anymore, and you can now use them in your own applications. Learn here:

Andrew Ng

132,304 views • 8 months ago

New Course: Reinforcement Fine-Tuning LLMs with GRPO! Learn to use reinforcement learning to improve your LLM performance in this short course, built in collaboration with Predibase by Rubrik, and taught by Travis Addair, its Co-Founder and CTO, and Arnav Garg, its Senior Engineer and Machine Learning Lead. Reasoning models have been one of the most important developments in LLMs. Reinforcement Fine-Tuning (RFT) uses rewards to encourage LLMs to find solutions to multi-step reasoning tasks such as solving math problems and debugging code - without needing pre-existing training examples like in traditional supervised fine-tuning. Group Relative Policy Optimization (GRPO) is a reinforcement fine-tuning algorithm gaining rapid adoption. Developed by the DeepSeek team and used to train the R1 reasoning model, GRPO uses reward functions that you can write in Python to assign rewards to model responses. It’s beneficial for tasks with verifiable outcomes and can work well even with fewer than 100 training examples. It can also significantly improve the reasoning ability of smaller LLMs, making applications faster and more cost effective. In this course, you’ll take a technical deep dive into RFT with GRPO. You’ll learn to build reward functions that you can use in the GRPO training process to guide an LLM toward better performance on multi-step reasoning tasks. In detail, you’ll: - Learn when reinforcement fine-tuning is a better fit than supervised fine-tuning, especially for tasks involving multi-step reasoning or limited labeled data. - Understand how GRPO uses programmable reward functions as a more scalable alternative to the human feedback required for other reinforcement learning algorithms, such as RLHF and DPO. - Frame the Wordle game as a reinforcement fine-tuning problem and see how an LLM can learn to plan, analyze feedback, and improve its strategy over time. - Design reward functions that power the reinforcement fine-tuning process. - Learn techniques for evaluating more subjective tasks, such as rating the quality of a text summary, using an LLM as a judge. - Understand why reward hacking happens and how to avoid it by adding penalty functions to discourage undesirable behaviors. - Learn the four key components of the loss calculation in the GRPO algorithm: token probability distribution ratios, advantages, clipping, and KL-divergence. - Launch reinforcement fine-tuning jobs using Predibase’s hosted training services. By the end of this course, you’ll be able to build and fine-tune LLMs using reinforcement learning to improve reasoning without relying on large labeled datasets or subjective human feedback. Please sign up here:

New Course: Reinforcement Fine-Tuning LLMs with GRPO! Learn to use reinforcement learning to improve your LLM performance in this short course, built in collaboration with Predibase by Rubrik, and taught by Travis Addair, its Co-Founder and CTO, and Arnav Garg, its Senior Engineer and Machine Learning Lead. Reasoning models have been one of the most important developments in LLMs. Reinforcement Fine-Tuning (RFT) uses rewards to encourage LLMs to find solutions to multi-step reasoning tasks such as solving math problems and debugging code - without needing pre-existing training examples like in traditional supervised fine-tuning. Group Relative Policy Optimization (GRPO) is a reinforcement fine-tuning algorithm gaining rapid adoption. Developed by the DeepSeek team and used to train the R1 reasoning model, GRPO uses reward functions that you can write in Python to assign rewards to model responses. It’s beneficial for tasks with verifiable outcomes and can work well even with fewer than 100 training examples. It can also significantly improve the reasoning ability of smaller LLMs, making applications faster and more cost effective. In this course, you’ll take a technical deep dive into RFT with GRPO. You’ll learn to build reward functions that you can use in the GRPO training process to guide an LLM toward better performance on multi-step reasoning tasks. In detail, you’ll: - Learn when reinforcement fine-tuning is a better fit than supervised fine-tuning, especially for tasks involving multi-step reasoning or limited labeled data. - Understand how GRPO uses programmable reward functions as a more scalable alternative to the human feedback required for other reinforcement learning algorithms, such as RLHF and DPO. - Frame the Wordle game as a reinforcement fine-tuning problem and see how an LLM can learn to plan, analyze feedback, and improve its strategy over time. - Design reward functions that power the reinforcement fine-tuning process. - Learn techniques for evaluating more subjective tasks, such as rating the quality of a text summary, using an LLM as a judge. - Understand why reward hacking happens and how to avoid it by adding penalty functions to discourage undesirable behaviors. - Learn the four key components of the loss calculation in the GRPO algorithm: token probability distribution ratios, advantages, clipping, and KL-divergence. - Launch reinforcement fine-tuning jobs using Predibase’s hosted training services. By the end of this course, you’ll be able to build and fine-tune LLMs using reinforcement learning to improve reasoning without relying on large labeled datasets or subjective human feedback. Please sign up here:

Andrew Ng

86,457 views • 1 year ago

Learn to train an LLM with distributed data while ensuring privacy using federated learning in a new two-part short course, Intro to Federated Learning and Federated Fine-tuning of LLMs with Private Data, created with Flower and taught by Daniel J. Beutel and nic lane. Federated learning allows a single model to be trained across multiple devices, such as phones, or multiple organizations, such as hospitals, without the need to share data to a central server. This two-part course gives you an introduction to federated learning, and then teaches you how to fine-tune your large language model with distributed data using Flower Lab’s open source federated learning framework. You’ll learn: - How to use federated learning to train a variety of models, ranging from speech and vision models to LLMs, across distributed data while offering data privacy options to users and organizations. - Privacy Enhancing Technologies like differential privacy (DP), which obscures individual data by adding calibrated noise to query results. - Two variants of differential privacy - Central and Local - and how to choose depending on your use case. - How to measure and decrease bandwidth usage to make federated learning more practical and efficient with techniques like using pre-trained models and Parameter-Efficient Fine-Tuning - How federated LLM fine-tuning reduces the risk of leaking training data. Sign up here!

Learn to train an LLM with distributed data while ensuring privacy using federated learning in a new two-part short course, Intro to Federated Learning and Federated Fine-tuning of LLMs with Private Data, created with Flower and taught by Daniel J. Beutel and nic lane. Federated learning allows a single model to be trained across multiple devices, such as phones, or multiple organizations, such as hospitals, without the need to share data to a central server. This two-part course gives you an introduction to federated learning, and then teaches you how to fine-tune your large language model with distributed data using Flower Lab’s open source federated learning framework. You’ll learn: - How to use federated learning to train a variety of models, ranging from speech and vision models to LLMs, across distributed data while offering data privacy options to users and organizations. - Privacy Enhancing Technologies like differential privacy (DP), which obscures individual data by adding calibrated noise to query results. - Two variants of differential privacy - Central and Local - and how to choose depending on your use case. - How to measure and decrease bandwidth usage to make federated learning more practical and efficient with techniques like using pre-trained models and Parameter-Efficient Fine-Tuning - How federated LLM fine-tuning reduces the risk of leaking training data. Sign up here!

Andrew Ng

64,558 views • 2 years ago

New short course on Reinforcement Learning from Human Feedback! RLHF is one of the key techniques that led to the rise of modern LLMs. It is used to align LLMs with human preferences, to make them more honest, helpful and harmless, by (i) learning a reward function that mimics human preferences, as expressed in human-provided labels, then, (ii) tuning an LLM to generate outputs that receive a high reward. In this course, taught by Nikita Namjoshi, Developer Advocate for GenAI at Google Cloud, you'll learn the details of how RLHF works, including how to apply it to tune an LLM for your own applications. You'll also use an open source library to tune a base LLM to align with human preferences expressed in a training set, and evaluate the tuned model by comparing its responses before and after RLHF-tuning. Please sign up here!

New short course on Reinforcement Learning from Human Feedback! RLHF is one of the key techniques that led to the rise of modern LLMs. It is used to align LLMs with human preferences, to make them more honest, helpful and harmless, by (i) learning a reward function that mimics human preferences, as expressed in human-provided labels, then, (ii) tuning an LLM to generate outputs that receive a high reward. In this course, taught by Nikita Namjoshi, Developer Advocate for GenAI at Google Cloud, you'll learn the details of how RLHF works, including how to apply it to tune an LLM for your own applications. You'll also use an open source library to tune a base LLM to align with human preferences expressed in a training set, and evaluate the tuned model by comparing its responses before and after RLHF-tuning. Please sign up here!

Andrew Ng

205,537 views • 2 years ago

Learn how to build an optimized LLM inference system from the ground up in our new short course, Efficiently Serving LLMs, built in collaboration with Predibase by Rubrik and taught by Travis Addair. Whether you're serving your own LLM or using a model hosting service, this course will give you a deep understanding of the optimizations required to efficiently serve many users at once. - Learn how LLMs generate text one token at a time, and how techniques like KV caching, continuous batching, and quantization speed things up and optimize memory usage for serving multiple users. - Benchmark the performance of these LLM optimizations to explore the trade-offs between quickly responding to an individual user’s request vs. serving many users at once. - Use techniques like low-rank adaptation (LoRA) to efficiently serve hundreds of unique, custom fine-tuned models on a single device, without sacrificing throughput. - Use Predibase's LoRAX framework to see optimization techniques in action on a real LLM server. Sign up here:

Learn how to build an optimized LLM inference system from the ground up in our new short course, Efficiently Serving LLMs, built in collaboration with Predibase by Rubrik and taught by Travis Addair. Whether you're serving your own LLM or using a model hosting service, this course will give you a deep understanding of the optimizations required to efficiently serve many users at once. - Learn how LLMs generate text one token at a time, and how techniques like KV caching, continuous batching, and quantization speed things up and optimize memory usage for serving multiple users. - Benchmark the performance of these LLM optimizations to explore the trade-offs between quickly responding to an individual user’s request vs. serving many users at once. - Use techniques like low-rank adaptation (LoRA) to efficiently serve hundreds of unique, custom fine-tuned models on a single device, without sacrificing throughput. - Use Predibase's LoRAX framework to see optimization techniques in action on a real LLM server. Sign up here:

Andrew Ng

104,727 views • 2 years ago

"Introducing Multimodal Llama 3.2": As promised two weeks ago, here's the short course on Meta's latest open model! This short course is created with Meta and taught by Amit Sangani, Director of AI Partner Engineering at Meta. Meta’s Llama family of models is leading the way in open models, allowing anyone to download, customize, fine-tune, or build new applications on top of them. Learn about the vision capabilities of the Llama 3.2, and use it for image classification, prompting, tokenization, tool-calling. You'll also learn about the open-source Llama stack, which gives building blocks for many different stages of the LLM application life cycle. In detail, you’ll: - Learn what are the features of Meta's four newest models, and when to use which Llama model. - Learn best practices for multimodal prompting, with applications to advanced image reasoning, illustrated by many examples: Understanding errors on a car dashboard, adding up the total of photographed restaurant receipts, grading written math homework. - Use different roles—system, user, assistant, ipython—in the Llama 3.1 and 3.2 models and the prompt format that identifies those roles. - Understand how Llama uses the tiktoken tokenizer, and how it has expanded to a 128k vocabulary size that improves encoding efficiency and multilingual support. - Learn how to prompt Llama to call built-in and custom tools (functions) with examples for web search and solving math equations. - Learn about Llama Stack, a standardized interface for common toolchain components like fine-tuning or synthetic data generation, useful for building agentic applications. By the end of this course, you’ll be equipped to build out new applications with the new Llama 3.2. Thank you to Ahmad Al-Dahle, Amit Sangani, and the whole AI at Meta team AI at Meta for all the hard work on Llama 3.2 — we’re excited to make these open models even more accessible to more developers with this new course! Please sign up here!

"Introducing Multimodal Llama 3.2": As promised two weeks ago, here's the short course on Meta's latest open model! This short course is created with Meta and taught by Amit Sangani, Director of AI Partner Engineering at Meta. Meta’s Llama family of models is leading the way in open models, allowing anyone to download, customize, fine-tune, or build new applications on top of them. Learn about the vision capabilities of the Llama 3.2, and use it for image classification, prompting, tokenization, tool-calling. You'll also learn about the open-source Llama stack, which gives building blocks for many different stages of the LLM application life cycle. In detail, you’ll: - Learn what are the features of Meta's four newest models, and when to use which Llama model. - Learn best practices for multimodal prompting, with applications to advanced image reasoning, illustrated by many examples: Understanding errors on a car dashboard, adding up the total of photographed restaurant receipts, grading written math homework. - Use different roles—system, user, assistant, ipython—in the Llama 3.1 and 3.2 models and the prompt format that identifies those roles. - Understand how Llama uses the tiktoken tokenizer, and how it has expanded to a 128k vocabulary size that improves encoding efficiency and multilingual support. - Learn how to prompt Llama to call built-in and custom tools (functions) with examples for web search and solving math equations. - Learn about Llama Stack, a standardized interface for common toolchain components like fine-tuning or synthetic data generation, useful for building agentic applications. By the end of this course, you’ll be equipped to build out new applications with the new Llama 3.2. Thank you to Ahmad Al-Dahle, Amit Sangani, and the whole AI at Meta team AI at Meta for all the hard work on Llama 3.2 — we’re excited to make these open models even more accessible to more developers with this new course! Please sign up here!

Andrew Ng

131,767 views • 1 year ago

New short course: DSPy: Build and Optimize Agentic Apps DSPy is a powerful open-source framework for automatically tuning prompts for GenAI applications. In this course, you'll learn to use DSPy, together with MLflow. This is built in partnership with Databricks and taught by Chen Qian, co-lead of the DSPy framework. Many AI builders spend hours hand-tuning prompts. When given a set of evals, DSPy automates this process. It’s especially useful for optimizing prompts, including few-shot prompts, in complex agentic AI workflows. Further, if you switch an application to a newer LLM, performance can degrade if your prompts were optimized to the previous model. DSPy automatically optimizes the entire system for the new LLM as well, using just a few evaluation examples. This course teaches DSPy works, and best practices for using it. You’ll write programs using DSPy’s signature-based programming model, debug them with MLflow tracing -- to gain visibility into how different parts of a pipeline, as well as how the overall system, are performing -- and automatically improve their accuracy with DSPy Optimizer. Please sign up here:

New short course: DSPy: Build and Optimize Agentic Apps DSPy is a powerful open-source framework for automatically tuning prompts for GenAI applications. In this course, you'll learn to use DSPy, together with MLflow. This is built in partnership with Databricks and taught by Chen Qian, co-lead of the DSPy framework. Many AI builders spend hours hand-tuning prompts. When given a set of evals, DSPy automates this process. It’s especially useful for optimizing prompts, including few-shot prompts, in complex agentic AI workflows. Further, if you switch an application to a newer LLM, performance can degrade if your prompts were optimized to the previous model. DSPy automatically optimizes the entire system for the new LLM as well, using just a few evaluation examples. This course teaches DSPy works, and best practices for using it. You’ll write programs using DSPy’s signature-based programming model, debug them with MLflow tracing -- to gain visibility into how different parts of a pipeline, as well as how the overall system, are performing -- and automatically improve their accuracy with DSPy Optimizer. Please sign up here:

Andrew Ng

181,457 views • 1 year ago

New course on serving LLMs efficiently -- how do you serve models to many concurrent users at low latency and reasonable cost? This short course is built with Red Hat and taught by Cedric Clyburn. Efficient LLM serving requires efficient memory management. A 70B-parameter model takes ~140 GB just to load the weights. On top of that, every active request needs its own chunk of GPU memory, the KV cache, to store the token context it has built up so far. In this course, you'll learn to reduce a model's memory footprint with quantization and serve it using vLLM, which handles many concurrent requests efficiently through smart memory management. Skills you'll gain: - Quantize a model and measure the accuracy tradeoff - Serve a model with vLLM and watch it handle concurrent requests efficiently - Benchmark your deployment and make informed tradeoffs between speed, cost, and accuracy Join and learn to serve LLMs efficiently:

New course on serving LLMs efficiently -- how do you serve models to many concurrent users at low latency and reasonable cost? This short course is built with Red Hat and taught by Cedric Clyburn. Efficient LLM serving requires efficient memory management. A 70B-parameter model takes ~140 GB just to load the weights. On top of that, every active request needs its own chunk of GPU memory, the KV cache, to store the token context it has built up so far. In this course, you'll learn to reduce a model's memory footprint with quantization and serve it using vLLM, which handles many concurrent requests efficiently through smart memory management. Skills you'll gain: - Quantize a model and measure the accuracy tradeoff - Serve a model with vLLM and watch it handle concurrent requests efficiently - Benchmark your deployment and make informed tradeoffs between speed, cost, and accuracy Join and learn to serve LLMs efficiently:

Andrew Ng

124,185 views • 1 month ago

Introducing "Building with Llama 4." This short course is created with Meta AI at Meta, and taught by Amit Sangani, Director of Partner Engineering for Meta’s AI team. Meta’s new Llama 4 has added three new models and introduced the Mixture-of-Experts (MoE) architecture to its family of open-weight models, making them more efficient to serve. In this course, you’ll work with two of the three new models introduced in Llama 4. First is Maverick, a 400B parameter model, with 128 experts and 17B active parameters. Second is Scout, a 109B parameter model with 16 experts and 17B active parameters. Maverick and Scout support long context windows of up to a million tokens and 10M tokens, respectively. The latter is enough to support directly inputting even fairly large GitHub repos for analysis! In hands-on lessons, you’ll build apps using Llama 4’s new multimodal capabilities including reasoning across multiple images and image grounding, in which you can identify elements in images. You’ll also use the official Llama API, work with Llama 4’s long-context abilities, and learn about Llama’s newest open-source tools: its prompt optimization tool that automatically improves system prompts and synthetic data kit that generates high-quality datasets for fine-tuning. If you need an open model, Llama is a great option, and the Llama 4 family is an important part of any GenAI developer's toolkit. Through this course, you’ll learn to call Llama 4 via API, use its optimization tools, and build features that span text, images, and large context. Please sign up here:

Introducing "Building with Llama 4." This short course is created with Meta AI at Meta, and taught by Amit Sangani, Director of Partner Engineering for Meta’s AI team. Meta’s new Llama 4 has added three new models and introduced the Mixture-of-Experts (MoE) architecture to its family of open-weight models, making them more efficient to serve. In this course, you’ll work with two of the three new models introduced in Llama 4. First is Maverick, a 400B parameter model, with 128 experts and 17B active parameters. Second is Scout, a 109B parameter model with 16 experts and 17B active parameters. Maverick and Scout support long context windows of up to a million tokens and 10M tokens, respectively. The latter is enough to support directly inputting even fairly large GitHub repos for analysis! In hands-on lessons, you’ll build apps using Llama 4’s new multimodal capabilities including reasoning across multiple images and image grounding, in which you can identify elements in images. You’ll also use the official Llama API, work with Llama 4’s long-context abilities, and learn about Llama’s newest open-source tools: its prompt optimization tool that automatically improves system prompts and synthetic data kit that generates high-quality datasets for fine-tuning. If you need an open model, Llama is a great option, and the Llama 4 family is an important part of any GenAI developer's toolkit. Through this course, you’ll learn to call Llama 4 via API, use its optimization tools, and build features that span text, images, and large context. Please sign up here:

Andrew Ng

67,710 views • 1 year ago

New short course on Pretraining LLMs! Developed with Upstage and taught by their CEO Sung Kim and CSO Lucy Park. While prompting or fine-tuning existing models works well for many general language tasks, pretraining is valuable for specialized domains or languages with limited representation in current models. This course walks you through the LLM pretraining pipeline: 1. Data preparation: Learn to source, clean, and prepare training data using HuggingFace. 2. Model architecture: Configure transformer networks, including modifying existing models. 3. Training: Set up and run training using open-source libraries. 4. Evaluation: Benchmark performance using popular evaluation strategies. As an example use case, you'll also compare the output of a base model with its fine-tuned and further pretrained variants, to see the impact of pretraining on a model's ability to write Python. The course also explores an innovative technique called depth up-scaling, which Upstage used to train their Solar model family, reducing pretraining compute costs by up to 70%. This technique works by first duplicating layers of a smaller pretrained model to form a larger model, and then further pretraining the result. Sign up here!

New short course on Pretraining LLMs! Developed with Upstage and taught by their CEO Sung Kim and CSO Lucy Park. While prompting or fine-tuning existing models works well for many general language tasks, pretraining is valuable for specialized domains or languages with limited representation in current models. This course walks you through the LLM pretraining pipeline: 1. Data preparation: Learn to source, clean, and prepare training data using HuggingFace. 2. Model architecture: Configure transformer networks, including modifying existing models. 3. Training: Set up and run training using open-source libraries. 4. Evaluation: Benchmark performance using popular evaluation strategies. As an example use case, you'll also compare the output of a base model with its fine-tuned and further pretrained variants, to see the impact of pretraining on a model's ability to write Python. The course also explores an innovative technique called depth up-scaling, which Upstage used to train their Solar model family, reducing pretraining compute costs by up to 70%. This technique works by first duplicating layers of a smaller pretrained model to form a larger model, and then further pretraining the result. Sign up here!

Andrew Ng

85,657 views • 2 years ago

New Course: Post-training of LLMs Learn to post-train and customize an LLM in this short course, taught by Banghua Zhu, Assistant Professor at the University of Washington University of Washington, and co-founder of @NexusflowX. Training an LLM to follow instructions or answer questions has two key stages: pre-training and post-training. In pre-training, it learns to predict the next word or token from large amounts of unlabeled text. In post-training, it learns useful behaviors such as following instructions, tool use, and reasoning. Post-training transforms a general-purpose token predictor—trained on trillions of unlabeled text tokens—into an assistant that follows instructions and performs specific tasks. Because it is much cheaper than pre-training, it is practical for many more teams to incorporate post-training methods into their workflows than pre-training. In this course, you’ll learn three common post-training methods—Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (RL)—and how to use each one effectively. With SFT, you train the model on pairs of input and ideal output responses. With DPO, you provide both a preferred (chosen) and a less preferred (rejected) response and train the model to favor the preferred output. With RL, the model generates an output, receives a reward score based on human or automated feedback, and updates the model to improve performance. You’ll learn the basic concepts, common use cases, and principles for curating high-quality data for effective training. Through hands-on labs, you’ll download a pre-trained model from Hugging Face and post-train it using SFT, DPO, and RL to see how each technique shapes model behavior. In detail, you’ll: - Understand what post-training is, when to use it, and how it differs from pre-training. - Build an SFT pipeline to turn a base model into an instruct model. - Explore how DPO reshapes behavior by minimizing contrastive loss—penalizing poor responses and reinforcing preferred ones. - Implement a DPO pipeline to change the identity of a chat assistant. - Learn online RL methods such as Proximal Policy Optimization (PPO) and Group Relative Policy Optimization (GRPO), and how to design reward functions. - Train a model with GRPO to improve its math capabilities using a verifiable reward. Post-training is one of the most rapidly developing areas of LLM training. Whether you’re building a high-accuracy context-specific assistant, fine-tuning a model's tone, or improving task-specific accuracy, this course will give you experience with the most important techniques shaping how LLMs are post-trained today. Please sign up here:

New Course: Post-training of LLMs Learn to post-train and customize an LLM in this short course, taught by Banghua Zhu, Assistant Professor at the University of Washington University of Washington, and co-founder of @NexusflowX. Training an LLM to follow instructions or answer questions has two key stages: pre-training and post-training. In pre-training, it learns to predict the next word or token from large amounts of unlabeled text. In post-training, it learns useful behaviors such as following instructions, tool use, and reasoning. Post-training transforms a general-purpose token predictor—trained on trillions of unlabeled text tokens—into an assistant that follows instructions and performs specific tasks. Because it is much cheaper than pre-training, it is practical for many more teams to incorporate post-training methods into their workflows than pre-training. In this course, you’ll learn three common post-training methods—Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (RL)—and how to use each one effectively. With SFT, you train the model on pairs of input and ideal output responses. With DPO, you provide both a preferred (chosen) and a less preferred (rejected) response and train the model to favor the preferred output. With RL, the model generates an output, receives a reward score based on human or automated feedback, and updates the model to improve performance. You’ll learn the basic concepts, common use cases, and principles for curating high-quality data for effective training. Through hands-on labs, you’ll download a pre-trained model from Hugging Face and post-train it using SFT, DPO, and RL to see how each technique shapes model behavior. In detail, you’ll: - Understand what post-training is, when to use it, and how it differs from pre-training. - Build an SFT pipeline to turn a base model into an instruct model. - Explore how DPO reshapes behavior by minimizing contrastive loss—penalizing poor responses and reinforcing preferred ones. - Implement a DPO pipeline to change the identity of a chat assistant. - Learn online RL methods such as Proximal Policy Optimization (PPO) and Group Relative Policy Optimization (GRPO), and how to design reward functions. - Train a model with GRPO to improve its math capabilities using a verifiable reward. Post-training is one of the most rapidly developing areas of LLM training. Whether you’re building a high-accuracy context-specific assistant, fine-tuning a model's tone, or improving task-specific accuracy, this course will give you experience with the most important techniques shaping how LLMs are post-trained today. Please sign up here:

Andrew Ng

125,146 views • 1 year ago

New short course: Evaluating AI Agents! Evals are important for driving AI system improvements, and in this course you'll learn to systematically assess and improve an AI agent’s performance. This is built in partnership with Arize AI and taught by John Gilhuly, Head of Developer Relations, and , Director of Product. I've often found evals to be a critical tool in the agent development process - they can be the difference between picking the right thing to work on vs. wasting weeks of effort. Whether you’re building a shopping assistant, coding agent, or research assistant, having a structured evaluation process helps you refine its performance systematically, rather than relying on random trial and error. This course shows you how to structure your evals to assess the performance of each component of an agent and its end-to-end performance. For each component, you select the appropriate evaluators, test examples, and performance metrics. This helps you identify areas for improvement both during development and in production. (If you're familiar with error analysis in supervised learning, think of this as adapting those ideas to agentic workflows.) In this course, you'll build an AI agent, and add observability to visualize and debug its steps. You’ll learn about code-based evals, in which you write code explicitly to test a certain step, as well as LLM-as-a-Judge evals, in which you prompt an LLM to efficiently come up with ways to evaluate more open-ended outputs. In detail, you’ll: - Understand key differences between evaluating LLM-based systems and traditional software testing. - Add observability to an agent by collecting traces of the steps taken by the agent and visualizing them - Choose the appropriate evaluator - code-based, LLM-as-a-Judge, human-annotation based - for each component. - Compute a convergence score to evaluate if your agent can respond to a query in an efficient number of steps. - Run structured experiments to improve the agent’s performance by exploring changes to the prompt, LLM model, or the agent’s logic. - Understand how to deploy these evaluation techniques to monitor the agent’s performance in production. By the end of this course, you’ll know how to trace AI agents, systematically evaluate them, and improve their performance. Please sign up here:

New short course: Evaluating AI Agents! Evals are important for driving AI system improvements, and in this course you'll learn to systematically assess and improve an AI agent’s performance. This is built in partnership with Arize AI and taught by John Gilhuly, Head of Developer Relations, and , Director of Product. I've often found evals to be a critical tool in the agent development process - they can be the difference between picking the right thing to work on vs. wasting weeks of effort. Whether you’re building a shopping assistant, coding agent, or research assistant, having a structured evaluation process helps you refine its performance systematically, rather than relying on random trial and error. This course shows you how to structure your evals to assess the performance of each component of an agent and its end-to-end performance. For each component, you select the appropriate evaluators, test examples, and performance metrics. This helps you identify areas for improvement both during development and in production. (If you're familiar with error analysis in supervised learning, think of this as adapting those ideas to agentic workflows.) In this course, you'll build an AI agent, and add observability to visualize and debug its steps. You’ll learn about code-based evals, in which you write code explicitly to test a certain step, as well as LLM-as-a-Judge evals, in which you prompt an LLM to efficiently come up with ways to evaluate more open-ended outputs. In detail, you’ll: - Understand key differences between evaluating LLM-based systems and traditional software testing. - Add observability to an agent by collecting traces of the steps taken by the agent and visualizing them - Choose the appropriate evaluator - code-based, LLM-as-a-Judge, human-annotation based - for each component. - Compute a convergence score to evaluate if your agent can respond to a query in an efficient number of steps. - Run structured experiments to improve the agent’s performance by exploring changes to the prompt, LLM model, or the agent’s logic. - Understand how to deploy these evaluation techniques to monitor the agent’s performance in production. By the end of this course, you’ll know how to trace AI agents, systematically evaluate them, and improve their performance. Please sign up here:

Andrew Ng

126,406 views • 1 year ago

Revolutionizing Move Programming with OpenLedger In this demo, we showcase how Move datasets contributed by data providers to OpenLedger’s datanets are used to fine-tune specialized models with LoRA fine-tuning. As seen in the video, we showcase an example on how builders can deploy a Move-specialized model that powers Co-pilot agents using our no-code model fine-tuning platform. This is the future of AI and Web3 innovation. Watch this space to see more specialised models and data feeds being built for next generation agents on top of OpenLedger #Move

Revolutionizing Move Programming with OpenLedger In this demo, we showcase how Move datasets contributed by data providers to OpenLedger’s datanets are used to fine-tune specialized models with LoRA fine-tuning. As seen in the video, we showcase an example on how builders can deploy a Move-specialized model that powers Co-pilot agents using our no-code model fine-tuning platform. This is the future of AI and Web3 innovation. Watch this space to see more specialised models and data feeds being built for next generation agents on top of OpenLedger #Move

OpenLedger

61,662 views • 1 year ago

We developed an RL method for fine-tuning our models for precise tasks in just a few hours or even minutes. Instead of training the whole model, we add an “RL token” output to π-0.6, our latest model, which is used by a tiny actor and critic to learn quickly with RL.

We developed an RL method for fine-tuning our models for precise tasks in just a few hours or even minutes. Instead of training the whole model, we add an “RL token” output to π-0.6, our latest model, which is used by a tiny actor and critic to learn quickly with RL.

Physical Intelligence

434,203 views • 4 months ago

In Prompt Engineering for Vision Models, taught by Abby Jacques Verre and Caleb Kaiser of Comet , you’ll learn how to prompt and fine-tune vision models for personalized image generation, image editing, object detection and segmentation. The prompts you'll use for vision models could be text, point coordinates, or bounding boxes, depending on the model. You'll also learn to tune hyperparameters to shape the output. Models you'll use include Segment-Anything Model (SAM), OWL-ViT, and Stable Diffusion. You'll also learn to fine-tune Stable Diffusion to generate personalized images (say, an image of a specific person), using a handful of images for training. As an example of a multi-step workflow, you'll use OWL-ViT to detect an object based on a text prompt, then pass the bounding box to SAM to create a segmentation mask, and input that mask into Stable Diffusion to replace the original object with a new one based on a text prompt. Controlling vision models can be tricky; this course will teach prompting and fine-tuning techniques to get precise control over their output. Get started here:

In Prompt Engineering for Vision Models, taught by Abby Jacques Verre and Caleb Kaiser of Comet , you’ll learn how to prompt and fine-tune vision models for personalized image generation, image editing, object detection and segmentation. The prompts you'll use for vision models could be text, point coordinates, or bounding boxes, depending on the model. You'll also learn to tune hyperparameters to shape the output. Models you'll use include Segment-Anything Model (SAM), OWL-ViT, and Stable Diffusion. You'll also learn to fine-tune Stable Diffusion to generate personalized images (say, an image of a specific person), using a handful of images for training. As an example of a multi-step workflow, you'll use OWL-ViT to detect an object based on a text prompt, then pass the bounding box to SAM to create a segmentation mask, and input that mask into Stable Diffusion to replace the original object with a new one based on a text prompt. Controlling vision models can be tricky; this course will teach prompting and fine-tuning techniques to get precise control over their output. Get started here:

Andrew Ng

151,198 views • 2 years ago

Small Language Models (SML) are the future of AI. "Small" (SML) instead of "Large" (LLM). These small models are highly specialized models with superhuman abilities on specific tasks. Here are two techniques to build these models: • Spectrum • Model Merging I give you a short introduction in the attached video, but here is a quick summary: Spectrum helps us identify the most relevant layers to solve one specific task. We can ignore everything else and focus on fine-tuning these layers. Using Spectrum, we can fine-tune models in a heartbeat. Model Merging combines multiple models into a unique, much better model than any of the individual input models. You can also combine models specialized in different tasks and get a model with multiple abilities. This is the state of the art of productizing models. It's what Arcee.ai's platform does behind the scenes. Arcee collaborated with me on this post and is sponsoring it. There are three main steps to produce a model for your particular use case: 1. You create a dataset by uploading your data. 2. You train a model. At this step, Arcee uses Spectrum and Model Merging to produce a highly specialized model for your task. 3. You can deploy that model to any environment you want. Three important notes: • Training process is 2x faster and 2x cheaper than regular fine-tuning. • Resultant models are smaller and have higher accuracy. • They create these specialized models from open-source models. Check this site so you can fully appreciate how this works: If you want to fine-tune an open-source model, consider Arcee's platform. This is the state of the art.

Small Language Models (SML) are the future of AI. "Small" (SML) instead of "Large" (LLM). These small models are highly specialized models with superhuman abilities on specific tasks. Here are two techniques to build these models: • Spectrum • Model Merging I give you a short introduction in the attached video, but here is a quick summary: Spectrum helps us identify the most relevant layers to solve one specific task. We can ignore everything else and focus on fine-tuning these layers. Using Spectrum, we can fine-tune models in a heartbeat. Model Merging combines multiple models into a unique, much better model than any of the individual input models. You can also combine models specialized in different tasks and get a model with multiple abilities. This is the state of the art of productizing models. It's what Arcee.ai's platform does behind the scenes. Arcee collaborated with me on this post and is sponsoring it. There are three main steps to produce a model for your particular use case: 1. You create a dataset by uploading your data. 2. You train a model. At this step, Arcee uses Spectrum and Model Merging to produce a highly specialized model for your task. 3. You can deploy that model to any environment you want. Three important notes: • Training process is 2x faster and 2x cheaper than regular fine-tuning. • Resultant models are smaller and have higher accuracy. • They create these specialized models from open-source models. Check this site so you can fully appreciate how this works: If you want to fine-tune an open-source model, consider Arcee's platform. This is the state of the art.

Santiago

164,162 views • 2 years ago

How can you solve complex tasks using a Large Language Model? Here is a 2-minute introduction to everything you need to know to 10x the quality of your results. Let's talk about three techniques, in order of complexity, starting with the easiest one: • In-Context Learning • Indexing + In-Context Learning • Fine-tuning In-Context Learning The team that trained GPT-3 found something they couldn't explain: You can condition a model using examples of how you want it to behave. I included an example prompt in the attached video. You can "teach" the model how you want it to interpret questions, select the correct answers, and format the results by giving a few examples. You can also give specific knowledge to the model that will be helpful when formulating answers. We call this approach "grounding the model." There's another example in the video. Indexing + In-Context Learning Unfortunately, there is a limit to how much data you can include in a prompt. We call this the "context size." One version of GPT-4 supports a context of approximately 6,000 words, while the other supports 25,000 words. Although this sounds like a lot, many applications need more than that. Imagine you wrote a book and want to build an application to answer any questions about your story. What happens if your book is longer than the context? That's where Indexing comes in. Using a model, you can turn every book passage into an embedding. These are vectors, numbers that "encode" the passage's text. You can then store these embeddings in a particular database that supports fast retrieval of these vectors. You can then turn any question into an embedding and search the database for the list of passages that are similar to that query. Instead of using the entire book to ask the model, you can now use the relevant passages as in-context information, effectively working around the context size limitation. Fine-tuning Fine-tuning can give you an extra boost to get reliable outputs from your LLM. It is, however, the most complex approach on the list. There are different approaches to fine-tuning a model with your data. A popular technique is to process your data with your LLM and use the outputs to train a new classifier that solves your specific task. Notice that here you aren't modifying the LLM. Instead, you are chaining it with your trained classifier. Another approach is to modify the parameters of the LLM using your data. Think of this as "rewiring" the model in a way that solves your particular task. The results and costs will vary depending on how many layers you want to fine-tune from the original model. Many companies think that fine-tuning is the solution to their problems. In my experience, many will benefit from exploring the other two approaches. I love explaining Machine Learning and Artificial Intelligence ideas. If you enjoy in-depth content like this, follow me Santiago so you don't miss what comes next.

How can you solve complex tasks using a Large Language Model? Here is a 2-minute introduction to everything you need to know to 10x the quality of your results. Let's talk about three techniques, in order of complexity, starting with the easiest one: • In-Context Learning • Indexing + In-Context Learning • Fine-tuning In-Context Learning The team that trained GPT-3 found something they couldn't explain: You can condition a model using examples of how you want it to behave. I included an example prompt in the attached video. You can "teach" the model how you want it to interpret questions, select the correct answers, and format the results by giving a few examples. You can also give specific knowledge to the model that will be helpful when formulating answers. We call this approach "grounding the model." There's another example in the video. Indexing + In-Context Learning Unfortunately, there is a limit to how much data you can include in a prompt. We call this the "context size." One version of GPT-4 supports a context of approximately 6,000 words, while the other supports 25,000 words. Although this sounds like a lot, many applications need more than that. Imagine you wrote a book and want to build an application to answer any questions about your story. What happens if your book is longer than the context? That's where Indexing comes in. Using a model, you can turn every book passage into an embedding. These are vectors, numbers that "encode" the passage's text. You can then store these embeddings in a particular database that supports fast retrieval of these vectors. You can then turn any question into an embedding and search the database for the list of passages that are similar to that query. Instead of using the entire book to ask the model, you can now use the relevant passages as in-context information, effectively working around the context size limitation. Fine-tuning Fine-tuning can give you an extra boost to get reliable outputs from your LLM. It is, however, the most complex approach on the list. There are different approaches to fine-tuning a model with your data. A popular technique is to process your data with your LLM and use the outputs to train a new classifier that solves your specific task. Notice that here you aren't modifying the LLM. Instead, you are chaining it with your trained classifier. Another approach is to modify the parameters of the LLM using your data. Think of this as "rewiring" the model in a way that solves your particular task. The results and costs will vary depending on how many layers you want to fine-tune from the original model. Many companies think that fine-tuning is the solution to their problems. In my experience, many will benefit from exploring the other two approaches. I love explaining Machine Learning and Artificial Intelligence ideas. If you enjoy in-depth content like this, follow me Santiago so you don't miss what comes next.

Santiago

384,510 views • 3 years ago

Our course recommendation of the day is “Post-training of LLMs, ” where you’ll learn how to customize pre-trained language models using Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (RL). You'll learn when to use each method, how to curate training data, and implement them in code to shape model behavior effectively. Enroll here:

Our course recommendation of the day is “Post-training of LLMs, ” where you’ll learn how to customize pre-trained language models using Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (RL). You'll learn when to use each method, how to curate training data, and implement them in code to shape model behavior effectively. Enroll here:

DeepLearning.AI

29,369 views • 9 months ago