Loading video...

Video Failed to Load

Go Home

Our first short course with Anthropic! Building Towards Computer Use with Anthropic. This teaches you to build an LLM-based agent that uses a computer interface by generating mouse clicks and keystrokes. Computer Use is an important, emerging capability for LLMs that will let AI agents do many more tasks...

170,305 views • 1 year ago •via X (Twitter)

11 Comments

daniel ostertag's profile picture
daniel ostertag1 year ago

@AnthropicAI Colt works for anthropic? Anyone else get started learning web dev with his course? Probably one of the first courses I ever paid for.

Fast Company's profile picture
Fast Company1 year ago

#AI agents: from tools to teammates. Software that can act, react, and interact with both humans and other agents will transform how businesses operate—and how they should invest in technology. This @EY_US article explains more. #ad

Vincent Valentine (CEO of UnOpen.ai)'s profile picture
Vincent Valentine (CEO of UnOpen.ai)1 year ago

@AnthropicAI Exciting to see such innovative courses being offered.

Robert's profile picture
Robert1 year ago

@AnthropicAI Thank you! Looking forward to this one and the many to follow!

Ethan_AI Marketer for 𝕏's profile picture
Ethan_AI Marketer for 𝕏1 year ago

@AnthropicAI this sounds like a wild ride for LLMs, lol

Anthara Fairooz's profile picture
Anthara Fairooz1 year ago

@AnthropicAI Exciting course! Building AI agents to interact with computer interfaces is a game-changer. Can't wait to see the demos!

Nitin Panwar's profile picture
Nitin Panwar1 year ago

@AnthropicAI Teaching AI to navigate human interfaces instead of waiting for APIs feels like the moment we taught it to walk instead of just teleport, exciting and transformative!

RyanRejoice's profile picture
RyanRejoice1 year ago

@AnthropicAI Claude Bros unit! Let's go learn how to use Claude even better.

Navigate's profile picture
Navigate1 year ago

@AnthropicAI Fascinating! 🤔

Cheeku Tripathi's profile picture
Cheeku Tripathi1 year ago

@AnthropicAI ability to interact with human interfaces via mouse clicks and keystrokes opens up new possibilities

icecreambar's profile picture
icecreambar1 year ago

@AnthropicAI Hi Andrew. I took your ML class in 2013 on Coursera and wanted to give thanks. Does this course cover Anthropic’s MCP?

Related Videos

OpenAI just announced API access to o1 (advanced reasoning model) yesterday. I'm delighted to announce today a new short course, Reasoning with o1, built with OpenAI, and taught by Colin Jarvis, Head of AI Solutions at OpenAI, to show you how to use this effectively! Unlike previous language models which generate output directly, o1 “thinks before it responds,” and generates many reasoning tokens before returning a more thoughtful and accurate response. It is great at complex reasoning -- including planning for agentic workflows, coding, and domain-specific reasoning in STEM fields like law. But how you should use it is quite different from other LLMs. I think o1 will be a game changer for many AI applications; and in this course, you'll learn how to use it effectively. In detail, you’ll: - Learn to recognize what tasks o1 is suited for, and when to use a smaller model, or combine o1 with a smaller model - Understand the new principles of prompting reasoning models: Be simple and direct; no explicit chain-of-thought required; use structure; show rather than tell - Implement multi-step orchestration in which o1 plans, and hands tasks over to gpt-4o-mini to execute specific steps; this illustrates a design pattern to optimize intelligence (accuracy) and cost - Use o1 for a coding task to build a new application, edit existing code, and test performance by running a coding competition between o1-mini and GPT 4o - Use o1 for image understanding and learn how it performs better with a "hierarchy of reasoning," in which it incurs the latency and cost upfront, preprocessing the image and indexing it with rich details so it can be used for Q&A later - Learn a technique called meta-prompting, in which you use o1 to improve your prompts. Using a customer support evaluation set, you'll iteratively use o1 to modify a prompt to improve performance You'll also learn about how OpenAI used reinforcement learning to produce a model that uses "test-time compute" to improve performance. I think you'll find this course enjoyable and valuable. Please sign up for it here:

Andrew Ng

357,401 views • 1 year ago

New short course: Long-Term Agentic Memory with LangGraph. Learn to build an agent with long-term memory in this course developed in collaboration with taught by its Co-Founder and CEO, Harrison Chase! Personal assistance and productivity tasks have become important use cases for agents. An important feature of an AI assistant, such as a coding or calendar assistant, is its ability to keep improving over time from its experience. Agent memory is the key capability that enables this. To add memory to an agent, you must first figure out what to store and what to retrieve when it is time to use the information. Additionally, you’ll have to decide when to update the stored information. For example, you might update in each iteration loop of the agent or perform updates in the background, with a helper agent. In this course, you will learn a mental framework to build agents with long-term memory. You'll create a useful email assistant that can respond, ignore, and notify using writing, scheduling, and memory-management tools. You’ll develop your agent's memory by adding facts to its memory store, provide examples to learn the user's preferences, and optimize system prompts to evolve instructions based on previous responses. In detail, you’ll: - Learn how the three types of memory--semantic, episodic, and procedural–and the two update mechanisms–via hot path and in the background–apply to your agents. - Build an email agent with writing, scheduling, and availability tools, along with a router that triages incoming email and handles it accordingly by ignoring, responding, or notifying the user. - Add tools to your email agent that allow it to operate on semantic memory by learning facts about the user, storing them in a long-term memory store, and searching over them in future interactions. - Incorporate episodic memory, in the form of few-shot examples, in the triage step of your agents to help them learn and update user preferences. - Add procedural memory as system prompts, optimized with feedback to improve the instructions the agent follows. Learn how to approach memory in agents, and start building agents with long-term memory with LangGraph! Please sign up here:

Andrew Ng

131,640 views • 1 year ago

Build and customize complex AI applications with a flexible framework in this new short course, Building AI Applications with Haystack. Created in collaboration with deepset, makers of Haystack, and taught by Tuana, who is the developer relations lead for Haystack at deepset. Generative AI technology is changing rapidly and it can be challenging to integrate APIs from different LLMs, vector databases, and various tools such as web search. In this course, you will learn how to use the Haystack framework to make your development process more modular, allowing you to manage complexity and focus more on building your application. In detail, you’ll: - Build a RAG pipeline using Haystack’s main building blocks – components, pipelines, and document stores. - Create custom components in your pipeline by building a Hacker News summarizer that extends your app’s ability to access APIs. - Use conditional routing to create a branching pipeline with a fallback to web search mechanism when the LLM does not have the necessary context to respond to the user's query. - Build a self-reflecting agent for named entity recognition that loops using an output validator custom component. - Create a chat agent using OpenAI's function-calling capabilities which allow you to provide Haystack pipelines as tools to the LLM, enhancing that agent's capabilities. By the end of this course, you will learn a high-level orchestration framework that can help make your applications flexible, extendible, and maintainable, even as the technology stack changes, new user needs arise, and you add new features to your application. Please sign up here:

Andrew Ng

53,785 views • 1 year ago

New course: MCP: Build Rich-Context AI Apps with Anthropic. Learn to build AI apps that access tools, data, and prompts using the Model Context Protocol in this short course, created in partnership with Anthropic Anthropic and taught by Elie Schoppik Elie Schoppik, its Head of Technical Education. Connecting AI applications to external systems that bring rich context to LLM-based applications has often meant writing custom integrations for each use case. MCP is an open protocol that standardizes how LLMs access tools, data, and prompts from external sources, and simplifies how you provide context to your LLM-based applications. For example, you can provide context via third-party tools that let your LLM make API calls to search the web, access data from local docs, retrieve code from a GitHub repo, and so on. MCP, developed by Anthropic, is based on a client-server architecture that defines the communication details between an MCP client, hosted inside the AI application, and an MCP server that exposes tools, resources, and prompt templates. The server can be a subprocess launched by the client that runs locally or an independent process running remotely. In this hands-on course, you'll learn the core architecture behind MCP. You’ll create an MCP-compatible chatbot, build and deploy an MCP server, and connect the chatbot to your MCP server and other open-source servers. Here’s what you’ll do: - Understand why MCP makes AI development less fragmented and standardizes connections between AI applications and external data sources - Learn the core components of the client-server architecture of MCP and the underlying communication mechanism - Build a chatbot with custom tools for searching academic papers, and transform it into an MCP-compatible application - Build a local MCP server that exposes tools, resources, and prompt templates using FastMCP, and test it using MCP Inspector - Create an MCP client inside your chatbot to dynamically connect to your server - Connect your chatbot to reference servers built by Anthropic’s MCP team, such as filesystem, which implements filesystem operations, and fetch, which extracts contents from the web as markdown - Configure Claude Desktop to connect to your server and others, and explore how it abstracts away the low-level logic of MCP clients - Deploy your MCP server remotely and test it with the Inspector or other MCP-compatible applications - Learn about the roadmap for future MCP development, such as multi-agent architecture, MCP registry API, server discovery, authorization, and authentication MCP is an exciting and important technology that lets you build rich-context AI applications that connect to a growing ecosystem of MCP servers, with minimal integration work. Please sign up here!

Andrew Ng

141,952 views • 1 year ago

New course to bring you up to state-of-the-art at using AI to help you code: Build Apps with Windsurf's AI Coding Agents, built in partnership with WIndsurf (Codeium) and taught by Anshul Ramachandran! AI-assisted IDEs (Integrated Development Environments) make developers’ workflows faster, more efficient, and much more fun. Agentic tools like Windsurf are more than just code autocomplete—they are collaborative coding agents that help you break down complex applications, iterate efficiently, and generate code that spans multiple files. Although a lot of coding assistants share the same underlying large language models for planning and reasoning, a major point of distinction is how they handle tools, keep track of context, and stay aligned with your intent as a developer. For instance, if you make modifications to a class definition in your code and make the same modifications to other classes in the same directory, you might tell the AI agent "Do the same thing in similar places in this directory." Here, tracking your intent means understanding that “the same thing" refers to that recent edit you just made, which must be followed by appropriate search and tool-calling to implement the changes. In this course, you'll learn the inner workings of coding agents, their strengths and limitations, and how to use Windsurf to quickly build several applications. In detail, you'll: - Build a mental model of how agents work by combining human-action tracking, tool integration, and context awareness to carry out an agentic coding workflow. - Learn the challenges of code search and discovery and how a multi-step retrieval approach helps coding agents address them. - Use Windsurf to analyze and understand a large, old codebase and update it to the latest versions of the frameworks and packages it uses. - Build a Wikipedia data analysis app that retrieves, parses, and analyzes word frequencies. - Enhance the performance of your Wikipedia analysis app by adding caching, and through this, also learn how to course-correct when the AI agent produces unexpected results. - Learn tips and tricks such as keyboard shortcuts, autocomplete, and @ mentions to quickly call on agentic capabilities. - Use image/multimodal capabilities of the AI agent to increase your development velocity; you'll see an example of uploading a mockup with sketched-out UI features, and ask the agent to use that to build new functionality to an app. By the end of this course, you’ll understand agentic coding in-depth and know how to use it to make your development process much faster, more efficient, and enjoyable. Please sign up here!

Andrew Ng

139,761 views • 1 year ago

"Introducing Multimodal Llama 3.2": As promised two weeks ago, here's the short course on Meta's latest open model! This short course is created with Meta and taught by Amit Sangani, Director of AI Partner Engineering at Meta. Meta’s Llama family of models is leading the way in open models, allowing anyone to download, customize, fine-tune, or build new applications on top of them. Learn about the vision capabilities of the Llama 3.2, and use it for image classification, prompting, tokenization, tool-calling. You'll also learn about the open-source Llama stack, which gives building blocks for many different stages of the LLM application life cycle. In detail, you’ll: - Learn what are the features of Meta's four newest models, and when to use which Llama model. - Learn best practices for multimodal prompting, with applications to advanced image reasoning, illustrated by many examples: Understanding errors on a car dashboard, adding up the total of photographed restaurant receipts, grading written math homework. - Use different roles—system, user, assistant, ipython—in the Llama 3.1 and 3.2 models and the prompt format that identifies those roles. - Understand how Llama uses the tiktoken tokenizer, and how it has expanded to a 128k vocabulary size that improves encoding efficiency and multilingual support. - Learn how to prompt Llama to call built-in and custom tools (functions) with examples for web search and solving math equations. - Learn about Llama Stack, a standardized interface for common toolchain components like fine-tuning or synthetic data generation, useful for building agentic applications. By the end of this course, you’ll be equipped to build out new applications with the new Llama 3.2. Thank you to Ahmad Al-Dahle, Amit Sangani, and the whole AI at Meta team AI at Meta for all the hard work on Llama 3.2 — we’re excited to make these open models even more accessible to more developers with this new course! Please sign up here!

Andrew Ng

131,606 views • 1 year ago

New short course: LLMs as Operating Systems: Agent Memory, created with Letta, and taught by its founders Charles Packer and Sarah Wooders. An LLM's input context window has limited space. Using a longer input context also costs more and results in slower processing. So, managing what's stored in this context window is important. In the innovative paper MemGPT: Towards LLMs as Operating Systems, its authors (which include the instructors) proposed using an LLM agent to manage this context window. Their system uses a large persistent memory that stores everything that could be included in the input context, and an agent decides what is actually included. Take the example of building a chatbot that needs to remember what's been said earlier in a conversation (perhaps over many days of interaction with a user). As the conversation's length grows, the memory management agent will move information from the input context to a persistent searchable database; summarize information to keep relevant facts in the input context; and restore relevant conversation elements from further back in time. This allows a chatbot to keep what's currently most relevant in its input context memory to generate the next response. When I read the original MemGPT paper, I thought it was an innovative technique for handling memory for LLMs. The open-source Letta framework, which we'll use in this course, makes MemGPT easy to implement. It adds memory to your LLM agents and gives them transparent long-term memory. In detail, you’ll learn: - How to build an agent that can edit its own limited input context memory, using tools and multi-step reasoning - What is a memory hierarchy (an idea from computer operating systems, which use a cache to speed up memory access), and how these ideas apply to managing the LLM input context (where the input context window is a "cache" storing the most relevant information; and an agent decides what to move in and out of this to/from a larger persistent storage system) - How to implement multi-agent collaboration by letting different agents share blocks of memory This course will give you a sophisticated understanding of memory management for LLMs, which is important for chatbots having long conversations, and for complex agentic workflows. Please sign up here!

Andrew Ng

200,729 views • 1 year ago