I created a web app for teaching computer vision basics, allowing users to upload an image or video stream and instantly view it as a numerical pixel matrix, with options to display values in RGB, Hex, or Grayscale formats. It features predefined convolution kernels applied in real-time: - Sobel...
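A minimal sketch of the idea described above, not the app's actual code: it shows one pixel in Hex and Grayscale form and slides a horizontal Sobel kernel over a tiny grayscale grid. The helper names (`to_hex`, `to_gray`, `apply_kernel`) are hypothetical, and the BT.601 luma weights are an assumption, since the post does not say how grayscale is computed.

```python
# Illustrative only: helper names and grayscale weights are assumptions,
# not taken from the app described in the post.

# Horizontal Sobel kernel: responds to left-to-right intensity changes.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]


def to_hex(pixel):
    """Format an (R, G, B) tuple as a hex string such as '#ff8000'."""
    r, g, b = pixel
    return f"#{r:02x}{g:02x}{b:02x}"


def to_gray(pixel):
    """Convert an (R, G, B) tuple to one intensity using BT.601 luma weights."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def apply_kernel(gray, kernel):
    """Slide a 3x3 kernel over a 2D grid of intensities.

    No padding is used, so the output is 2 rows and 2 columns smaller.
    The kernel is not flipped (cross-correlation), which is the common
    convention in image-processing libraries.
    """
    height, width = len(gray), len(gray[0])
    out = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * gray[y + ky - 1][x + kx - 1]
            row.append(acc)
        out.append(row)
    return out


if __name__ == "__main__":
    black, white = (0, 0, 0), (255, 255, 255)
    # 4x4 image with a vertical edge between the second and third columns.
    image = [[black, black, white, white] for _ in range(4)]
    gray = [[to_gray(p) for p in row] for row in image]
    print(to_hex((255, 128, 0)))
    print(gray[0])
    print(apply_kernel(gray, SOBEL_X))
```

On this test image every output cell has the same large positive value, because each 3x3 window straddles the dark-to-bright edge.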

270,251 views • 5 months ago • via X (Twitter)


Related Videos

Our first short course with Anthropic! Building Towards Computer Use with Anthropic. This teaches you to build an LLM-based agent that uses a computer interface by generating mouse clicks and keystrokes. Computer Use is an important, emerging capability for LLMs that will let AI agents do many more tasks than were possible before, since it lets them interact with interfaces designed for humans to use, rather than only tools that provide explicit API access. I hope you will enjoy learning about it!

This course is taught by Anthropic's Head of Curriculum, Colt_Steele. You'll learn to apply image reasoning and tool use to "use" a computer as follows: a model processes an image of the screen, analyzes it to understand what's going on, and navigates the computer via mouse clicks and keystrokes. This course goes through the key building blocks, and culminates in a demo of an AI assistant that uses a web browser to search for a research paper, downloads the PDF, and finally summarizes the paper for you.

In detail, you'll:
- Learn about Anthropic's family of models, when to use which one, and make API requests to Claude
- Use multi-modal prompts that combine text and image content blocks, and also work with streaming responses
- Improve your prompting by using prompt templates, using XML to structure prompts, and providing examples
- Implement prompt caching to reduce cost and latency
- Apply tool use to build a chatbot that can call different tools to respond to queries
- See all these building blocks come together in a Computer Use demo

Please sign up here:

Andrew Ng

170,211 views • 1 year ago

New Short Course: Building AI Browser Agents! Learn how to build AI agents that interact and take actions on websites in this course, created in partnership with and taught by and @namangarg0, Co-founders of AGI Inc.

AI browser agents can log into websites, fill out forms, click through web pages, or even place orders online for you. They use both visual information, like screenshots, and structural data, like the HTML or Document Object Model (DOM) of a web page, to reason and take action. With the complexity of webpages and multiple possible actions at each step, it can be challenging for an AI browser agent to complete an assigned task. Because these agents run long action sequences, a single error, like clicking the wrong button or misreading a field, can lead to unexpected outcomes or errors that compound over time.

In this course, you'll understand how autonomous web agents work, their current limitations, and how AgentQ enables them to improve through self-correction. In detail, you'll:
- Learn what web agents are, how they automate tasks online, their architecture, key components, limitations, and an overview of their decision-making strategies.
- Build a web agent that can scrape a website and return course recommendations in a structured output format.
- Build an autonomous web agent that can execute multiple tasks, such as finding and summarizing webpages, filling out a form, and signing up for a newsletter.
- Explore AgentQ, a framework that enables agents to self-correct by combining Monte Carlo Tree Search (MCTS), a self-critique mechanism for continuous improvement, and Direct Preference Optimization (DPO).
- Deep dive into MCTS, learn how it finds an effective path, illustrated by an animated Gridworld example, and use AgentQ to complete web tasks.
- Understand AI agents' current state and future directions, including key factors shaping their evolution, such as hardware, algorithm innovation, and data availability.

By the end of this course, you will have hands-on experience building browser agents and a deeper understanding of how to make them more robust and reliable. Please sign up here:

Andrew Ng

185,870 views • 1 year ago