正在加载视频...

视频加载失败

New short course: Open Source Models with Hugging Face 🤗, taught by Maria Khalusova, Marc Sun, and Younes Belkada! Hugging Face has been a game changer by letting you quickly grab any of hundreds of thousands of already-trained open source models to assemble into new applications. This course teaches...

224,520 次观看 • 2 年前 •via X (Twitter)

10 条评论

Leandro von Werra 的头像
Leandro von Werra2 年前

@mariaKhalusova @_marcsun @huggingface The one and only @younesbelkada!

Thomas Wolf 的头像
Thomas Wolf2 年前

@mariaKhalusova @_marcsun @huggingface dream team 🤩

race 的头像
race2 年前

@mariaKhalusova @_marcsun @huggingface Where do I get one of those shirts

Dankoyy 的头像
Dankoyy2 年前

I really liked the course, but I believe it could be multilingual. When we talk about AI, courses on the themes could easily be translated into almost any language with extreme quality. This includes teaching the AI specific words that don't need translation. Advancing AI is also about reaching the most vulnerable and transforming their lives through the enhancement of productive capacity.

Suzana Ilić 的头像
Suzana Ilić2 年前

@mariaKhalusova @_marcsun @huggingface Younes! amazing go go go!! 🔥 @younesbelkada

Nova Lead 的头像
Nova Lead2 年前

@mariaKhalusova @_marcsun @huggingface Thrilled to see @huggingface leading the charge with their new course on Open Source Models! The power of collaboration and open-source innovation is truly transforming AI. Can't wait to explore the synergies between these models and our initiatives. The future of AI is bright

Arvind Nagaraj 的头像
Arvind Nagaraj2 年前

@mariaKhalusova @_marcsun @huggingface This is such a wonderful 🤗 course - nice to see multimodality get coverage! And so cool to see @younesbelkada code live! If you wish to understand multimodality in depth, please see my blog posts: 1. 2.

Matteo Troìa 的头像
Matteo Troìa2 年前

@mariaKhalusova @_marcsun @huggingface @Alessio_Zoccoli 😉😎

Wambugu Muchemi 🔬 的头像
Wambugu Muchemi 🔬2 年前

@mariaKhalusova @_marcsun @huggingface I love the course. Well taught and insightful. Asante!

Toronto Consulting Group 的头像
Toronto Consulting Group2 年前

@mariaKhalusova @_marcsun @huggingface Wawawiwa!

相关视频

"Introducing Multimodal Llama 3.2": As promised two weeks ago, here's the short course on Meta's latest open model! This short course is created with Meta and taught by Amit Sangani, Director of AI Partner Engineering at Meta. Meta’s Llama family of models is leading the way in open models, allowing anyone to download, customize, fine-tune, or build new applications on top of them. Learn about the vision capabilities of the Llama 3.2, and use it for image classification, prompting, tokenization, tool-calling. You'll also learn about the open-source Llama stack, which gives building blocks for many different stages of the LLM application life cycle. In detail, you’ll: - Learn what are the features of Meta's four newest models, and when to use which Llama model. - Learn best practices for multimodal prompting, with applications to advanced image reasoning, illustrated by many examples: Understanding errors on a car dashboard, adding up the total of photographed restaurant receipts, grading written math homework. - Use different roles—system, user, assistant, ipython—in the Llama 3.1 and 3.2 models and the prompt format that identifies those roles. - Understand how Llama uses the tiktoken tokenizer, and how it has expanded to a 128k vocabulary size that improves encoding efficiency and multilingual support. - Learn how to prompt Llama to call built-in and custom tools (functions) with examples for web search and solving math equations. - Learn about Llama Stack, a standardized interface for common toolchain components like fine-tuning or synthetic data generation, useful for building agentic applications. By the end of this course, you’ll be equipped to build out new applications with the new Llama 3.2. Thank you to Ahmad Al-Dahle, Amit Sangani, and the whole AI at Meta team AI at Meta for all the hard work on Llama 3.2 — we’re excited to make these open models even more accessible to more developers with this new course! Please sign up here!

Andrew Ng

131,606 次观看 • 1 年前

Explore state-of-the-art multimodal prompting in our new short course Large Multimodal Model Prompting with Gemini, taught by Erwin Huizenga in collaboration with Google Cloud. One interesting insight from this course: with multimodal models, prompt structure matters significantly. Placing text inputs, such as a patient's medical history, before image inputs, like an X-ray, can enhance the model's ability to contextualize and interpret visual data effectively. In other contexts, such as image captioning, you may get better results by putting the image first. Multimodal models behave differently than text-only LLMs, and effective prompting for models varies depending on the model you’re using. In this course you’ll learn how to effectively prompt Gemini models. Gemini's multimodal capabilities also enable new approaches in AI application development, for example: - The Gemini library handles various video formats (MP4, MOV, MPEG), streamlining applications using these formats. - Large context window (up to 1 million tokens) enables processing of extensive content, like analyzing multiple 50-minute videos simultaneously. - Function calling feature integrates real-time data (e.g., current exchange rates) into model responses. The course demonstrates building multimodal applications with real-world examples including document analyzers that reason across text and graphs simultaneously, video content extractors that find and timestamp specific information from multiple hours of footage, and automated expense report systems processing receipt images while cross-referencing company policies. Sign up here:

Andrew Ng

73,915 次观看 • 1 年前

Introducing "Building with Llama 4." This short course is created with Meta AI at Meta, and taught by Amit Sangani, Director of Partner Engineering for Meta’s AI team. Meta’s new Llama 4 has added three new models and introduced the Mixture-of-Experts (MoE) architecture to its family of open-weight models, making them more efficient to serve. In this course, you’ll work with two of the three new models introduced in Llama 4. First is Maverick, a 400B parameter model, with 128 experts and 17B active parameters. Second is Scout, a 109B parameter model with 16 experts and 17B active parameters. Maverick and Scout support long context windows of up to a million tokens and 10M tokens, respectively. The latter is enough to support directly inputting even fairly large GitHub repos for analysis! In hands-on lessons, you’ll build apps using Llama 4’s new multimodal capabilities including reasoning across multiple images and image grounding, in which you can identify elements in images. You’ll also use the official Llama API, work with Llama 4’s long-context abilities, and learn about Llama’s newest open-source tools: its prompt optimization tool that automatically improves system prompts and synthetic data kit that generates high-quality datasets for fine-tuning. If you need an open model, Llama is a great option, and the Llama 4 family is an important part of any GenAI developer's toolkit. Through this course, you’ll learn to call Llama 4 via API, use its optimization tools, and build features that span text, images, and large context. Please sign up here:

Andrew Ng

67,587 次观看 • 1 年前

Our first short course with Anthropic! Building Towards Computer Use with Anthropic. This teaches you to build an LLM-based agent that uses a computer interface by generating mouse clicks and keystrokes. Computer Use is an important, emerging capability for LLMs that will let AI agents do many more tasks than were possible before, since it lets them interact with interfaces designed for humans to use, rather than only tools that provide explicit API access. I hope you will enjoy learning about it! This course is taught by Anthropic's Head of Curriculum, Colt_Steele. You'll learn to apply image reasoning and tool use to "use" a computer as follows: a model processes an image of the screen, analyzes it to understand what's going on, and navigates the computer via mouse clicks and keystrokes. This course goes through the key building blocks, and culminates in a demo of an AI assistant that uses a web browser to search for a research paper, downloads the PDF, and finally summarizes the paper for you. In detail, you’ll: - Learn about Anthropic's family of models, when to use which one, and make API requests to Claude - Use multi-modal prompts that combine text and image content blocks, and also work with streaming responses - Improve your prompting by using prompt templates, using XML to structure prompts, and providing examples - Implement prompt caching to reduce cost and latency - Apply tool-use to build a chatbot that can call different tools to respond to queries - See all these building blocks come together in Computer Use demo Please sign up here:

Andrew Ng

170,240 次观看 • 1 年前