Have you used quantization with an open source machine learning library, and wondered how quantization works? How can you preserve model accuracy as you compress from 32 bits to 16, 8, or even 2 bits? In our new short course, Quantization in Depth, taught by Hugging Face's Marc Sun...
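
As a rough illustration of the question the post raises, here is a minimal sketch of one common scheme, asymmetric linear quantization to 8 bits. It is not taken from the course; the function names `quantize_linear` and `dequantize_linear` and the sample weights are hypothetical, and real libraries add per-channel scales, calibration, and packed storage.

```python
def quantize_linear(values, num_bits=8):
    """Map floats to unsigned integers in [0, 2**num_bits - 1]."""
    q_min, q_max = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 if every value is zero.
    scale = (hi - lo) / (q_max - q_min) or 1.0
    zero_point = max(q_min, min(q_max, round(q_min - lo / scale)))
    quantized = [
        max(q_min, min(q_max, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize_linear(quantized, scale, zero_point):
    """Recover approximate floats from the integer codes."""
    return [scale * (q - zero_point) for q in quantized]


weights = [-1.5, -0.3, 0.0, 0.7, 2.1]  # hypothetical 32-bit weights
q, scale, zero_point = quantize_linear(weights)
restored = dequantize_linear(q, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

In this sketch the round-trip error per weight stays within one quantization step (`scale`), which is the sense in which accuracy can be largely preserved while storage drops from 32 bits to 8 bits per value.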

198,616 views • 2 years ago • via X (Twitter)

10 Comments

Quantcheck • 2 years ago

@huggingface Whether it is signal processing, data compression, or machine learning, Quantization plays a crucial role.

Saquib Mehmood • 2 years ago

@huggingface Thanks. Very helpful refresher.

kevlarai • 2 years ago

@huggingface I'd love to learn more about this. What are the suggested pre-reqs?

Malik KISSOUM • 2 years ago

@huggingface This is fire 🔥🔥🔥, thank you for making deep learning so fun and accessible

AIxBlock • 2 years ago

@huggingface The detailed approach to understanding and implementing different quantization methods will undoubtedly empower many developers!

Vincent Valentine (CEO of UnOpen.ai) • 2 years ago

@huggingface @AndrewYNg Fascinating course. Quantization intrigues me - compressing models while retaining accuracy? How does this technique balance resource optimization and performance? Exploring the intricacies seems insightful.

Data & Analytics • 2 years ago

@huggingface @AndrewYNg Interesting topic! Quantization can be tricky, but preserving model accuracy is key. Have you tried any techniques to maintain accuracy during compression?

GPT.Biz • 2 years ago

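Explore the mysteries of quantization! This course takes you from theory to practice and shows you how to optimize a model's storage and compute efficiency!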

GeraDeluxer • 2 years ago

@huggingface Thanks a lot for the great AI content 🚀

Michael Guo • 2 years ago

@huggingface I have put this course on my radar for quite some time and thanks for the reminder and I need get it done

Similar Videos

Tokenization -- turning text into a sequence of integers -- is a key part of generative AI, and most API providers charge per million tokens. How does tokenization work? Learn the details of tokenization and RAG optimization in Retrieval Optimization: From Tokenization to Vector Quantization, created in collaboration with Qdrant and taught by its Developer Relations Lead, Kacper Łukawski.

This course focuses on retrieval-augmented generation (RAG), which has two steps: first, a retriever finds relevant information; then, the generator uses what’s retrieved as context to produce a response. You’ll learn to optimize the first step (the retriever) by understanding how tokenization works and how it impacts the relevance of your search. You will also learn to measure and improve retrieval quality, speed, and memory.

In detail, you’ll:

- Learn about the internal workings of embedding models and how your text turns into vectors.
- Understand how several tokenizers, such as Byte-Pair Encoding, WordPiece, Unigram, and SentencePiece, work.
- Explore common challenges with tokenizers, such as unknown tokens, domain-specific identifiers, and numerical values, that can negatively affect your vector search.
- Understand how to measure the quality of your search across relevance, ranking, and score-related metrics.
- Understand how the main parameters in "HNSW", a graph-based algorithm, affect the relevance and speed of vector search, and how to tune them.
- Experiment with the three major quantization methods – product, scalar, and binary – and learn how they impact memory requirements, search quality, and speed.

By the end of this course, you’ll have a solid understanding of how tokenization functions and how to optimize vector search in your RAG systems. Please sign up here!
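
To make the memory and quality trade-off of binary quantization concrete, here is a minimal sketch that keeps only the sign of each embedding dimension and ranks documents by Hamming distance. It is an illustration under stated assumptions, not Qdrant's implementation: the helper names `binarize` and `hamming_distance`, the 4-dimensional vectors, and the document names are all hypothetical.

```python
def binarize(vector):
    """Binary quantization: keep one bit per dimension (1 if positive, else 0)."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


# Hypothetical embeddings; real ones have hundreds or thousands of dimensions.
documents = {
    "doc_a": [0.9, -0.2, 0.4, -0.7],
    "doc_b": [-0.8, 0.3, -0.5, 0.6],
    "doc_c": [0.7, -0.1, -0.3, -0.9],
}
query = [0.8, -0.3, 0.5, -0.6]

codes = {name: binarize(vec) for name, vec in documents.items()}
query_code = binarize(query)

# Rank documents from most to least similar to the query.
ranking = sorted(codes, key=lambda name: hamming_distance(codes[name], query_code))
```

Each dimension shrinks from a 32-bit float to a single bit, at the cost of discarding magnitudes: `doc_a` and the query collapse to the same code here even though their original vectors differ, which is the kind of search-quality loss that has to be measured.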

Andrew Ng

146,200 views • 1 year ago

New Short Course: Getting Structured LLM Output! Learn how to get structured outputs from your LLM applications in this course, built in partnership with .txt, and taught by Will Kurt, a Founding Engineer, and , Developer Relations Engineer.

It's challenging for software to automatically parse an LLM's freeform text outputs. Structured outputs, like JSON, solve this by converting natural language into consistent, clear data that a machine can read and process. This course teaches you how to generate structured outputs while building several use cases, including a social media analysis agent.

You’ll learn about structured outputs and efficient ways to generate outputs in your defined schema or format. You’ll begin by using structured output APIs, then use re-prompting libraries like “instructor” to generate structured output. Finally, you’ll learn how constrained decoding works: constraints are applied to each token as it is generated, blocking any tokens that don’t fit your defined schema.

In detail, you’ll:

- Learn why structured outputs are important, how they allow for scalable software development, and the different approaches to generate them, including vendor-provided APIs, re-prompting libraries, and structured generation.
- Build a simple social media agent using OpenAI’s structured output API, learn how to define a model's desired structured output using Pydantic, and perform basic programming with your outputs, such as importing structured data into a data frame using pandas.
- Learn how to use the open-source library "instructor," which checks the structured output of the model and re-prompts the model until it validates the desired output, and explore the limitations of this approach.
- Understand how structured generation by the “outlines” library works by modifying LLM logits on a per-token basis, based on the desired format, to give a particular output structure.
- Learn how regular expressions, which outlines works with, are represented as finite-state machines, and how they can be used to develop a range of structured outputs beyond JSON.

By the end of this course, you’ll have broadened your knowledge of the approaches you can use to get structured outputs from your LLM applications. Please sign up here:
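
The idea of constrained decoding described above can be sketched with a toy example: a hand-written finite-state machine for the regular expression `[01][01]-[01]` masks the logits of every token that would leave the pattern. This is only an illustration and does not use the outlines API; `fake_logits` is a hypothetical stand-in for a language model, and the vocabulary and transition table are made up for the example.

```python
import math
import re

PATTERN = r"[01][01]-[01]"
VOCAB = ["0", "1", "a", "b", "-"]

# Finite-state machine for PATTERN, written by hand for this sketch.
# Maps state -> {allowed token: next state}; state 4 is the accepting state.
TRANSITIONS = {
    0: {"0": 1, "1": 1},
    1: {"0": 2, "1": 2},
    2: {"-": 3},
    3: {"0": 4, "1": 4},
}
ACCEPT_STATE = 4


def fake_logits(prefix):
    """Hypothetical model: fixed scores that always prefer the letter 'a'."""
    return [0.5, 1.0, 3.0, 2.0, 0.1]


def constrained_greedy_decode():
    """Greedy decoding where tokens the FSM disallows are masked to -inf."""
    state, output = 0, ""
    while state != ACCEPT_STATE:
        logits = fake_logits(output)
        allowed = TRANSITIONS[state]
        masked = [
            score if token in allowed else -math.inf
            for token, score in zip(VOCAB, logits)
        ]
        best = VOCAB[masked.index(max(masked))]
        output += best
        state = allowed[best]
    return output


first_logits = fake_logits("")
unconstrained_choice = VOCAB[first_logits.index(max(first_logits))]
result = constrained_greedy_decode()
matches_pattern = re.fullmatch(PATTERN, result) is not None
```

Without the mask, the toy model's top choice is a letter that can never appear in the pattern; with the mask, every step picks the highest-scoring token the state machine still allows, so the finished string is guaranteed to match.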

Andrew Ng

89,578 views • 1 year ago