Introducing DoctorGPT! After applying fine-tuning, reinforcement learning, & compilation techniques to Meta's Llama2 model, I got amazing results:
- Passes the US Medical Licensing Exam
- Offline
- iOS & Android
- Open Source

Code:
Full video tutorial:

Siraj Raval

61,596 subscribers

450,245 views • 2 years ago • via X (Twitter)

Education • Health & Wellness • Science & Technology

9 comments

Rowan Cheung • 2 years ago

Great work

Dwayne • 2 years ago

Finally, I can achieve my dreams of becoming a doctor and charging $300 for a band aid.

biccs👨🏽‍💻{3.LAND} • 2 years ago

Me: my head hurts DoctorGPT: you have cancer 💀

Abhinav Das • 2 years ago

A Doctor in my pocket!

Manish Patel • 2 years ago

Can't wait for @DrHughHarvey to see this 😬😱

Linus Ekenstam – eu/acc • 2 years ago

This is amazing work Siraj!!!

Lachlan Phillips exo/acc 👾 • 2 years ago

.@elonmusk @lindayaX @X The features that make YouTube the best are: 1. Full screen button makes the video horizontal mode. 2. Video plays as audio in the background so you can do other things and listen. Please implement these! Xx

Izzy Brooks • 2 years ago

@wagieeacc 👀

AJ Keller • 2 years ago

Giving me so many thoughts!

Related videos

Open-source SUNO is here! Introducing HeartMula: free & offline AI music generator Full tutorial:

⚡AI Search⚡

77,569 views • 5 months ago

Video tutorial of how to connect your 9 proxies !!! On iOS, Android & PC !!! 🪶

Street Update 247

145,671 views • 4 months ago

Many people are flat wrong about DeepSeek. You might think everyone freaked out about DeepSeek because the model is really good—which it is—or because it was from China, which is also true. But there's a more subtle reason: DeepSeek showed they could achieve those results using reinforcement learning instead of supervised fine-tuning. In other words, they didn't need an expensive phase where a bunch of people tuned the model by answering questions. They were able to do all of this faster and cheaper using reinforcement fine-tuning. This is a huge deal! I recorded the attached video with the following goals in mind: 1. Explain how reinforcement fine-tuning works 2. Show you how you can fine-tune your own models 3. Walk you through a complete code example

Santiago

105,545 views • 1 year ago

Introducing Cohere's first open-source coding model: North Mini Code Small & efficient, designed for agentic performance and built for community input.

Cohere

590,585 views • 20 days ago

Introducing Open TTS Tracker! 🗣️ *sound on* A one-stop shop to track all open access/source TTS models! Ranging from XTTS to Pheme, OpenVoice to VITS, and more... ⚡ For each model, we compile: 1. Source code 2. Checkpoints 3. License 4. Fine-tuning code 5. Languages supported 6. Paper 7. Demo Help us make it more complete! Let's make 2024 the year of open TTS models! ❤️

Vaibhav (VB) Srivastav

68,037 views • 2 years ago

YC S23's MediSearch just launched MediSearch Pro—your personal medical research assistant. It's currently the most accurate publicly available medical search engine, scoring 94.2% on the US Medical Licensing Examination (USMLE). Now on web, iOS, and Android.

Y Combinator

14,063 views • 1 year ago

It's time to become the legend. Tomb Raider is out now for iOS & Android! iOS: Android:

Tomb Raider

120,454 views • 4 months ago

🦎Seen our promo video? Impressed? Well, the game is out and growing! 🔥Check us out & enjoy your first SocialFi experience: iOS & Android: ♥️ Special thanks to @BNBChain, one of the fastest & most reliable blockchains, for their amazing ecosystem.

IguVerse 🦎 Public Beta Live

80,620 views • 3 years ago

Just added Qwen 2.5 32B Coder to LlamaCoder – it's an amazing open source coding model. Going to run some evals between it and Llama 3.1 405B & will share my results soon.

Hassan

52,565 views • 1 year ago

Introducing the Open Deep Research app! Generate detailed reports on any topic with open source LLMs. Free & fully open source. We’re releasing everything: evaluation dataset, code, app, and blog.🔥

Together AI

28,338 views • 1 year ago

New short course on Fine-tuning LLMs! Many developers are moving beyond only prompting, to also fine-tuning LLMs - that is, taking a pre-trained model and training it further on your own data, which can deliver superior results inexpensively. In this course, Sharon Zhou, CEO of Lamini (disclosure: I’m a minor shareholder), shows you how to recognize when fine-tuning can help, and how to train an open-source LLM on your own data. I hope you enjoy the course!

Andrew Ng

502,781 views • 2 years ago

Introducing the Send Mobile by SEND ecosystem 📱 An open-source React Native kit to build iOS and Android mobile apps on Solana in ~15 minutes Ft. 18+ protocol integrations 🧵

Solana

906,049 views • 1 year ago

New Course: Reinforcement Fine-Tuning LLMs with GRPO! Learn to use reinforcement learning to improve your LLM performance in this short course, built in collaboration with Predibase by Rubrik, and taught by Travis Addair, its Co-Founder and CTO, and Arnav Garg, its Senior Engineer and Machine Learning Lead.

Reasoning models have been one of the most important developments in LLMs. Reinforcement Fine-Tuning (RFT) uses rewards to encourage LLMs to find solutions to multi-step reasoning tasks such as solving math problems and debugging code - without needing pre-existing training examples like in traditional supervised fine-tuning.

Group Relative Policy Optimization (GRPO) is a reinforcement fine-tuning algorithm gaining rapid adoption. Developed by the DeepSeek team and used to train the R1 reasoning model, GRPO uses reward functions that you can write in Python to assign rewards to model responses. It’s beneficial for tasks with verifiable outcomes and can work well even with fewer than 100 training examples. It can also significantly improve the reasoning ability of smaller LLMs, making applications faster and more cost effective.

In this course, you’ll take a technical deep dive into RFT with GRPO. You’ll learn to build reward functions that you can use in the GRPO training process to guide an LLM toward better performance on multi-step reasoning tasks. In detail, you’ll:
- Learn when reinforcement fine-tuning is a better fit than supervised fine-tuning, especially for tasks involving multi-step reasoning or limited labeled data.
- Understand how GRPO uses programmable reward functions as a more scalable alternative to the human feedback required for other reinforcement learning algorithms, such as RLHF and DPO.
- Frame the Wordle game as a reinforcement fine-tuning problem and see how an LLM can learn to plan, analyze feedback, and improve its strategy over time.
- Design reward functions that power the reinforcement fine-tuning process.
- Learn techniques for evaluating more subjective tasks, such as rating the quality of a text summary, using an LLM as a judge.
- Understand why reward hacking happens and how to avoid it by adding penalty functions to discourage undesirable behaviors.
- Learn the four key components of the loss calculation in the GRPO algorithm: token probability distribution ratios, advantages, clipping, and KL-divergence.
- Launch reinforcement fine-tuning jobs using Predibase’s hosted training services.

By the end of this course, you’ll be able to build and fine-tune LLMs using reinforcement learning to improve reasoning without relying on large labeled datasets or subjective human feedback.

Please sign up here:

Andrew Ng

86,457 views • 1 year ago
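The course description above mentions reward functions written in Python and the "advantages" term of the GRPO loss. As a rough illustration only, here is a minimal sketch of both ideas. The `wordle_reward` scoring scheme (1.0 per correctly placed letter, 0.5 per misplaced letter) and the function names are hypothetical and are not taken from the course; the advantage step assumes the commonly described form of normalizing each reward by the mean and standard deviation of its sampled group. Ratios, clipping, and the KL term are not shown.

```python
from statistics import mean, pstdev


def wordle_reward(guess: str, secret: str) -> float:
    """Hypothetical reward in [0, 1] for a single Wordle guess."""
    if len(guess) != len(secret):
        return 0.0  # malformed guesses earn nothing
    score = 0.0
    remaining = list(secret)  # secret letters not yet matched
    # First pass: letters in the correct position (green).
    for g, s in zip(guess, secret):
        if g == s:
            score += 1.0
            remaining.remove(g)
    # Second pass: correct letters in the wrong position (yellow).
    for g, s in zip(guess, secret):
        if g != s and g in remaining:
            score += 0.5
            remaining.remove(g)
    return score / len(secret)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of sampled responses.

    Assumption: advantage = (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Score a group of candidate guesses for the same prompt, then compare
# each one against the group average.
guesses = ["crane", "abcde", "hi"]
rewards = [wordle_reward(g, "crane") for g in guesses]
advantages = group_relative_advantages(rewards)
```

Responses that score above the group average get a positive advantage and those below get a negative one, which is why no separately trained value model is needed in this formulation.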

reinforcement learning expanding the range of techniques of the Ultra Mobile Vehicle (UMV)

Science girl

19,399 views • 9 months ago

Fine-tuning in 2026 has never been easier. You can make any open-source model 10x more powerful. And thanks to Unsloth Studio, creating custom datasets takes just a few minutes. Here is the full course:

David Ondrej

35,742 views • 1 month ago

Fine-tune 100+ LLMs directly from a UI! LLaMA-Factory lets you train and fine-tune open-source LLMs and VLMs without writing any code. Supports 100+ models, multimodal fine-tuning, PPO, DPO, experiment tracking, and much more! 100% open-source with 50k stars!

Avi Chawla

557,355 views • 1 year ago

Fine-tune 100+ LLMs directly from a UI! LLaMA-Factory lets you train and fine-tune open-source LLMs and VLMs without writing any code. Supports 100+ models, multimodal fine-tuning, PPO, DPO, experiment tracking, and much more! 100% open-source, 51k+ stars 🌟

Akshay 🚀

60,163 views • 1 year ago

🚀 LTXV 0.9.5 is here! Our latest AI video model brings: ✅ Commercial licensing 🎬 Keyframe conditioning 🔍 Higher resolution & improved quality 📈 Longer sequences, fewer artifacts 🖥️ Native ComfyUI support Bringing open-source AI video creation to the next level.

Lightricks

31,740 views • 1 year ago

🚀 LTXV 0.9.5 is here! Our latest AI video model brings: ✅ Commercial licensing 🎬 Keyframe conditioning 🔍 Higher resolution & improved quality 📈 Longer sequences, fewer artifacts 🖥️ Native ComfyUI support Bringing open-source AI video creation to the next level.

LTX.io

65,581 views • 1 year ago

Super excited to launch a new AI course! 🚀 Fine-Tuning & Reinforcement Learning for LLMs: Intro to Post-Training A collaboration between AMD 🤝 Andrew Ng’s DeepLearning.AI to give every developer the tools & compute to work with the same post-training techniques, used across today’s leading AI labs. 🎓 Learn for free → 🧵

Sharon Zhou

20,386 views • 8 months ago