"Will humanity ever do a 10 million GPU pre-training run?" OpenAI CEO Sam Altman raises the question. An OpenAI employee answers: there will be 10M GPUs working together on an AI system that learns and performs tasks. However, the approach may shift from fully synchronous pre-training to "semi-synchronous" or more decentralized methods.

125,411 views • 1 year ago • via X (Twitter)
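
The post does not define "semi-synchronous". One possible reading, and it is only an assumption here, is local SGD with periodic averaging: each worker takes several optimizer steps on its own data without communicating, and the replicas are averaged only every few steps. The toy sketch below fits a single scalar on three data shards; the function name `local_sgd` and all parameters are hypothetical and chosen for illustration.

```python
def local_sgd(shards, rounds=20, local_steps=5, lr=0.1):
    """Toy local SGD: fit a scalar w to minimize 0.5 * (w - x)^2 over all data.

    Assumption: "semi-synchronous" is modeled as periodic parameter averaging.
    Each shard stands in for one worker; workers only sync once per round.
    """
    w_global = 0.0
    for _ in range(rounds):
        replicas = []
        for shard in shards:
            w = w_global  # every worker starts the round from the shared weights
            for _ in range(local_steps):
                # Gradient of the mean of 0.5 * (w - x)^2 over this worker's shard.
                grad = sum(w - x for x in shard) / len(shard)
                w -= lr * grad
            replicas.append(w)
        # The only synchronization point: average the worker replicas.
        w_global = sum(replicas) / len(replicas)
    return w_global


if __name__ == "__main__":
    shards = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(local_sgd(shards))
```

With `local_steps=1` the averaging step is equivalent to averaging gradients on every step, which is the fully synchronous case; larger values trade communication for some drift between replicas.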

11 comments

alby13 • 1 year ago

Don't forget that the transformer architecture won't be around forever. You've got to think macro, on a longer timeline.

NICE • 1 year ago

Stay competitive by balancing cutting-edge AI with automation tools. Forrester shows how.

prabhu💢 • 1 year ago

That's a massive scale. It's exciting and kind of scary at the same time.

XR Multiverse • 1 year ago

ChatGPT has been around for 3 years and it's still not able to return information without omitting half of it. Both OpenAI and Anthropic are horrible at the most important part of AI.

Ramón Guillamón • 1 year ago

Using distributed computing, e.g. using smartphone CPUs - 8000/14000M

Bhaktavaschal Samal • 1 year ago

A 10-million-GPU pre-training run is theoretically plausible but faces significant technical, economic, and ethical hurdles. The potential benefits of such a system are immense; they must, however, be weighed against the risks and resource costs involved. The shift to asynchronous pre-training is not just plausible but necessary for scaling to 10 million GPUs, and it reflects a broader evolution in AI infrastructure, one that prioritizes resilience, efficiency, and decentralization over rigid synchronization. This transition, however, requires solving novel challenges in optimization, system design, and governance. If successful, it could enable unprecedented AI capabilities while paving the way for more sustainable and democratized AI ecosystems.

Jeramie Baker • 1 year ago

Could be done more effectively with my triadic-designed systems; in theory and in sims, HUGE gains!

JayBird • 1 year ago

With a big enough nuclear reactor ☢️

Matthias Heger - AI acc ⏩ • 1 year ago

Probably yes, but GPUs themselves will scale.

ぽいロード@音技術者 • 1 year ago

twink

Xyber Man • 1 year ago

GPUs are outdated and not suited for AI. Google with their TPUs will lead.

Related videos

New Course: Post-training of LLMs

Learn to post-train and customize an LLM in this short course, taught by Banghua Zhu, Assistant Professor at the University of Washington and co-founder of @NexusflowX.

Training an LLM to follow instructions or answer questions has two key stages: pre-training and post-training. In pre-training, it learns to predict the next word or token from large amounts of unlabeled text. In post-training, it learns useful behaviors such as following instructions, tool use, and reasoning.

Post-training transforms a general-purpose token predictor (trained on trillions of unlabeled text tokens) into an assistant that follows instructions and performs specific tasks. Because it is much cheaper than pre-training, it is practical for many more teams to incorporate post-training methods into their workflows than pre-training.

In this course, you'll learn three common post-training methods, Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Online Reinforcement Learning (RL), and how to use each one effectively. With SFT, you train the model on pairs of input and ideal output responses. With DPO, you provide both a preferred (chosen) and a less preferred (rejected) response and train the model to favor the preferred output. With RL, the model generates an output, receives a reward score based on human or automated feedback, and updates the model to improve performance.

You'll learn the basic concepts, common use cases, and principles for curating high-quality data for effective training. Through hands-on labs, you'll download a pre-trained model from Hugging Face and post-train it using SFT, DPO, and RL to see how each technique shapes model behavior.

In detail, you'll:
- Understand what post-training is, when to use it, and how it differs from pre-training.
- Build an SFT pipeline to turn a base model into an instruct model.
- Explore how DPO reshapes behavior by minimizing contrastive loss, penalizing poor responses and reinforcing preferred ones.
- Implement a DPO pipeline to change the identity of a chat assistant.
- Learn online RL methods such as Proximal Policy Optimization (PPO) and Group Relative Policy Optimization (GRPO), and how to design reward functions.
- Train a model with GRPO to improve its math capabilities using a verifiable reward.

Post-training is one of the most rapidly developing areas of LLM training. Whether you're building a high-accuracy context-specific assistant, fine-tuning a model's tone, or improving task-specific accuracy, this course will give you experience with the most important techniques shaping how LLMs are post-trained today.

Please sign up here:

Andrew Ng

125,146 views • 11 months ago
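
The course announcement above describes DPO as training the model to favor a chosen response over a rejected one. As a rough illustration, here is a minimal sketch of the commonly cited DPO loss for a single preference pair, written in plain Python. It is not taken from the course labs; the function names, argument names, and the default `beta` are assumptions made for this example. The inputs are summed log-probabilities of each response under the model being trained (the policy) and under a frozen reference model.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    All argument names are hypothetical. Each *_logp is the total
    log-probability a model assigns to the full response.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # Positive when the policy has shifted toward the chosen response.
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) written as softplus(-logits).
    return softplus(-logits)


if __name__ == "__main__":
    # Policy identical to the reference: the loss equals log(2).
    print(dpo_loss(-12.0, -15.0, -12.0, -15.0))
    # Policy favors the chosen response more than the reference does: lower loss.
    print(dpo_loss(-10.0, -18.0, -12.0, -15.0))
```

The loss falls as the policy raises the chosen response's log-probability relative to the reference and lowers the rejected one's; `beta` controls how strongly deviations from the reference are weighted.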