Decentralized RL as fast as centralized RL for LLMs. Bittensor SN81 grail has shattered the bandwidth barrier with PULSE (Patch Updates via Lossless Sparse Encoding). "By identifying the 99% weight sparsity inherent in Adam-bounded updates, we've achieved a 100x reduction in weight synchronization - dropping 14GB transfers to just...
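
The idea of a lossless sparse patch can be sketched in a few lines. This is an illustration only, not the PULSE implementation: the function names `make_patch`, `apply_patch`, and `patch_nbytes` are hypothetical, the arrays are `float16` because NumPy has no bfloat16, and the naive index format (one `uint32` per changed weight) is an assumption. The post does not say how PULSE encodes indices, so the ratio printed here should not be compared with the 100x figure.

```python
import numpy as np


def make_patch(old: np.ndarray, new: np.ndarray):
    """Return (indices, values) covering every element whose bits changed.

    Assumes both arrays use a 2-byte dtype such as float16.
    """
    # Compare raw 16-bit patterns so the check is exact (no float tolerance).
    old_bits = np.ascontiguousarray(old).reshape(-1).view(np.uint16)
    new_flat = np.ascontiguousarray(new).reshape(-1)
    changed = np.flatnonzero(old_bits != new_flat.view(np.uint16))
    return changed.astype(np.uint32), new_flat[changed].copy()


def apply_patch(old: np.ndarray, indices: np.ndarray, values: np.ndarray):
    """Rebuild the new tensor exactly from the old tensor plus the patch."""
    out = np.ascontiguousarray(old).reshape(-1).copy()
    out[indices] = values
    return out.reshape(old.shape)


def patch_nbytes(indices: np.ndarray, values: np.ndarray) -> int:
    """Size of the naive patch: raw indices plus raw values."""
    return indices.nbytes + values.nbytes


# Hypothetical demo: one million weights, roughly 1% of them updated.
rng = np.random.default_rng(0)
old = rng.standard_normal(1_000_000).astype(np.float16)
new = old.copy()
touched = rng.choice(old.size, size=old.size // 100, replace=False)
new[touched] += np.float16(0.5)

idx, vals = make_patch(old, new)
restored = apply_patch(old, idx, vals)

# Lossless: the receiver ends up with bit-identical weights.
assert np.array_equal(restored.view(np.uint16), new.view(np.uint16))
print(f"changed: {idx.size} of {old.size}")
print(f"full: {new.nbytes} bytes, patch: {patch_nbytes(idx, vals)} bytes")
```

Because the patch stores exact new values rather than quantized deltas, applying it reproduces the sender's weights bit for bit; the saving comes entirely from skipping unchanged entries.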

Openτensor Foundaτion

173,778 subscribers

13,074 views • 2 months ago • via X (Twitter)

Similar Videos

Covenant Labs just did a 90-minute AMA breaking down their 3 Bittensor subnets. templar. basilica. grail. Pre-training, compute, and post-training under one roof. Most people missed it. Here's everything they said.

Covenant is building what they call the "end to end intelligence continuum." Three subnets. Three layers of the AI stack. All permissionless. Templar (SN3) handles decentralized pre-training. Basilica (SN39) handles compute. Grail (SN81) handles RL post-training.

Sam Dare, the lead, put it bluntly. Decentralized training is "humanity's last dance." Not about beating OpenAI head to head. About creating optionality. About making it cheap enough for anyone to train models. The gap between academia and frontier labs is growing exponentially. Researchers can't afford to experiment. The actual training run costs 5% of the reported budget. The other 95% is experimentation. If Covenant cracks cheap training, that entire surface area opens up.

On Templar specifically:
• Hit 39% emission on Bittensor. Highest since Apex was the only subnet on the network
• Covenant-72B trained permissionlessly with 70+ contributors on commodity internet
• 1.1 trillion tokens processed. No centralized data center
• Performance competitive with LLaMA-2-70B

On Grail, something flew under the radar. They built Pulse. A weight synchronization method that compresses model updates by 100x.
• In RL post-training, only ~1% of weights update per step
• Pulse exploits that sparsity. Lossless compression
• Prime Intellect's comparable system took 14 minutes to sync a 30B model
• Pulse makes decentralized RL training actually feasible at scale
• Already used by Cursor

The lead researcher on Grail said they've trained on math, code, and GPU kernels. Got 40-60% improvement on benchmarks. Working toward agentic training with 100K+ token context and 30B+ parameter models.

On Basilica, the compute subnet: The team was blunt. Just reselling GPU hours is a 5-10% margin game. Traditional compute providers already do that. Their play is value-added services.
• "GPU as code." No dashboard. No UI. Agents interact via SDK
• Custom scheduler that places workloads across heterogeneous hardware
• Verification checks for GPU, CPU, bandwidth, memory, storage, and OS security
• Partnerships with providers like Mass Compute for 10-20% below market pricing
• Miners compete on useful infrastructure, not just GPU hours

Sam then went on a rant about the miner burn debate. His take: Bittensor had to grow up. dTAO introduced investors. The old "miners are God" philosophy doesn't hold.
• Subnet owners have a duty to protect token value
• Miners are a resource optimization exercise, not a cost reduction exercise
• 100% miner emissions on compute subnets = immediate sell pressure
• The 41% miner allocation is arbitrary. Different business models need different splits
• Fish (who started burns) agreed. Burns usually mean the validation isn't mature enough

The bigger point. You can't police burns. Subnets just send to their own keys instead of the burn address. Subnet 28 does exactly that. Sam's position: judge subnets on outcomes, not process. Const has changed the protocol 9-10 times in 2 years. That iteration speed is Bittensor's actual moat.

The whole Covenant thesis is playing out in real time. TAO is up 100%+ in a month. Jensen Huang name-dropped the network. Grayscale has an ETF filing. But the real story is three subnets quietly building every layer of decentralized AI.

Jesus Martinez

26,642 views • 2 months ago

Centralized vs Decentralized RL Explained perfectly by our CEO Shashank | 𝔽rAI in under 60 seconds. A must watch 👇

Fraction AI

11,802 views • 7 months ago

$TAO Bittensor - Interview with Const. Hey Bittensor Community! I told you I would keep you informed as best as I could. I spoke with const a little while ago and asked him: Const, Covenant stepped out. That leaves three slots that need to be addressed somehow. Bittensor continues. Some noise has happened and more will probably happen. So I’d like to ask you: What are you going to be working on in the next few days as a priority? And would you like to leave a message for every alpha holder who believes in what is being built? Check his response in the video below. Share your thoughts. Your opinion matters here.

Tao Outsider

26,479 views • 2 months ago

Yan Liberman explains why Grass is the most underappreciated AI infrastructure play in the market: "GRASS made it. $12M in quarterly revenue as of November 2025. Sparse updates, that's why it's been overlooked." "Residential opt-in bandwidth network. Users contribute unused bandwidth through a browser extension. GRASS scrapes the internet with it."

The Rollup

71,022 views • 20 days ago

Hiring RL Engineer! Started off as a curious project at Lossfunk to push the boundaries of LLMs in social reasoning - we are now building RL environments, data, and benchmarks to simulate more real-world scenarios. If you want to train SoTA RL models on multi-GPU setups (H200s/B200s) to unlock the next AI frontier, this is for you.

Satpal Singh Rathore

45,915 views • 10 months ago

.const believes Bittensor will outpace centralized machine learning labs, becoming the go-to hub for efficient problem-solving in data collection, benchmarks, and AI training. As competitors emerge, $TAO aims to remain the Schelling point for digital commodity systems.

Grayscale

102,100 views • 1 year ago

There's no point in doing decentralized training without efficient communication.
>> DiLoCo (H=15) ships ~480 MB/merge with 163 syncs.
>> SparseLoCo (H=15) ships ~5.5–17 MB/merge at 0.78–3.12% density with 163 syncs
Top-K Compression + 2 bit comms ~28–89× smaller per sync than DiLoCo.

Subnet 3 :: Luis el grande

If you have the algorithm, you can train large language models across disparate compute, collectively. "In the space of eight months or nine months, we've been able to scale our model from 1.2B to 70B, which represents 58x improvement"

Distributed State Research paper :: Full Episode059 + const :: The holy grail of distributed AI training

SN3 :: Templar :: Luis el grande_ai
SN39 :: Basilica :: basilica
SN81 :: Grail :: grail
#SN3 #SN39 #SN81 #Bittensor
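
The per-sync ratio quoted in the post can be checked from its own numbers. The sketch below only does that arithmetic; the helper name `ratio` is hypothetical, and the inputs are the rounded figures from the post, so the upper bound comes out slightly under the quoted ~89× (presumably the post used unrounded sizes).

```python
def ratio(dense_mb: float, sparse_mb: float) -> float:
    """How many times smaller the sparse payload is than the dense one."""
    return dense_mb / sparse_mb


DILOCO_MB = 480.0          # ~480 MB per merge, from the post
SPARSE_LO_MB = 5.5         # smallest SparseLoCo payload, from the post
SPARSE_HI_MB = 17.0        # largest SparseLoCo payload, from the post
SYNCS = 163                # number of syncs, from the post

low = ratio(DILOCO_MB, SPARSE_HI_MB)    # about 28.2
high = ratio(DILOCO_MB, SPARSE_LO_MB)   # about 87.3

# Total traffic per participant over the whole run, in MB.
diloco_total = DILOCO_MB * SYNCS        # 78240.0
sparse_total_lo = SPARSE_LO_MB * SYNCS  # 896.5
sparse_total_hi = SPARSE_HI_MB * SYNCS  # 2771.0

print(f"per-sync reduction: {low:.1f}x to {high:.1f}x")
print(f"run total: DiLoCo {diloco_total:.0f} MB, "
      f"SparseLoCo {sparse_total_lo:.1f} to {sparse_total_hi:.0f} MB")
```

Over 163 syncs that is roughly 78 GB per participant for the dense scheme versus under 3 GB for the sparse one, which is the difference the post is pointing at.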

Openτensor Foundaτion

17,767 views • 9 months ago

We've got a heaping helping of fancy new updates in today's #MultiVersus patch! Get a taste of the big shiny updates in this video, then dig into the juicy deets and feast your eyes on our full patch notes here:

MultiVersus

376,709 views • 1 year ago

We developed an RL method for fine-tuning our models for precise tasks in just a few hours or even minutes. Instead of training the whole model, we add an “RL token” output to π-0.6, our latest model, which is used by a tiny actor and critic to learn quickly with RL.

Physical Intelligence

429,829 views • 3 months ago

Introducing Reinforcement-Learned Teachers (RLTs): Transforming how we teach LLMs to reason with reinforcement learning (RL). Blog: Paper: Traditional RL focuses on “learning to solve” challenging problems with expensive LLMs and constitutes a key step in making student AI systems ultimately acquire reasoning capabilities via distillation and cold-starting. Enter our RLTs—a new class of models prompted with not only a problem’s question but also its solution, and directly trained to generate clear, step-by-step “explanations” to teach their students. Remarkably, an RLT with only 7B parameters produces superior results when distilling and cold-starting students in competitive and graduate-level reasoning tasks than orders-of-magnitude larger LLMs. RLTs are as effective even when distilling 32B students, much larger than the teacher itself—unlocking a new standard for efficiency in developing reasoning language models with RL. Code:

Sakana AI

179,219 views • 1 year ago

🤖Adding new RL algorithms to LeRobot just got much easier.

Demo: HIL-SERL training with a SAC-based RL algorithm on an SO-100 for a hole-in-hand peg-in-hole task. Sparse reward, only 30 offline demos mixed with live robot experience, and ~1 hour of online training with human interventions only when the policy fails. The bottom graph tracks intervention rate: high at the start, steadily dropping as the policy improves.

The refactor separates algorithm logic from training infrastructure:
• RLAlgorithm owns learning logic
• RLTrainer handles orchestration
• DataMixer combines rollouts, demos, interventions, and future data sources

Adding an RL algorithm now looks much closer to adding a policy: one algorithm file, one config, one registry entry. SAC is first. RLT, RECAP, ConRFT, QC-FQL, DSRL, and VLA RL fine-tuning next! Thomas Wolf clem 🤗
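
The separation of roles described in the post can be illustrated with a toy sketch. This is not LeRobot's API: only the three role names come from the post, while every signature, the `ALGORITHMS` registry, the `register` decorator, and the `MeanRewardTracker` stand-in algorithm are hypothetical, chosen to show the shape of "one algorithm file, one registry entry".

```python
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

# Hypothetical registry: maps a config name to an algorithm class.
ALGORITHMS: dict = {}


def register(name: str):
    """Decorator that adds an algorithm class to the registry."""
    def deco(cls):
        ALGORITHMS[name] = cls
        return cls
    return deco


class RLAlgorithm(ABC):
    """Owns the learning logic: consumes a batch, returns metrics."""

    @abstractmethod
    def update(self, batch: list) -> dict: ...


@register("mean_reward")
class MeanRewardTracker(RLAlgorithm):
    """Stand-in 'algorithm' that only reports the batch's mean reward."""

    def __init__(self):
        self.steps = 0

    def update(self, batch: list) -> dict:
        self.steps += 1
        return {"mean_reward": sum(t["reward"] for t in batch) / len(batch)}


@dataclass
class DataMixer:
    """Samples transitions from several named sources by weight."""
    sources: dict
    weights: dict
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def sample(self, batch_size: int) -> list:
        names = list(self.sources)
        picks = self.rng.choices(
            names, weights=[self.weights[n] for n in names], k=batch_size
        )
        return [self.rng.choice(self.sources[n]) for n in picks]


class RLTrainer:
    """Handles orchestration only: draws batches and calls the algorithm."""

    def __init__(self, algorithm: RLAlgorithm, mixer: DataMixer, batch_size: int = 8):
        self.algorithm = algorithm
        self.mixer = mixer
        self.batch_size = batch_size

    def train(self, num_steps: int) -> list:
        return [
            self.algorithm.update(self.mixer.sample(self.batch_size))
            for _ in range(num_steps)
        ]


mixer = DataMixer(
    sources={"demos": [{"reward": 1.0}], "online": [{"reward": 0.0}]},
    weights={"demos": 0.5, "online": 0.5},
)
algo = ALGORITHMS["mean_reward"]()
logs = RLTrainer(algo, mixer, batch_size=4).train(num_steps=3)
print(logs)
```

In this layout the trainer never needs to know which algorithm it is running, so swapping in a different one means writing one new class and registering it under a new name.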

LeRobot

30,037 views • 1 month ago

Bittensor Roadmap: Major updates. Weight copying fixed with "Commit Reveal". See the developer doc. Dividends to subnet validators are now better correlated with subnet miners' performance, thanks to "Consensus-based weights". See a comprehensive blog: Plus, new “child” hotkeys are boosting security for subnets and validators. All of this plus more in the latest Novelty Search podcast:

Openτensor Foundaτion

44,162 views • 2 years ago

Scaling laws in deep RL? Turns out that the batch size, learning rate, and UTD (update-to-data) ratio needed for the most efficient and scalable deep RL have predictable relationships. Check out the analysis in new work by Oleg Rybkin & collaborators:

Sergey Levine

43,464 views • 1 year ago

Excited to present FastTD3: a simple, fast, and capable off-policy RL algorithm for humanoid control -- with open-source code to run your own humanoid RL experiments in no time! Thread below 🧵

Younggyo Seo

130,935 views • 1 year ago

Guardian Tensor Links – Sneak Peek 1 We are thrilled to kick off our Guardian Tensor Links series with the first of many exciting updates. This series will showcase how Guardian devices seamlessly connect, exchange TAO tokens, and share AI models across the Bittensor network. In this update, we have successfully activated two Guardian devices as Bittensor nodes and conducted a TAO token transfer from the first to the second device. The transaction was completed successfully, and we verified it on Taostats, ensuring smooth functionality across the Bittensor network. In our next update, we’ll take things even further, demonstrating how multiple Guardian devices can work together, forming a decentralized network to collaborate and share models. This will highlight the true potential of decentralized AI learning and connectivity within the Bittensor ecosystem. Stay tuned for more exciting updates as we continue exploring the power of Guardian devices within the Bittensor network!

ZKCrypt AI

16,503 views • 1 year ago

🚨 A conspiracy is circulating that Ilia Topuria MISSED WEIGHT by half a pound for #UFC317 The commissioner read Topuria’s weight as 155.5lbs, reconfirmed the weight, and after being told he had to be 155lbs, says he misspoke and announced the weight as 155lbs. Mistake or…🤔

Combat Casuals

1,961,953 views • 11 months ago

.Shorebird is now in open beta with code push for Flutter! 🎉🐦 ⚡️ push updates to devices instantly 🔎 updates are diffed for small patch sizes ✨ updates are installed in the background 👀 learn more: 🤝 discord:

Felix Angelov 💙

59,462 views • 3 years ago

Danil Donchenko setting up his leg kicks with head movement, his back and forth weight transfers disguising the kick as he shifts weight over his lead foot

MixingMartialArts

18,767 views • 4 months ago

RL GRIME dropping his unreleased collab with WINK at EDC. Need this ID RL GRIME

Brownies & Lemonade🍫🍋

16,308 views • 1 year ago