Loading video...

Video Failed to Load

Go Home

I hand-wrote a 500-LoC RL stack to make hacking on RL research much easier. Most RL stacks are either massive and unhackable, or duct-taped research scripts. I am open-sourcing Mithrl, a modular RLVR stack. Next items on my checklist: adding more complex environment examples, supporting multi-gpu + async RL,...

17,218 views • 3 months ago •via X (Twitter)

0 Comments

No comments available

Comments from the original post will appear here

Related Videos

🚨 RL for LLMs is finally accessible. Introducing OpenTinker: The first community-driven, open-source framework designed to democratize Reinforcement Learning for LLMs. Inspired by Thinking Machines's amazing Tinker, we realize the biggest bottleneck in agentic LLM research isn’t the math—it’s the setup. Current RL pipelines are messy. Configuring VeRL for every single experiment is a productivity killer. OpenTinker fixed it. 🛠 How OpenTinker Works: Decoupled Design of Server and Client - Setup Once, Run Forever: Configure the OpenTinker backend on your GPU cluster once. - Develop Locally: Define your RL environments directly on your laptop. - Train on the Cloud: Simply point your local client to the backend. The cluster handles the compute; you handle the science. 📉 The 10x Development Efficiency Thanks to our elegant architectural decomposition, OpenTinker reduces the time to develop a new RL training pipeline by at least an order of magnitude. ⚡ Turn Idle GPU Compute into Gold Small labs often have underutilized hardware. OpenTinker turns your idle GPUs into an internal/external API service for - RL Training - SFT - Inference 🎯 Who needs OpenTinker? - Researchers tired of infrastructure hell. - Labs needing to standardize workflows. - Teams wanting to maximize hardware ROI. Thanks my amazing PhD student Siqi Zhu for leading the project. We are building the future of open RL infra. Be the first to build with us. 👇 Start Building with OpenTinker Now 🚀 Repo: 🌐 Blog: If you believe RL should be accessible to everyone, give us a star, repost this 🔄 post, and let us know what agents you plan to build!

Jiaxuan You

58,120 views • 6 months ago