From Jeff Dean on the Dwarkesh Patel podcast: "asynchronous training where each copy of the model does local computation [...] it makes people uncomfortable [...] but it actually works". Yep, I can confirm, it does work for real.
101,448 views • 1 year ago • via X (Twitter)
10 comments

@dwarkesh_sp Some people actually came to me in SF and told me "but DiLoCo is actually working!", very surprised that it wasn't just another paper misleading with outlandish claims. Learn more:


@JeffDean @dwarkesh_sp Does this diminish the value of ultra-high-bandwidth, low-latency interconnects like InfiniBand, or are they still important?

@JeffDean @dwarkesh_sp Methods like DiLoCo add a new axis. In our published experiments, and also in @PrimeIntellect's Intellect-1 and @flwrlabs's Photon, you have multiple levels of parallelism. In order of required bandwidth/latency: tensor parallelism > (FS)DP > DiLoCo
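
For readers who have not seen the method, here is a minimal sketch of why DiLoCo sits at the low-bandwidth end of that ordering: each worker runs many local steps, and workers communicate only once per outer round. Everything in it is an illustrative assumption rather than the published setup: the toy quadratic loss, the function name `diloco_sketch`, the hyperparameter values, and the use of plain SGD for the inner steps (the DiLoCo paper describes AdamW as the inner optimizer and Nesterov momentum as the outer one).

```python
import numpy as np

def diloco_sketch(num_workers=4, outer_rounds=50, inner_steps=10,
                  inner_lr=0.05, outer_lr=0.7, momentum=0.9, seed=0):
    """Toy DiLoCo-style loop; all names and values here are hypothetical."""
    rng = np.random.default_rng(seed)
    dim = 8
    # Assumed toy problem: worker k minimizes 0.5 * ||theta - targets[k]||^2,
    # standing in for "each worker trains on its own data shard".
    targets = rng.normal(size=(num_workers, dim))
    theta = np.zeros(dim)      # shared (global) parameters
    velocity = np.zeros(dim)   # outer-optimizer momentum buffer

    for _ in range(outer_rounds):
        deltas = []
        for k in range(num_workers):
            local = theta.copy()              # start from the shared copy
            for _ in range(inner_steps):      # local steps, no communication
                grad = local - targets[k]
                local -= inner_lr * grad      # plain SGD (simplification)
            deltas.append(theta - local)      # this worker's "outer gradient"

        # The only communication: average the deltas once per outer round.
        outer_grad = np.mean(deltas, axis=0)

        # Outer step: SGD with Nesterov momentum on the averaged delta.
        velocity = momentum * velocity + outer_grad
        theta -= outer_lr * (outer_grad + momentum * velocity)

    # Return the result and the minimizer of the averaged toy objective.
    return theta, targets.mean(axis=0)

theta, optimum = diloco_sketch()
print(np.max(np.abs(theta - optimum)))  # small if the loop converged
```

With `inner_steps=10`, the workers exchange parameters ten times less often than a data-parallel setup that synchronizes gradients every step, which is the bandwidth trade-off the ordering above refers to.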

@JeffDean @dwarkesh_sp diloco is the nightmare of people who think we can just ban technology to prevent danger

@JeffDean @dwarkesh_sp How far is this from hogwild? I'm a bit out of the loop on the latest

@JeffDean @dwarkesh_sp Omg I did this! And I thought I was so clever, but of course Jeff's team already did it a decade ago 😂

@JeffDean @dwarkesh_sp @Ar_Douillard, I'm curious about your thoughts on optimizing the longevity of modules or sub-networks, such that they may still be useful in many future/downstream models. Will tomorrow's large open source models be composed of recycled parts?

@JeffDean @dwarkesh_sp wild

@JeffDean @dwarkesh_sp If Jeff says it works, it just does. Even physics can't change that
