正在加载视频...
视频加载失败
First look at SPARTA, a distributed AI training algorithm that avoids synchronization by randomly exchanging sparse sets of parameters ( 1,000x reduction in inter-GPU communication, enabling training of large models over slow bandwidths without specialized infrastructure. SPARTA works on its own but can also be combined with sync-based low... show more
99,289 次观看 • 1 年前 •via X (Twitter)
0 条评论
暂无评论
原始帖子的评论将显示在这里

