Loading video...

Video Failed to Load

Go Home

First look at SPARTA, a distributed AI training algorithm that avoids synchronization by randomly exchanging sparse sets of parameters ( 1,000x reduction in inter-GPU communication, enabling training of large models over slow bandwidths without specialized infrastructure. SPARTA works on its own but can also be combined with sync-based low...

99,301 views • 1 year ago •via X (Twitter)

0 Comments

No comments available

Comments from the original post will appear here

Related Videos