Distributed training on an M4 Mac Mini cluster

We implemented Google DeepMind's DiLoCo on Apple Silicon to train large models using 100-1000x less bandwidth than a DDP baseline. AI is entering a new era in which a distributed network of consumer devices can train large models.
347,655 views • 1 year ago • via X (Twitter)
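
A rough way to see where a 100-1000x bandwidth figure could come from: a DDP baseline synchronizes gradients on every optimizer step, while DiLoCo-style training lets each worker take many local steps between synchronizations. The sketch below only counts synchronization rounds. It assumes each round moves about one model-sized payload in both schemes, and the function names and step counts are hypothetical illustrations, not values taken from the post.

```python
def ddp_sync_count(total_steps: int) -> int:
    """Sync rounds for a DDP-style baseline: one gradient all-reduce per step."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return total_steps


def diloco_sync_count(total_steps: int, inner_steps: int) -> int:
    """Sync rounds when workers synchronize once every `inner_steps` local steps."""
    if inner_steps <= 0 or inner_steps > total_steps:
        raise ValueError("inner_steps must be in the range 1..total_steps")
    return total_steps // inner_steps


def reduction_factor(total_steps: int, inner_steps: int) -> float:
    """Ratio of baseline sync rounds to DiLoCo-style sync rounds.

    Assumption: every sync round transfers roughly the same number of bytes,
    so the ratio of rounds approximates the ratio of bandwidth used.
    """
    return ddp_sync_count(total_steps) / diloco_sync_count(total_steps, inner_steps)


if __name__ == "__main__":
    # Hypothetical run length and sync intervals, chosen only for illustration.
    for inner in (100, 500, 1000):
        print(inner, reduction_factor(100_000, inner))
```

Under these assumptions the reduction factor is simply the number of local steps between synchronizations, so intervals of 100 to 1000 steps line up with the range quoted in the post.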
