Announcing INTELLECT-1: the first-ever decentralized training of a 10B model. Scaling decentralized training 10x beyond prior efforts. Anyone can join us to build open-source AGI 🦋
790,053 views • 1 year ago • via X (Twitter)
Comments: 10

We are grateful to our launch partners contributing compute: @huggingface @SemiAnalysis_ @arcee_ai @hyperbolic_labs @autonolas @akashnet_ @SchellingAI and many others 🤍

Anyone can contribute compute to advance open-source AI through our platform and later on also with their own hardware.

Built on Prime: Our new decentralized training framework that improves and scales DiLoCo up 25X:
• Fault-Tolerant Training via new ElasticDeviceMesh abstraction
• Optimized Communication: Reduces synchronization times by up to 1000-2000x vs centralized training.
• Improving bandwidth utilization by 40x compared to our OpenDiLoCo release
• High Compute Utilization: 98% compute utilization at 10B scale
• Custom Int8 All-Reduce Kernels
• Live checkpoint recovery
Github:
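
As a rough illustration of the DiLoCo idea that Prime builds on (workers train locally for many steps and only occasionally synchronize), here is a minimal Python sketch. It is not Prime's code: the function names, the toy quadratic loss, the plain-SGD inner and outer optimizers, and the simple per-tensor int8 quantization are all assumptions made for illustration, and the all-reduce is simulated by a plain average inside one process.

```python
import random


def quantize_int8(values):
    """Symmetric per-tensor int8 quantization (illustrative scheme, not Prime's kernel)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    ints = [max(-127, min(127, round(v / scale))) for v in values]
    return ints, scale


def dequantize_int8(ints, scale):
    """Map int8 codes back to floats using the shared scale."""
    return [i * scale for i in ints]


def local_sgd(params, target, steps, lr, rng):
    """Inner loop: noisy SGD on the toy loss 0.5 * ||p - target||^2, no communication."""
    p = list(params)
    for _ in range(steps):
        grads = [(pi - ti) + rng.gauss(0.0, 0.01) for pi, ti in zip(p, target)]
        p = [pi - lr * g for pi, g in zip(p, grads)]
    return p


def diloco_round(global_params, targets, inner_steps, inner_lr, outer_lr, rng):
    """One outer round: local training, int8-compressed pseudo-gradients, averaging."""
    received = []
    for target in targets:  # each target stands in for one worker's data shard
        local = local_sgd(global_params, target, inner_steps, inner_lr, rng)
        # Pseudo-gradient: how far this worker moved away from the shared weights.
        pseudo_grad = [g - l for g, l in zip(global_params, local)]
        ints, scale = quantize_int8(pseudo_grad)  # what would go over the wire
        received.append(dequantize_int8(ints, scale))
    # Simulated all-reduce: element-wise mean across workers.
    avg = [sum(col) / len(received) for col in zip(*received)]
    # Outer step with plain SGD (a simplification chosen for brevity).
    return [g - outer_lr * a for g, a in zip(global_params, avg)]


def run_demo(rounds=20, seed=0):
    """Run a few outer rounds with two hypothetical workers and return the weights."""
    rng = random.Random(seed)
    targets = [[1.0, -2.0, 0.5], [1.2, -1.8, 0.3]]
    params = [0.0, 0.0, 0.0]
    for _ in range(rounds):
        params = diloco_round(params, targets, inner_steps=50,
                              inner_lr=0.1, outer_lr=1.0, rng=rng)
    return params


if __name__ == "__main__":
    print(run_demo())
```

In this toy setup the workers exchange data once every 50 local steps rather than on every step, and each exchanged value travels as an int8 code plus one shared scale instead of a 32-bit float. The real framework's kernels, device mesh, and fault-tolerance machinery are not modeled here.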

INTELLECT-1 will be fully open source, incl. the training framework and dataset.
Model specs:
• 10B parameters, 6T+ tokens dataset
• Llama architecture and tokenizer
• Dataset mix: Fineweb-edu, DCLM, Stack v2, OpenWebMath

Why it matters: Open-source AI is crucial to mitigate centralization risks and is one of the biggest public goods. We need to coordinate compute, talent, and capital to compete with closed-source labs. The longer-term goal: scale to open-source AGI models, continuously improving upon the best open-source models in the world.

Shoutout to @samsja19, @jackminong, and @johannes_hage for their work on the decentralized training research. @manveerxyz, @jannik_stra, and @burnpiro for their work on the decentralized training platform. @eliebakouch for his help with composing the dataset. @Ar_Douillard et al. for their work on DiLoCo, and many PyTorch contributors for contributing valuable input.

Let's build open source AGI together. Join the decentralized training run: Apply: Discord:

Blog post with more info

Update: We did it — the first decentralized training of our 10B model is complete!

refreshing and liberating, and it's what tech should feel like
