⚙️ 404—GEN has released the WORLD'S LARGEST open-source 3D model dataset ⚙️
• 21.5M+ AI-generated 3D models
• Open-source and attribution-ready
• Larger than all existing 3D datasets combined
This is an unmatched scale for Gaussian Splatting research. More details below 🧵 ⬇️
44,735 views • 1 year ago • via X (Twitter)
12 Comments

About the Synthetic Dataset 🔎
• This is the largest open-source dataset of its kind, comprising over 21.5 million high-fidelity 3D models. Each asset includes detailed metadata, usage rights, and ownership attribution.
• This dataset is larger than all existing 3D model datasets combined.
• Its scale is especially notable for the Gaussian Splatting research community, where typical datasets are a fraction of the size.

Data Scarcity ⚖️
• Synthetic datasets of this size are critical for improving AI models, which require massive amounts of high-quality data to train.
• In recent years, it has been predicted that we will run out of human-generated training data by 2030, as mentioned in this article by Epoch AI Research.
• "Synthetic" datasets, on the other hand, contain data created by other AI models. This data, especially when created at large scale, can be very useful in training new models.

Unlocking the Power of Decentralized Networks 🌐
• This milestone would be nearly impossible to achieve through a centralized system. 404—GEN’s rapid scale is a direct result of being part of the Bittensor network, where independent miners are rewarded based on output quality.
• This network design has enabled an unprecedented volume of synthetic 3D content to be generated, scored, and shared in record time, highlighting the production potential of decentralized intelligence and compute.

Access and Use Cases 🛠️
• One of the biggest challenges was finding a place large enough to house the dataset for public use. In its entirety, the dataset contains 40TB of data.
• Due to its size, a sample set is now publicly available on @huggingface at
• Access to the full dataset is available upon request at with priority given to projects advancing 3D research and development.
• The dataset unlocks immediate value for researchers, developers, and studios looking to accelerate their pipelines with high-quality 3D content.
For additional information, visit our website:
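
For a rough sense of scale, the two figures quoted in this thread (21.5 million models, 40TB in total) imply an average size per asset. The snippet below is only a back-of-envelope sketch: it assumes decimal terabytes (1 TB = 10^12 bytes) and treats 21.5 million as an exact count, although the post says "21.5M+", so the real average is probably slightly lower. The function name is illustrative and not part of any 404—GEN tooling.

```python
def average_asset_size_mb(total_tb: float, n_assets: float) -> float:
    """Average size per asset in megabytes, using decimal units (assumption)."""
    total_bytes = total_tb * 1e12      # assumed: 1 TB = 10^12 bytes
    return total_bytes / n_assets / 1e6  # 1 MB = 10^6 bytes


# Figures taken from the thread: 40TB total, 21.5 million models.
avg_mb = average_asset_size_mb(total_tb=40, n_assets=21.5e6)
print(f"about {avg_mb:.2f} MB per model")
```

Under these assumptions the average comes out a little under 2 MB per model.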

running bittensor

🏃‍♂️

🤯

This is just insane

3D generated models are better and better and better. Nice work guys

You don't need human feedback on those?

How does it translate to direct revenue ser? :)

