Introducing ESM Cambrian. Unsupervised learning can invert biology at scale to reveal the hidden structure of the natural world. We’ve scaled up compute and data to train a new generation of protein language models. ESM C defines a new state of the art for protein representation learning.
206,827 views • 1 year ago • via X (Twitter)
10 Comments

ESM C models establish a frontier of performance as a function of parameter scale. We see large improvements across all parameter scales over previous state-of-the-art models. Read more:

Information about protein structure in ESM C representations improves predictably with increasing training compute, demonstrating linear scaling across multiple orders of magnitude. (We overtrained the 300M and 600M models past the predicted point of compute optimality.)

ESM C comes with major performance and efficiency benefits over ESM2. The 300M parameter ESM C delivers similar performance to ESM2 650M. The 600M delivers similar performance to ESM2 3B and approaches the capabilities of the ESM2 15B, with far greater efficiency. The 6B parameter ESM C outperforms all ESM2 models by a wide margin.

Today we’re releasing ESM C 300M and 600M with open weights. ESM C 6B is available immediately on EvolutionaryScale Forge for academic use, and on AWS SageMaker for commercial use. ESM C will be on NVIDIA BioNeMo soon. We’re excited to see what you build with ESM!

cool work!

Thanks Nathan!

Why is it called Cambrian?

wouldn't want to spoil the mystery!

Congrats Alex and the evoscale team! Exciting

Thank you Surge!

