Introducing LifeGPT, showing that LLMs can simulate complex, Turing-complete systems like Conway's Game of Life with near-perfect accuracy—no prior topology needed.🌐This unlocks new potential for AI in modeling self-organizing systems in biology, materials science, & beyond.🔬🤖 #AI #LifeGPT. Cellular Automata (CA), like Conway's Game of Life ("Life"), are computationally...

114,174 views • 1 year ago • via X (Twitter)
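For reference, the update rule that the post says LifeGPT learns to reproduce is short enough to write out in full. The sketch below is a plain Python step function for Conway's Game of Life on a toroidal (wrap-around) grid, the boundary condition mentioned in the comments. The function name `life_step` and the 5x5 blinker pattern are illustrative choices and are not taken from the paper or its code.

```python
def life_step(grid):
    """Advance a 2D grid of 0/1 cells by one Game of Life step on a torus."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count the 8 neighbors, wrapping around the edges (toroidal grid).
            n = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # Birth on exactly 3 neighbors; survival on 2 or 3.
            new[r][c] = 1 if n == 3 or (grid[r][c] == 1 and n == 2) else 0
    return new


# A vertical "blinker": a period-2 oscillator used here as a sanity check.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
after_one = life_step(blinker)
after_two = life_step(after_one)
```

After one step the blinker lies horizontally, and after two steps it returns to its starting state. A learned model would be judged on how often its predicted next grid matches this kind of reference step.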

10 Comments

Markus J. Buehler • 1 year ago

Paper📰: Jaime Berkovich, Markus J. Buehler, LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata, 2024 Code: Weights 🤗:

Khlorghaal.so.1 🟦 • 1 year ago

the abstract had me really skeptical about its utility, but potential ability to infer rules to form a system is extremely useful. an immediate application would be lossy data compression

RNLG • 1 year ago

congrats, you've used a mill (a computer) to carve a spoon (an LLM) to carve a spoon (a program) to finally eat your soup with. And badly, at that.

snats • 1 year ago

i know this is most likely the case but could you in theory demonstrate with this that they are turing complete?

🐕🐠🦆 SOPHIE 🐌🦉🧸🎀🥇🧟‍♀️🐾☕️☃️💥🔭🧙‍♀️🔥⚡️✨ • 1 year ago

to claim conway's game of life is "complex" is an exaggeration. the resulting emergent structures are complex, but they exist on a meta level. the ruleset itself is extraordinarily simple and easy to simulate, which doesn't make an LLM that learns it all that impressive

John Shedletsky • 1 year ago

Why did you write a paper about this?

Niamato Inc • 1 year ago

Absolutely thrilled by the groundbreaking work by Professor @ProfBuehlerMIT and Prof Jaime Berkovich on LifeGPT! The model’s ability to simulate the Game of Life on a toroidal grid with such precision is a testament to your innovation and expertise. Excited to see where this leads in both AI and biological research!

Markus J. Buehler • 1 year ago

Thank you @Niamatomobility !

Mike Young • 1 year ago

We have a summary up on @aimodelsfyi here (lmk your feedback!)

James • 1 year ago

but that's like teaching an LLM to predict output of NAND gates based on input. sure you can compute that way but it's ultra inefficient. of course an LLM should be able to reason about such a thing, but if it needs to simulate something it should write code like the rest of us

Related Videos

Today we're announcing #GAIA1: a 9B parameter world model, trained on 4,700 hours of driving data, able to simulate complex and diverse driving scenes from video, text and action inputs. This model is 480x larger than the preview we shared earlier this year and the results are incredible. These videos are entirely synthetically generated by Wayve's generative AI, GAIA-1. But there is more here than just generating videos, GAIA is an entire world model. A world model allows us to simulate the future, conditioned on video, text and action inputs, which can be leveraged for making informed decisions when driving. Why is this game-changing for autonomous driving? 1. Safety. One limitation with AI systems like today's Large Language Models is that they are autoregressive, next-word prediction algorithms, but aren't necessarily aware of the implications of their decisions. A world model allows us to give our AI the capability to be aware of its decisions, by simulating the future, which is important for self-driving safety. 2. Synthetic training data. I believe synthetic training data is the future for AI, because it is safer, cheaper, and infinitely scalable. GAIA-1 unlocks unprecedented realism and diversity of synthetic data for self-driving. 3. Long-tail robustness. One of the biggest challenges for self-driving is long-tail robustness: dealing with the enormous magnitude of edge cases we see on the road. An advantage of generative AI is its incredible ability to recombine experiences in new ways. This is exciting for self-driving as it means we can learn from two edge case scenarios, and combine them to become a corner case. For example, we can experience driving in fog, and experience of jay-walking pedestrians, and GAIA can learn from these experiences to understand how to generate a fog+jay walking scenario. 
Check out many more videos in our blog or further technical details in our paper: Or come chat with our team who are at the International Conference on Computer Vision (#ICCV2023) this week in Paris in Booth 32 Jamie Shotton

Alex Kendall

631,833 views • 2 years ago

A transformer can learn not just the outcomes of dynamics, but the operator that executes the rules. To show this we trained a transformer on roughly 0.04% of a discrete rule space - 100 of 262,144 possible rules - and it learned to apply unseen rules from the same rule class. The model does not simply memorize specific rules. It learns the operator that maps a supplied rule plus an initial state, including unseen rules from this class, to the correct next state. This is relevant because it is a shift from “neural networks approximate dynamics” to “neural networks can learn to execute symbolic programs within a defined rule class”. The rule itself is supplied at inference time, as data, and the network has internalized how rules act, not which rules to apply. On previously unseen rules, the model achieves 98.5% perfect one-step forecasts and reconstructs governing rules with up to 96% functional accuracy. Two results make this hold up under scrutiny. First, inductive bias decay. As we scaled training rule diversity, the correlation between functional inference accuracy and distance-from-nearest-training-rule collapsed to R² = 0.00. At the largest tested training-rule diversity, the model’s performance on a new rule shows no measurable dependence on how similar that rule is to anything it was trained on. The bias toward training data (the thing we worry most about in compositional generalization claims) is something we can measure decaying, and we find that at scale it is gone. Second, an identifiability theory. We derive a closed-form expression for the number of rules consistent with a single observation. This reframes the inverse problem: failure to recover ground truth is not necessarily a model defect, but can be correct behavior when the data underdetermine the rule. The model is sampling the equivalence class; and identifiability is governed by coverage, not capacity. The methodological move underneath both results is amortization. 
Classical work on rule inference (e.g. the Santa Fe EVCA program, evolutionary search over CA rule space) was per-instance: search the rule space for each new system. We replace that with a single forward pass of a transformer trained across many instantiations of the rule class. That is what makes symbolic rule inference scalable as a research direction rather than a curiosity. We show that this works in a tightly constrained domain: binary, deterministic, local cellular automata on small grids. The locality-break experiment shows the model fails sharply when target systems violate its structural priors (which is itself a useful diagnostic, but it bounds the operator class). We don't yet know how this scales to multistate, higher-dimensional, or stochastic CA, or whether it transfers cleanly to non-CA systems whose coarse-grained dynamics admit local surrogates. The identifiability framework - what can be inferred from observation, given a hypothesis class - should transfer wherever finite local rules meet sparse data. The amortization argument transfers wherever per-instance symbolic search has been the bottleneck. Those are the pieces I expect to outlive the cellular automata setting. Led by Jaime Berkovich with Noah David, at LAMM@MIT. Out now in Advanced Science Advanced Portfolio News (link to paper & code below).

Markus J. Buehler

38,912 views • 1 month ago
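The figures in the post above can be made concrete under one loudly stated assumption: that the rule class is the family of "Life-like" rules, where a rule is a pair of sets (neighbor counts that cause birth, neighbor counts that allow survival). That family has 9 + 9 = 18 free bits, so 2^18 = 262,144 rules, which matches the count quoted in the post, but the post does not name the class, so treat this as a guess. The sketch shows a rule supplied as data at call time, and a simple coverage count of how many rules in the assumed class agree with one observed transition. The counting function is an illustration of "identifiability is governed by coverage", not the paper's closed-form expression, and all names below are hypothetical.

```python
RULE_BITS = 18                       # assumed encoding: 9 birth bits + 9 survival bits
TOTAL_RULES = 2 ** RULE_BITS         # 262,144 rules in the assumed class
TRAIN_FRACTION = 100 / TOTAL_RULES   # 100 training rules, roughly 0.04% of the class


def neighbor_count(grid, r, c):
    """Live cells among the 8 neighbors of (r, c), wrapping at the edges."""
    rows, cols = len(grid), len(grid[0])
    return sum(
        grid[(r + dr) % rows][(c + dc) % cols]
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )


def apply_rule(grid, birth, survive):
    """One step of a Life-like rule; the rule arrives as data (two sets)."""
    return [
        [
            1 if neighbor_count(grid, r, c) in (survive if cell else birth) else 0
            for c, cell in enumerate(row)
        ]
        for r, row in enumerate(grid)
    ]


def observed_cases(grid):
    """Distinct (cell state, live-neighbor count) pairs present in the grid."""
    return {
        (cell, neighbor_count(grid, r, c))
        for r, row in enumerate(grid)
        for c, cell in enumerate(row)
    }


def consistent_rule_count(grid):
    """Rules in the assumed class that agree with one (grid -> next grid) observation.

    Each observed case pins one bit of the rule table; unobserved bits stay free.
    """
    return 2 ** (RULE_BITS - len(observed_cases(grid)))


blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
next_state = apply_rule(blinker, birth={3}, survive={2, 3})  # Conway's Life is B3/S23
remaining = consistent_rule_count(blinker)
```

A sparse pattern such as the blinker exercises only a few of the 18 table entries, so many rules reproduce the same transition. Under this assumption, that is the sense in which failing to recover the exact ground-truth rule from one observation need not be a model defect.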

Tencent presents GameGen-O: Open-world Video Game Generation. We introduce GameGen-O, the first diffusion transformer model tailored for the generation of open-world video games. This model facilitates high-quality, open-domain generation by simulating a wide array of game engine features, such as innovative characters, dynamic environments, complex actions, and diverse events. Additionally, it provides interactive controllability, thus allowing for gameplay simulation. The development of GameGen-O involves a comprehensive data collection and processing effort from scratch. We collect and build the first Open-World Video Game Dataset (OGameData), amassing extensive data from over a hundred next-generation open-world games and employing a proprietary data pipeline for efficient sorting, scoring, filtering, and decoupled captioning. This robust and extensive OGameData forms the foundation of our model's training process. GameGen-O undergoes a two-stage training process, consisting of foundation model pretraining and instruction tuning. In the first phase, the model is pre-trained on OGameData via text-to-video and video continuation, endowing GameGen-O with the capability for open-domain video game generation. In the second phase, the pre-trained model is frozen and fine-tuned using a trainable InstructNet, which enables the production of subsequent frames based on multimodal structural instructions. This whole training process gives the model the ability to generate and interactively control content. In summary, GameGen-O represents a notable initial step forward in the realm of open-world video game generation via generative models. It underscores the potential of generative models to serve as an alternative to rendering techniques, which can efficiently combine creative generation with interactive capabilities.

AK

366,858 views • 1 year ago

Introducing BioCLIP: A Vision Foundation Model for the Tree of Life

A foundation model that strongly generalizes on the tree of life (2M+ species), outperforming OpenAI CLIP by 18% in zero-shot classification, and supports open-ended classification over almost the entire tree of life.

What are the secret ingredients?
> Data: we curate and release TreeOfLife-10M, the largest and most diverse ML-ready dataset of organism images to date. It contains 10.4M images for over 450K taxa, sourced from iNaturalist, BIOSCAN, and Encyclopedia of Life.
> Modeling: we creatively repurpose CLIP's multimodal contrastive learning objective for hierarchical image classification. The autoregressive language model naturally encodes the hierarchy of the tree of life taxonomy, which in turn bakes the hierarchical representation into the vision transformer encoder.

Key results
> Strong zero/few-shot classification for animals/plants/fungi, including rare species, outperforming CLIP by avg 16-18% absolute.
> t-SNE visualization shows that BioCLIP's vision encoder has captured the fine-grained hierarchical structure of the tree of life.
> BioCLIP is a kind of universal classifier for the tree of life. Just give it an organism image and it will likely find the correct species (among top 5)! But use it with caution; it's not perfect yet.

Final remarks
> AI for Science is really hard but extremely rewarding! It took us a ton of time (1+ year) and frustration trying to find a plausible way to integrate the tree of life taxonomy into foundation model training. But when the "Eureka!" moment came and the idea hit us (by the great Wei-Lun Chao) that CLIP's multimodal contrastive learning objective can be repurposed for that, everything just followed naturally. It was truly a moment of joy and excitement!
> BioCLIP is our first attempt at foundation models for biology, but it certainly won't be the last! There's so much more to do at the intersection of one of the oldest scientific disciplines and the young but thriving field of AI. Biological intelligence is the foundation for artificial intelligence, and artificial intelligence will in turn become the most important tool for us to unravel the mysteries of biological intelligence.

We are hiring postdocs and PhDs in the NSF Imageomics Institute to explore this exciting field! Drop us an email. Also happy to chat about it at #NeurIPS2023 with any of Tanya, Wei-Lun Chao, or me.

- paper:
- project:
- demo:
- model:
- data (TreeOfLife-10M): to be released on Hugging Face soon

Joint work with the amazing Imageomics Institute team: @samstevens6860 Lisa Wu, Matt Thompson, Elizabeth Campolongo Chan Hee (Luke) Song David Carlyn Li Dong Wasila Dahdul Chuck Stewart, Tanya Berger-Wolf Wei-Lun Chao Yu Su

Yu Su

80,595 views • 2 years ago

This is how DNA turns coded information into functional proteins - the building blocks of the nanomachines that keep the cells in your body alive. This complex process highlights the sophisticated interconnected systems of Life which must all exist together from the beginning, or Life doesn't happen. First, an RNA molecule is copied from a short segment of DNA. Without the specifically ordered DNA information, RNA cannot form, proteins cannot be built, cells stop working, and life ceases to exist. Life is information first. Once the RNA molecule is created, it gets ejected from the polymerase where it was built, and it travels through a complex molecular machine called a Nuclear Pore Complex (NPC), which is an information recognition device that controls the flow of information in and out of a cell's nucleus. The NPC is highly complex - composed of about 500-1,000 protein subunits, derived from a set of about 35 distinct proteins. Without this molecular machine, there is no regulation for what goes in and out of the cell's nucleus, which would lead to catastrophic death for the cell. It must exist for cells to exist. Once the RNA molecule passes through the NPC, it travels to the Ribosome, a 2-part chemical factory which reads the information on RNA and uses it to construct functional proteins using a specifically sequenced chain of amino acids. Once complete, this protein will then be sent to the section of the cell it belongs to, where it integrates into another molecular machine and does its job. The Ribosome is another highly complex molecular machine - consisting of between 56-80 proteins. Without this molecular machine, proteins cannot be built. Proteins are the building blocks of every cell in every organism on Earth. Without Ribosomes, Life doesn't exist. If you're paying attention, you'll start to realize that Life relies on a highly sophisticated interdependent network of complex machines, which all rely on each other for the function of the system. 
DNA requires the cell for stability, but the cell requires the proteins for its structure and function, but those proteins require DNA and RNA to be built - it's a circle of necessary interdependence. Systems like this cannot be built by evolutionary processes, which requires that each piece of the process is built by gradual incremental means over lots of time. Without all the pieces there, from the beginning, none of it works. There is only one known source of complex & interdependent informational systems like those we find in life: and that is from Intelligence. Molecular Biology is the best and most obvious evidence of the Intelligent Design in Life.

Divinely Designed

62,470 views • 5 months ago

#NewPaper The first microscope, invented in the 16th century, was designed to unlock the secrets of the microscopic world. Today, as many fields become increasingly data-driven, there is a pressing need for new types of microscopes---tools that help us zoom in, explore, and understand complex data. We call these tools "algorithmic microscopes." Introducing the Vendiscope: The first algorithmic microscope for data collections. 🔬 The Vendiscope maximizes the probability-weighted Vendi Score of a dataset to assign a weight to each element in the collection. This weight represents a data point's contribution to the overall diversity of the collection. These weights enable high-resolution data analysis at scale. We use them to zoom in on datasets across three domains: biology, materials science, & AI. 🧬 Biology: We used the Vendiscope on the protein universe, which contains nearly 250 million proteins. We found that nearly 200 million of the proteins are near-duplicates of each other and that AlphaFold fails on proteins that contribute most to the diversity of the protein universe. (See GIF below). 🪜 Materials Science: We used the Vendiscope on the Materials Project database, which contains 170K materials as of today. We found that 85% of crystals with formation energy data are near-duplicates of each other and that ML models for materials property prediction struggle with materials that contribute most to diversity. 🤖 Artificial Intelligence: We applied the Vendiscope to CIFAR-10, a benchmark dataset containing 50K images. We found duplicates. We applied the Vendiscope to analyze state-of-the-art generative models trained on this dataset. We found the best generative models memorize training data, as is known in the AI literature. However, we can do more with the Vendiscope and characterize the type of samples that get memorized. We found that data points contributing least to diversity are more prone to memorization by these generative models. 
🧠 "Our findings demonstrate that the Vendiscope can serve as a powerful tool for data-driven science, providing a systematic and scalable way to identify duplicates and outliers, as well as pinpointing samples prone to memorization and those that models may struggle to predict---even before training." 💫 "The Vendiscope provides a unified framework for analyzing complex data at scale. Researchers, engineers, and data auditors can use the Vendiscope to audit datasets, identify potential biases, and refine data collection practices. For AI ethicists, the Vendiscope offers a critical lens to understand how models interact with data, particularly in the context of bias, memorization, and data fairness, enabling better mitigation strategies to prevent undesirable outcomes in AI deployment. For scientists, the Vendiscope represents a new companion in the discovery process." #VendiScoring #AlgorithmicMicroscopy Link to paper: Authors: Amey Pasarkar (Amey Pasarkar) and Adji Bousso Dieng (@adjiboussodieng)

Vertaix® (AI & Science)

34,762 views • 1 year ago

Self-Evolving AI: New MIT AI Rewrites its Own Code and it’s Changing Everything | Julian Horsey, Geeky Gadgets

TL;DR Key Takeaways:
- MIT’s SEAL framework introduces “self-adapting language models” that autonomously enhance their capabilities by generating synthetic training data, self-editing, and updating internal parameters.
- SEAL’s self-adaptation process mirrors human learning, allowing continuous improvement and dynamic adaptation to new tasks without relying on external datasets.
- Reinforcement learning serves as a feedback mechanism in SEAL, rewarding effective self-edits and ensuring sustained progress and goal alignment.
- SEAL overcomes AI’s reliance on pre-existing datasets by generating its own training material, excelling in long-term task retention and complex problem-solving scenarios.
- Potential applications of SEAL include autonomous robotics, personalized education, and advanced problem-solving in fields like healthcare, logistics, and scientific research.

---

What if artificial intelligence could not only learn but also rewrite its own code to become smarter over time? This is no longer a futuristic fantasy: MIT’s new “self-adapting language models” (SEAL) framework has made it a reality. Unlike traditional AI systems that rely on external datasets and human intervention to improve, SEAL takes a bold leap forward by autonomously generating its own training data and refining its internal processes. In essence, this AI doesn’t just evolve; it rewires itself, mirroring the way humans adapt through trial, error, and self-reflection. The implications are staggering: a system that can independently enhance its capabilities could redefine the boundaries of what AI can achieve, from solving complex problems to adapting in real time to unforeseen challenges.

In this exploration by Wes Roth of MIT’s innovative SEAL framework, you’ll uncover how this self-improving AI works and why it’s a fantastic option for the field of artificial intelligence. From its ability to overcome the “data wall” that limits many current systems to its use of reinforcement learning as a feedback mechanism, SEAL introduces a level of autonomy and adaptability that was previously unimaginable. Imagine AI systems that can retain knowledge over time, dynamically adjust to new tasks, and operate with minimal human oversight. Whether you’re intrigued by its potential for autonomous robotics, personalized education, or advanced problem-solving, SEAL’s ability to rewrite its own rules promises to reshape the future of technology. Could this be the first step toward truly independent, self-evolving AI?

What Sets SEAL Apart?

The SEAL framework introduces a novel concept of self-adaptation, distinguishing it from traditional AI models. Unlike conventional systems that depend on external datasets for updates, SEAL enables AI to generate synthetic training data independently. This self-generated data is then used to iteratively refine the model, ensuring continuous improvement. By persistently updating its internal parameters, SEAL enables AI systems to dynamically adapt to new tasks and inputs.

To better illustrate this, consider how humans learn. When faced with a new concept, you might take notes, revisit them, and refine your understanding as you gather more information. SEAL mirrors this process by continuously refining its internal knowledge and performance through iterative self-improvement. This capability allows SEAL to evolve in real time, making it uniquely suited for tasks requiring adaptability and long-term learning.

The Role of Reinforcement Learning in SEAL

Reinforcement learning plays a critical role in the SEAL framework, acting as a feedback mechanism that evaluates the effectiveness of the model’s self-edits. It rewards changes that enhance performance, creating a cycle of continuous improvement. Over time, this feedback loop optimizes the system’s ability to generate and apply edits, ensuring sustained progress.

This process is analogous to how humans learn through trial and error. By rewarding effective changes, SEAL aligns its self-generated data and edits with desired outcomes. The integration of reinforcement learning not only enhances the system’s adaptability but also ensures it remains focused on achieving specific goals. This structured feedback mechanism is a cornerstone of SEAL’s ability to refine itself autonomously and efficiently.

Real-World Applications and Testing

SEAL has demonstrated remarkable performance across various applications, particularly in tasks requiring the integration of factual knowledge and advanced question-answering capabilities. For instance, when tested on benchmarks like the ARC AGI, SEAL outperformed other models by effectively generating and using synthetic data. This ability to create its own training material addresses a significant limitation of current AI systems: their reliance on pre-existing datasets.

SEAL’s capacity for long-term task retention and dynamic adaptation further enhances its utility. It excels in scenarios that demand sustained focus and coherence, such as answering complex questions or adapting to evolving objectives. By using its iterative learning process, SEAL is equipped to handle these challenges with exceptional efficiency, making it a valuable tool for a wide range of real-world applications.

Overcoming AI’s Data Limitations

One of SEAL’s most promising features is its ability to overcome the “data wall” that constrains many AI systems today. By generating synthetic data, SEAL ensures a continuous supply of training material, allowing sustained development without relying on external datasets. This capability is particularly valuable for autonomous AI systems that must operate independently over extended periods.

Additionally, SEAL addresses a critical weakness in many current AI models: their struggle with coherence and task retention over long durations. By emulating human learning processes, SEAL enables AI systems to manage complex, long-term tasks with minimal human intervention. This ability to retain and apply knowledge over time positions SEAL as a fantastic tool for advancing AI capabilities.

Potential Applications and Future Impact

The introduction of SEAL marks a significant milestone in AI research, opening new possibilities for self-improving systems. Its ability to dynamically adapt, retain knowledge, and generate its own training data has far-reaching implications for the future of AI development. Potential applications include:
- Autonomous robotics: Systems that can adapt to changing environments and perform tasks with minimal human oversight.
- Personalized education: AI-driven platforms that tailor learning experiences to individual needs and preferences.
- Advanced problem-solving: Applications in fields such as healthcare, logistics, and scientific research, where adaptability and precision are critical.

Read more:

Owen Gregorian

70,672 views • 11 months ago

Experiments in progress. The one on the right has been learning for ~3 hours, the one in the middle for ~1 hour, and the one on the left just started a few minutes ago. The initial motivation for making the physical Atari was just to commit ourselves to a subset of algorithms that can make progress in this setup. This commitment rules out algorithms that require billions of samples to learn (or worse, require multiple environments running in parallel). Atari games are simple enough that we should be able to show learning on them in a short amount of time with no prior knowledge. Since then, I've realized that this setup is also a good way to compare different paradigms in robotics in a principled way. These paradigms are sim2real, learning from tele-operated data, and learning directly on the robots. So far, I have observed that getting sim2real to work reliably is hard. It requires tweaks that don't scale. Policies that can play perfectly in simulation fall apart because of latencies and the messiness of the real world. These aspects could be modeled to improve the simulation, but not without sinking significant human engineering hours. I have higher hopes for learning from tele-operated data, but that requires a human to learn the task first. These experiments are on my to-do list. I have to learn to play some of the games well through the robot. I’m half-decent at playing Pong and Ms Pacman now. Learning directly on robots is looking like the most promising approach. This approach takes away pesky distribution shifts and makes it possible to have algorithms that continually improve with more data and time without any human intervention. It feels great to let experiments run overnight and wake up to find improved policies. With learning on robots, I should, in principle, be able to go on a long vacation and come back to find better policies for complex tasks beyond Atari games. Whether that is possible with current learning algorithms is a different question.

Khurram Javed

52,078 views • 6 months ago