Introducing Digital Red Queen (DRQ): Adversarial Program Evolution in Core War with LLMs Blog: Core War is a programming game where self-replicating assembly programs, called warriors, compete for control of a virtual machine. In this dynamic environment, where there is no distinction between code and data, warriors must crash...

143,361 views • 4 months ago • via X (Twitter)

Related videos

Survival of the fittest code. Core War (1984) is a game where programs must crash their opponents to survive. Warriors written in an assembly language called Redcode fight for control of a virtual machine.

Our new paper, Digital Red Queen: Adversarial Program Evolution in Core War with LLMs, explores what happens when LLMs drive an adversarial evolutionary arms race in this domain. We task LLMs with writing warrior programs in Redcode that must out-compete a virtual world full of such programs. Core War is a Turing-complete environment where code and data share the same address space, which leads to some very chaotic self-modifying code dynamics.

This approach is inspired by the Red Queen hypothesis in evolutionary biology: the principle that species must continually adapt and evolve simply to survive against ever-changing competitors. In our work, programs continuously adapt to defeat a growing history of opponents rather than a static benchmark.

We find that this adversarial process leads to the emergence of increasingly general strategies, including targeted self-replication, data bombing, and massive multithreading. Most intriguingly, it reveals a form of convergent evolution: different code implementations settle into similar high-performing behaviors, mirroring how biological agents independently evolve similar traits to solve the same problems.

I think this work positions Core War as a sandbox for studying Red Queen dynamics in artificial systems. It offers a safe, controlled environment for analyzing how AI agents might evolve in real-world adversarial settings such as cybersecurity. By simulating these adversarial dynamics in an isolated sandbox, we offer a glimpse into a future where deployed LLM systems may start competing against one another for limited resources in the real world.
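As a rough illustration of "adapting to a growing history of opponents rather than a static benchmark", here is a toy sketch in Python. It is not Core War and not the paper's algorithm: the three-field `Warrior`, the `battle` rule, and the random `mutate` step (standing in for an LLM proposing a rewrite) are all hypothetical stand-ins chosen only to keep the loop runnable.

```python
import random
from typing import List, Tuple

# Hypothetical stand-in for a Redcode warrior: 12 units split over 3 fields.
Warrior = Tuple[int, int, int]


def battle(a: Warrior, b: Warrior) -> int:
    """Return 1 if `a` wins more fields than `b`, else 0 (toy rule, not Core War)."""
    a_fields = sum(x > y for x, y in zip(a, b))
    b_fields = sum(y > x for x, y in zip(a, b))
    return 1 if a_fields > b_fields else 0


def mutate(w: Warrior, rng: random.Random) -> Warrior:
    """Move a few units between fields (stand-in for an LLM-proposed rewrite)."""
    fields = list(w)
    for _ in range(rng.randint(1, 4)):
        src, dst = rng.sample(range(3), 2)
        if fields[src] > 0:
            fields[src] -= 1
            fields[dst] += 1
    return (fields[0], fields[1], fields[2])


def score(w: Warrior, history: List[Warrior]) -> int:
    """Number of past champions that `w` defeats."""
    return sum(battle(w, old) for old in history)


def red_queen(rounds: int, proposals: int = 20, seed: int = 0) -> List[Warrior]:
    """Each round, keep the candidate that beats the most past champions."""
    rng = random.Random(seed)
    history: List[Warrior] = [(4, 4, 4)]
    for _ in range(rounds):
        candidates = [mutate(history[-1], rng) for _ in range(proposals)]
        champion = max(candidates, key=lambda w: score(w, history))
        history.append(champion)  # the set of opponents to beat keeps growing
    return history


if __name__ == "__main__":
    for generation, warrior in enumerate(red_queen(rounds=5)):
        print(generation, warrior)
```

The point of the sketch is only the shape of the loop: each new champion is scored against every earlier champion, so the target moves as the history grows.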

hardmaru

173,423 views • 4 months ago

*New Paper on AI & Democracy*

Imagine two approaches to democracy: the one we have today, where citizens choose a professional politician to represent them and others, or an augmented form of democracy, where each citizen controls a personalized AI that helps them participate in thousands of nuanced decisions. This second approach is the idea of Augmented Democracy I introduced six years ago at TED.

In our latest paper we explore a simplified version of Augmented Democracy by combining off-the-shelf LLMs, such as ChatGPT, with data collected using a collaborative government program builder. This was an online game where people built a personalized government program using proposals extracted from the programs of the candidates in the 2022 presidential election in Brazil.

So how accurate are these augmented forms of democracy? Imagine a user who gave us 40 answers. We can use the first 20 to fine-tune a model that we can test using the 20 answers the model didn’t see. We can then compare the accuracy of these predictions with the ones obtained by a “bundle” rule, which assumes that users who self-reported being from the left or right always chose the proposals from the candidate who shares their political identity.

This showed us that LLMs were more accurate at predicting policy preferences than the bundle rule, meaning that the preferences captured in the participation data were more nuanced than a left-right axis, and that the LLMs can capture some of that nuance. Also, the LLMs can choose among policies coming from the same candidate, which is something that we cannot do using a bundle rule.

But can these LLMs help us complete the aggregate preferences of the population? Direct or unbundled forms of participation can result in incomplete data when people answer only a fraction of all questions. In our paper, we simulate this incompleteness by sampling the full dataset. We ask how close we can get to the full dataset by using a random sample, or a random sample augmented by predictions made by these LLMs. Overall, we find that LLM-augmented data gets much closer to the full dataset than a pure random sample.

These results do not mean that augmented democracy technology is ready, but they do mean we are in a much better place to continue exploring this idea than six years ago.

This paper was a collaborative effort with Jairo Gudino, PhD student at CCL at the University of Toulouse Capitole, and Umberto Grandi from IRIT, also at the University of Toulouse Capitole. We hope you find these results insightful!
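The hold-out comparison described in the post (fit on a user's first 20 answers, test on the remaining 20, compare against the bundle rule) can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the `Answer` schema (a choice between two options, each tagged with a candidate's political side) and every function name here are assumptions, and only the bundle-rule baseline is implemented; the fine-tuned LLM would be scored on the same held-out answers.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Answer:
    """Hypothetical schema: one choice between two proposals."""
    side_a: str    # side ("left"/"right") of the candidate behind option A
    side_b: str    # same for option B
    chose_a: bool  # True if the user picked option A


def split_answers(answers: List[Answer], n_train: int = 20) -> Tuple[List[Answer], List[Answer]]:
    """First n_train answers are for fitting a model, the rest are held out."""
    return answers[:n_train], answers[n_train:]


def bundle_predict(identity: str, ans: Answer) -> Optional[bool]:
    """Predict 'picks A' from identity alone; None when the rule cannot decide."""
    if ans.side_a == ans.side_b:
        return None  # both options come from the same side
    if ans.side_a == identity:
        return True
    if ans.side_b == identity:
        return False
    return None  # neither option matches the user's identity


def bundle_accuracy(identity: str, held_out: List[Answer]) -> Optional[float]:
    """Accuracy of the bundle rule on the held-out answers it can decide."""
    pairs = [(bundle_predict(identity, a), a.chose_a) for a in held_out]
    decided = [(p, y) for p, y in pairs if p is not None]
    if not decided:
        return None
    return sum(p == y for p, y in decided) / len(decided)


if __name__ == "__main__":
    # Made-up answers purely to exercise the functions above.
    answers = [Answer("left", "right", True)] * 20 + [
        Answer("left", "right", True),
        Answer("right", "left", True),
        Answer("left", "left", False),
        Answer("right", "left", False),
    ]
    _, held_out = split_answers(answers)
    print(bundle_accuracy("left", held_out))
```

Returning `None` for same-side pairs mirrors the post's remark that a bundle rule cannot choose among policies coming from the same candidate.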

César A. Hidalgo

26,912 views • 1 year ago