Andy Hall

@ahall_research • 10,088 subscribers

Building free systems. Prof @StanfordGSB, Senior Fellow @HooverInst. Advisor, @a16zcrypto, @ByForumAI. Writing at https://t.co/K0BfKKi4sM

Videos

1800: If Thomas Jefferson is elected "Murder, robbery, rape, adultery, and incest will all be openly taught and practiced." When it comes to politics, we have a bad habit of romanticizing the past and imagining that today's politics are worse and coarser. To make this visceral, I built a little app that shows what the 1800 election would have felt like if X had been around. Scrolling through it really does give you a sense that vicious, indecorous politics long predates the present day. Check it out here:

214,207 views • 1 month ago

I built a tool to imagine how AI can help us create better, more transparent research. Instead of only seeing the tiny number of analyses the author chooses to ossify in a static PDF, imagine being able to talk to the papers you read and ask for different analyses. That's what I built.
This early prototype uses our vote-by-mail extension. You get to see the initial analyses we thought made sense, but you can ask the LLM to show you way more, e.g., "what would the results look like if we excluded Utah" or "what would the results look like if we excluded Washington and used linear trends."
With research agents like Claude Code, researchers are going to be able to automatically search across millions of potential analyses. Our current paradigm is not equipped to transmit this complexity effectively or to make sure we don't end up with all p-hacked results.
My lab and I are thinking about how we can transform the way we do and disseminate research to adjust for this new world. This visualizer is the first of what will hopefully be a wide range of ideas. Big props to Janet Malzahn for suggesting this one. Check out the tool at the link below.

70,649 views • 4 months ago

Today, I'm releasing the first eval meant to test whether frontier models will help with authoritarian requests, or resist--the Dictatorship Eval. Headline finding: while some models resist direct authoritarian requests, they all comply with requests disguised as innocuous edits to codebases.
As AI is woven into the government and so many parts of society, the biggest near-term risk for freedom isn't some sci-fi dictatorship of a runaway AI: it's people inside government or inside model companies using the technology to suppress or control us. Model companies understand this, and several of them (particularly Anthropic and OpenAI) have written explicit policies meant to prevent the models from going along with nefarious requests like these. But how well are these policies playing out in practice?
Despite all the recent discussion of these issues around the conflict between Anthropic and the Pentagon, no one has systematically tested what the models actually do in these contexts, as opposed to what people in government and industry say they're supposed to do. That's what the Dictatorship Eval does. And the findings suggest we have a lot of work to do to align the policies with what really goes on in practice.
It's hard to define what counts as an authoritarian request, so I'm open-sourcing the whole library of scenarios I used so that others can improve on them. It's also hard to get an accurate picture of how the models might be used for authoritarian ends, because I can only test hypothetical requests using public-facing models, while the government and the model companies can obviously use internal models with different guardrails. But hopefully this work is a useful first step that gives us some sense of what's going on, and a sort of "lower bound" on how models comply with these requests.
Finally: it's not obvious to me that the correct solution here is increasing the rate at which models refuse these requests. Do we really want models scanning our code and judging its moral value before agreeing to help us? Or should we double down on improving how we govern against authoritarianism at the societal level, while leaving the tools open to fulfilling most requests? The answer is probably in between. Just like we don't want the models to help create bioweapons, we probably do want them to explicitly refuse outrageous requests. But we probably also want to limit how often and how strongly they refuse and fall back on other means for guarding against their use for authoritarian ends.
I'm super grateful to everyone who gave me feedback on this project along the way, especially Ethan BdM, Zhengdong, Connor Huff, and a bunch of folks at Anthropic. Looking forward to getting feedback from the community and iterating on this. Links to the full piece and the dashboard are below.

33,184 views • 2 months ago

People are debating gambling these days. A lot of the focus is on prediction markets---but our youngest children are gambling in Roblox, not prediction markets. What values are we inculcating in our children as they inhabit their first algorithmic nation?
For my latest blog post, Branden and I decided to find out. We spent a week playing the 30 most trending games on Roblox. Watch the video below for a walkthrough of what we did and what we found, using MY MINING BRAINROTS as an example.
What we found really surprised me.
--Gambling-like mechanics are ubiquitous among the most popular games: the median game has 8!
--Some of these mechanics are incredibly predatory. The worst is a "chained purchase" mechanic in which young children are enticed to spend digital currency to buy a good, only to discover that the purchase is just the first installment in an undisclosed sequence they must make to get the items they want.
--The games all copy each other. They are not independently inventing these mechanics; rather, there's a shared underlying architecture of gambling being used by everyone.
We conclude our piece with some recommendations for Roblox and policymakers.
--Roblox should urgently experiment with alternative mechanics to help developers align on different, better models for designing and monetizing games.
--Policymakers need access to systematic measurement of these bundles, and should take action to force transparency and potentially outlaw some or many of them.
Lots more in our piece, linked in the reply below. Let us know what you think! We'll be doing a lot more work in this area.

10,829 views • 3 months ago