"Please learn from our mistakes. Don't do exactly the same things that we did, or you'll end up in ten years with having nothing to show for it." — Nicholas Carlini urging AI researchers to avoid the pitfalls of past adversarial ML research at the Vienna Alignment Workshop 2024.
5,370,524 views • 1 year ago • via X (Twitter)
10 Comments


Sounds good if the problem you're trying to solve is "how to publish 9000 papers in 10 years". 😁

I actually listened to the talk, and I'm not sure what the advice is. It appears to be: study why adversarial papers are still so easy to break, and maybe don't set such a difficult problem as the target? E.g., limit jailbreak attempts to 100 queries via API (no weights access).

One of the first things I used ChatGPT for was having it write me a Python keylogger. I just had to avoid that word and prompt around it. The second thing I did was have it write porn stories. Worked incredibly well, with some understanding of context windows.

He looks like Andrew Tate if he went to school

Alignment evals for LLMs based purely on input-output pattern recognition are bound to fail.

Making mistakes is a fundamental part of human nature. If you don't make mistakes, you don't progress; it's necessary, and it's thanks to the mistakes of the past that progress was made possible. If you eliminate mistakes, you cut out a central part of human nature.

Oh someone elaborating on negative results for once, cool.

Top G looks different

These AI researchers are just working together to create more recessions and crises for the middle and lower classes.

