
FAR.AI
@farairesearch • 19,203 subscribers
Frontier alignment research to ensure the safe development and deployment of advanced AI systems.
Videos

All recordings from the San Diego Alignment Workshop are now available: a keynote from Yoshua Bengio, with talks from Asa Cooper Stickland, Dawn Song, Marius Hobbhahn, Tomek Korbak, Cas (Stephen Casper), Divya Siddarth, Anna Gausen, Xander Davies, Max Tegmark, Niloofar, Daniel Kang, Natasha Jaques + more. 👇
FAR.AI • 1,189,443 views • 5 months ago

"Please learn from our mistakes. Don't do exactly the same things that we did, or you'll end up in ten years with having nothing to show for it." — Nicholas Carlini urging AI researchers to avoid the pitfalls of past adversarial ML research at the Vienna Alignment Workshop 2024.
FAR.AI • 5,370,506 views • 1 year ago

🎥 Singapore Alignment Workshop videos are live! Hear from Yoshua Bengio, Owain Evans, Jacob Pfau, Daniel Kang, Cas (Stephen Casper), Zico Kolter, Siva Reddy, Kalesha Bullard, Shayne Longpre, Mark Brakel, あちぁん, Tegan Maharaj (@teganmaharaj.bsky.social), Adam Tauman Kalai, Aditya Gopalan + more. Full playlist below. 👇
FAR.AI • 766,984 views • 1 year ago

“The hope is that ... just optimizing something to be sparse—without optimizing it to be interpretable—will stumble across that interpretable decomposition.” — Neel Nanda on sparse autoencoders for mechanistic interpretability and AI safety at the Vienna Alignment Workshop.
FAR.AI • 1,148,210 views • 1 year ago

You cannot really train all these models to cater to different preferences. Can you have one model that caters to all? Furong Huang unveils a technique to customize AI models on-the-fly to user goals, reducing the computational cost of tailoring AI systems to individual needs.
FAR.AI • 410,541 views • 11 months ago

Planning capabilities double every 7 months → human-level in 5 years? Yoshua Bengio: "We still don't know how to make sure powerful AIs won't turn against us." AIs now lie to avoid shutdown and act to preserve themselves. Solution: non-agentic "Scientist AIs" + global governance beyond market forces 👇
FAR.AI • 309,619 views • 9 months ago