#alignmentworkshop

"If you literally catch your AI trying to escape, you have to stop deploying it." Buck Shlegeris shares strategies for managing misaligned AI, including trusted monitoring and collusion-busting techniques to limit catastrophic risks as capabilities grow. #AlignmentWorkshop
FAR.AI • 194,863 views • 1 year ago

Vienna #AlignmentWorkshop: 129 researchers tackled #AISafety from interpretability & robustness to governance. Keynote by Jan Leike + talks by Victoria Krakovna, David Krueger, Gillian Hadfield, Robert Trager, Neel Nanda, David Bau, Helen Toner, Mary Phuong, and more. Blog recap & videos. 👇
FAR.AI • 46,317 views • 1 year ago