Updates
- New research write-ups and discussions: Logical Inductors Converge to Correlated Equilibria (Kinda)
- MIRI researcher Tsvi Benson-Tilsen and Alex Zhu ran an AI safety retreat for MIT students and alumni.
- Andrew Critch discusses what advice to give junior researchers concerned about AI x-risk, and I clarify two points about MIRI’s strategic view.
- From Eliezer Yudkowsky: Challenges to Paul Christiano’s Capability Amplification Proposal. (Cross-posted to LessWrong.)
News and links
- Jessica Taylor discusses the relationship between decision theory, game theory, and the NP and PSPACE complexity classes.
- From OpenAI’s Geoffrey Irving, Paul Christiano, and Dario Amodei: an AI safety technique based on training agents to debate each other. And from Amodei and Danny Hernandez, an analysis showing that “since 2012, the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.5-month doubling time” (a quick arithmetic sketch of what this rate implies follows this list).
- Christiano asks: Are Minimal Circuits Daemon-Free? and When is Unaligned AI Morally Valuable?
- The Future of Humanity Institute’s Allan Dafoe discusses the future of AI, international governance, and macrostrategy on the 80,000 Hours Podcast.
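To give a feel for the doubling-time figure quoted above, here is a minimal back-of-the-envelope sketch (mine, not part of the OpenAI analysis): a 3.5-month doubling time means compute grows by a factor of 2^(t / 3.5) after t months, which works out to roughly a 10x increase per year.

```python
# Sketch of the arithmetic implied by a 3.5-month doubling time.
# Assumption: pure exponential growth at the quoted rate; the 72-month
# extrapolation is only an illustration of the 2012-2018 window.

DOUBLING_TIME_MONTHS = 3.5

def growth_factor(months: float) -> float:
    """Compute multiplier implied by the quoted doubling time."""
    return 2 ** (months / DOUBLING_TIME_MONTHS)

print(f"per year:  ~{growth_factor(12):.1f}x")   # ~10.8x
print(f"over 6 yr: ~{growth_factor(72):,.0f}x")  # ~1.6 million-fold
```

Compare Moore’s Law, whose roughly two-year doubling time yields only about a 1.4x increase per year over the same period.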