Updates New research write-ups and discussions: Logical Inductors Converge to Correlated Equilibria (Kinda) MIRI researcher Tsvi Benson-Tilsen and Alex Zhu ran an AI safety retreat for MIT students and alumni. Andrew Critch discusses what kind of advice to give to...
Updates New research write-ups and discussions: Resource-Limited Reflective Oracles; Computing An Exact Quantilal Policy New at AI Impacts: Promising Research Projects MIRI research fellow Scott Garrabrant and associates Stuart Armstrong and Vanessa Kosoy are among the winners in the second...
Updates A new paper: “Categorizing Variants of Goodhart’s Law” New research write-ups and discussions: Distributed Cooperation; Quantilal Control for Finite Markov Decision Processes New at AI Impacts: Transmitting Fibers in the Brain: Total Length and Distribution of Lengths Scott Garrabrant,...
Update Nov. 23: This post was edited to reflect Scott’s terminology change from “naturalized world-models” to “embedded world-models.” For a full introduction to these four research problems, see Scott Garrabrant and Abram Demski’s “Embedded Agency.” Scott Garrabrant is taking over...
Updates New research write-ups and discussions: Knowledge is Freedom; Stable Pointers to Value II: Environmental Goals; Toward a New Technical Explanation of Technical Explanation; Robustness to Scale New at AI Impacts: Likelihood of Discontinuous Progress Around the Development of AGI...
MIRI senior researcher Eliezer Yudkowsky was recently invited to be a guest on Sam Harris’ “Waking Up” podcast. Sam is a neuroscientist and popular author who writes on topics related to philosophy, religion, and public discourse. The following is a...