Updates A new paper: “Forecasting Using Incomplete Models“ New research write-ups and discussions: Prisoners’ Dilemma with Costs to Modeling; Counterfactual Mugging Poker Game; Optimization Amplifies Eliezer Yudkowsky, Paul Christiano, Jessica Taylor, and Wei Dai discuss Alex Zhu’s FAQ for Paul’s...
MIRI Research Associate Vanessa Kosoy has a paper out on issues in naturalized induction: “Forecasting using incomplete models”. Abstract: We consider the task of forecasting an infinite sequence of future observations based on some number of past observations, where the...
Updates New research write-ups and discussions: Logical Inductors Converge to Correlated Equilibria (Kinda) MIRI researcher Tsvi Benson-Tilsen and Alex Zhu ran an AI safety retreat for MIT students and alumni. Andrew Critch discusses what kind of advice to give to...
Updates New research write-ups and discussions: Resource-Limited Reflective Oracles; Computing An Exact Quantilal Policy New at AI Impacts: Promising Research Projects MIRI research fellow Scott Garrabrant and associates Stuart Armstrong and Vanessa Kosoy are among the winners in the second...
Updates A new paper: “Categorizing Variants of Goodhart’s Law” New research write-ups and discussions: Distributed Cooperation; Quantilal Control for Finite Markov Decision Processes New at AI Impacts: Transmitting Fibers in the Brain: Total Length and Distribution of Lengths Scott Garrabrant,...
Update Nov. 23: This post was edited to reflect Scott’s terminology change from “naturalized world-models” to “embedded world-models.” For a full introduction to these four research problems, see Scott Garrabrant and Abram Demski’s “Embedded Agency.” Scott Garrabrant is taking over...