Posts By: Rob Bensinger

Summer MIRI Updates
Buck Shlegeris and Ben Weinstein-Raun have joined the MIRI team! Additionally, we ran a successful internship program over the summer, and we’re co-running a new engineer-oriented workshop series with CFAR. On the fundraising side, we received a $489,000 grant from the Long-Term Future Fund, a $150,000 AI Safety Retraining Program grant from…
August 2018 Newsletter
Updates
- New posts to the new AI Alignment Forum: Buridan’s Ass in Coordination Games; Probability is Real, and Value is Complex; Safely and Usefully Spectating on AIs Optimizing Over Toy Worlds
- MIRI Research Associate Vanessa Kosoy wins a $7,500 AI Alignment Prize for “The Learning-Theoretic AI Alignment Research Agenda.” Applications for the prize’s next round will…
July 2018 Newsletter
Updates
- A new paper: “Forecasting Using Incomplete Models”
- New research write-ups and discussions: Prisoners’ Dilemma with Costs to Modeling; Counterfactual Mugging Poker Game; Optimization Amplifies
- Eliezer Yudkowsky, Paul Christiano, Jessica Taylor, and Wei Dai discuss Alex Zhu’s FAQ for Paul’s research agenda.
- We attended EA Global in SF, and gave a short talk on “Categorizing Variants…
New paper: “Forecasting using incomplete models”
MIRI Research Associate Vanessa Kosoy has a paper out on issues in naturalized induction: “Forecasting using incomplete models”.

Abstract: We consider the task of forecasting an infinite sequence of future observations based on some number of past observations, where the probability measure generating the observations is “suspected” to satisfy one or more of a set…
June 2018 Newsletter
Updates
- New research write-ups and discussions: Logical Inductors Converge to Correlated Equilibria (Kinda)
- MIRI researcher Tsvi Benson-Tilsen and Alex Zhu ran an AI safety retreat for MIT students and alumni.
- Andrew Critch discusses what kind of advice to give to junior AI-x-risk-concerned researchers, and I clarify two points about MIRI’s strategic view.
- From Eliezer Yudkowsky: Challenges to…
May 2018 Newsletter
Updates
- New research write-ups and discussions: Resource-Limited Reflective Oracles; Computing an Exact Quantilal Policy
- New at AI Impacts: Promising Research Projects
- MIRI research fellow Scott Garrabrant and research associates Stuart Armstrong and Vanessa Kosoy are among the winners in the second round of the AI Alignment Prize. First place goes to Tom Everitt and Marcus Hutter’s “The Alignment Problem…
April 2018 Newsletter
Updates
- A new paper: “Categorizing Variants of Goodhart’s Law”
- New research write-ups and discussions: Distributed Cooperation; Quantilal Control for Finite Markov Decision Processes
- New at AI Impacts: Transmitting Fibers in the Brain: Total Length and Distribution of Lengths
- Scott Garrabrant, the research lead for MIRI’s agent foundations program, outlines focus areas and 2018 predictions for MIRI’s research. Scott presented on logical…
2018 research plans and predictions
Update Nov. 23: This post was edited to reflect Scott’s terminology change from “naturalized world-models” to “embedded world-models.” For a full introduction to these four research problems, see Scott Garrabrant and Abram Demski’s “Embedded Agency.”

Scott Garrabrant is taking over Nate Soares’ job of making predictions about how much progress we’ll make in different research…