MIRI Updates

October 2017 Newsletter

“So far as I can presently estimate, now that we’ve had AlphaGo and a couple of other maybe/maybe-not shots across the bow, and seen a huge explosion of effort invested into machine learning and an enormous flood of papers, we...

There’s No Fire Alarm for Artificial General Intelligence

What is the function of a fire alarm? One might think that the function of a fire alarm is to provide you with important evidence about a fire existing, allowing you to change your policy accordingly and exit...

September 2017 Newsletter

Research updates: “Incorrigibility in the CIRL Framework”, a new paper by MIRI assistant researcher Ryan Carey, responds to Hadfield-Menell et al.’s “The Off-Switch Game”. New at IAFF: The Three Levels of Goodhart’s Curse; Conditioning on Conditionals; Stable Pointers to Value:...

New paper: “Incorrigibility in the CIRL Framework”

MIRI assistant research fellow Ryan Carey has a new paper out discussing situations where good performance in Cooperative Inverse Reinforcement Learning (CIRL) tasks fails to imply that software agents will assist or cooperate with programmers. The paper, titled “Incorrigibility in...

August 2017 Newsletter

Research updates: We presented “A Formal Approach to the Problem of Logical Non-Omniscience”, our work on logical induction, at the 16th Conference on Theoretical Aspects of Rationality and Knowledge. New at IAFF: Smoking Lesion Steelman; “Like This World, But…”; Jessica...

July 2017 Newsletter

A number of major mid-year MIRI updates: we received our largest donation to date, $1.01 million from an Ethereum investor! Our research priorities have also shifted somewhat, reflecting the addition of four new full-time researchers (Marcello Herreshoff, Sam Eisenstat, Tsvi...
