MIRI Updates
AlphaGo Zero and the Foom Debate
AlphaGo Zero uses 4 TPUs, is built entirely out of neural nets with no handcrafted features, doesn’t pretrain against expert games or anything else human, reaches a superhuman level after 3 days of self-play, and is the strongest version of...
October 2017 Newsletter
“So far as I can presently estimate, now that we’ve had AlphaGo and a couple of other maybe/maybe-not shots across the bow, and seen a huge explosion of effort invested into machine learning and an enormous flood of papers, we...
There’s No Fire Alarm for Artificial General Intelligence
What is the function of a fire alarm? One might think that the function of a fire alarm is to provide you with important evidence about a fire existing, allowing you to change your policy accordingly and exit...
September 2017 Newsletter
Research updates. “Incorrigibility in the CIRL Framework”: a new paper by MIRI assistant researcher Ryan Carey responds to Hadfield-Menell et al.’s “The Off-Switch Game”. New at IAFF: The Three Levels of Goodhart’s Curse; Conditioning on Conditionals; Stable Pointers to Value:...
New paper: “Incorrigibility in the CIRL Framework”
MIRI assistant research fellow Ryan Carey has a new paper out discussing situations where good performance in Cooperative Inverse Reinforcement Learning (CIRL) tasks fails to imply that software agents will assist or cooperate with programmers. The paper, titled “Incorrigibility in...
August 2017 Newsletter
Research updates. “A Formal Approach to the Problem of Logical Non-Omniscience”: We presented our work on logical induction at the 16th Conference on Theoretical Aspects of Rationality and Knowledge. New at IAFF: Smoking Lesion Steelman; “Like This World, But…”; Jessica...