July 2016 Newsletter

Research updates

A new paper: “A Formal Solution to the Grain of Truth Problem.” The paper was presented at UAI-16, and describes the first general reduction of game-theoretic reasoning to expected utility maximization.
Participants in MIRI’s recently concluded Colloquium Series on Robust and Beneficial AI (CSRBAI) have put together AI safety environments for the OpenAI Reinforcement Learning Gym. (Inspiration for these gyms came in part from Chris Olah and Dario Amodei in a conversation with Rafael.) Help is welcome in creating more safety environments and conducting experiments on the current set. Questions can be directed to rafael.cosman@gmail.com.

General updates

We attended the White House’s Workshop on Safety and Control in AI.
Our 2016 MIRI Summer Fellows Program recently drew to a close. The program, run by the Center for Applied Rationality, aims to strengthen the research and decision-making skills of AI scientists and mathematicians.
“Why Ain’t You Rich?”: Nate Soares discusses decision theory in Dawn or Doom. See “Toward Idealized Decision Theory” for context.
Numerai, an anonymized distributed hedge fund for machine learning researchers, has added an option for donating earnings to MIRI “as a hedge against things going horribly right” in the field of AI.

News and links

The White House is requesting information on “safety and control issues for AI,” among other questions. Public submissions will be accepted through July 22.
“Concrete Problems in AI Safety”: Researchers from Google Brain, OpenAI, and academia propose a very promising new AI safety research agenda. The proposal is showcased on the Google Research Blog and the OpenAI Blog, as well as the Open Philanthropy Blog, and has received press coverage from Bloomberg, The Verge, and MIT Technology Review.
After criticizing the thinking behind OpenAI earlier in the month, Alphabet executive chairman Eric Schmidt comes out in favor of AI safety research:

Do we worry about the doomsday scenarios? We believe it’s worth thoughtful consideration. Today’s AI only thrives in narrow, repetitive tasks where it is trained on many examples. But no researchers or technologists want to be part of some Hollywood science-fiction dystopia. The right course is not to panic—it’s to get to work. Google, alongside many other companies, is doing rigorous research on AI safety, such as how to ensure people can interrupt an AI system whenever needed, and how to make such systems robust to cyberattacks.

Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, and Stuart Russell propose a formal definition of the value alignment problem as “Cooperative Inverse Reinforcement Learning,” a two-player game where a human and robot are both “rewarded according to the human’s reward function, but the robot does not initially know what this is.” In a CSRBAI talk (slides), Hadfield-Menell discusses applications for AI corrigibility.
Jaan Tallinn brings his AI risk focus to the Bulletin of the Atomic Scientists.
Stephen Hawking weighs in on intelligence explosion (video). Sam Harris and Neil deGrasse Tyson debate the idea at greater length (audio, at 1:22:37).
Ethereum developer Vitalik Buterin discusses the implications of value complexity and fragility and other AI safety concepts for cryptoeconomics.
Wired covers a “demonically clever” backdoor based on chips’ analog properties.
CNET interviews MIRI and a who’s who of AI scientists for a pair of articles: “AI, Frankenstein? Not So Fast, Experts Say” and “When Hollywood Does AI, It’s Fun But Farfetched.”
Next month’s Effective Altruism Global conference is accepting applicants.
