June 2019 Newsletter

June 1, 2019
Rob Bensinger

Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, and Scott Garrabrant have released the first two (of five) posts on “mesa-optimization”:

The goal of this sequence is to analyze the type of learned optimization that occurs when a learned model (such as a neural network) is itself an optimizer—a situation we refer to as mesa-optimization.

We believe that the possibility of mesa-optimization raises two important questions for the safety and transparency of advanced machine learning systems. First, under what circumstances will learned models be optimizers, including when they should not be? Second, when a learned model is an optimizer, what will its objective be—how will it differ from the loss function it was trained under—and how can it be aligned?

The sequence begins with Risks from Learned Optimization: Introduction and continues with Conditions for Mesa-Optimization. (LessWrong mirror.)

Other updates

New research posts: Nash Equilibria Can Be Arbitrarily Bad; Self-Confirming Predictions Can Be Arbitrarily Bad; And the AI Would Have Got Away With It Too, If…; Uncertainty Versus Fuzziness Versus Extrapolation Desiderata.
We've released our annual review for 2018.
Applications are open for two AI safety events at the EA Hotel in Blackpool, England: the Learning-By-Doing AI Safety Workshop (Aug. 16-19), and the Technical AI Safety Unconference (Aug. 22-25).
A discussion of takeoff speed, including some very incomplete and high-level MIRI comments.

News and links

Other recent AI safety posts: Tom Sittler's A Shift in Arguments for AI Risk and Wei Dai's “UDT2” and “against UD+ASSA”.
Talks from the SafeML ICLR workshop are now available online.
From OpenAI: “We’re implementing two mechanisms to responsibly publish GPT-2 and hopefully future releases: staged release and partnership-based sharing.”
FHI's Jade Leung argues that “states are ill-equipped to lead at the formative stages of an AI governance regime,” and that “private AI labs are best-placed to lead on AI governance”.

Machine Intelligence Research Institute

Berkeley, California