October 2019 Newsletter

Filed under Newsletters.

Updates: Ben Pace summarizes a second round of AI Alignment Writing Day posts. The Zettelkasten Method: MIRI researcher Abram Demski describes a note-taking system that's had a large positive effect on his research productivity. Will MacAskill writes a detailed critique of functional decision theory; Abram Demski (1, 2) and Matthew Graves respond in the comments. News and… Read more »

September 2019 Newsletter

Filed under Newsletters.

Updates: We ran a very successful MIRI Summer Fellows Program, which included a day where participants publicly wrote up their thoughts on various AI safety topics. See Ben Pace’s first post in a series of roundups. A few highlights from the writing day: Adele Lopez's Optimization Provenance; Daniel Kokotajlo's Soft Takeoff Can Still Lead to Decisive Strategic Advantage and The… Read more »

August 2019 Newsletter

Filed under Newsletters.

Updates: MIRI research associate Stuart Armstrong is offering $1000 for good questions to ask an Oracle AI. Recent AI safety posts from Stuart: Indifference: Multiple Changes, Multiple Agents; Intertheoretic Utility Comparison: Examples; Normalising Utility as Willingness to Pay; and Partial Preferences Revisited. MIRI researcher Buck Shlegeris has put together a quick and informal AI safety reading… Read more »

July 2019 Newsletter

Filed under Newsletters.

Hubinger et al.'s “Risks from Learned Optimization in Advanced Machine Learning Systems”, one of our new core resources on the alignment problem, is now available on arXiv, the AI Alignment Forum, and LessWrong. In other news, we received an Ethereum donation worth $230,910 from Vitalik Buterin — the inventor and co-founder of Ethereum, and now our third-largest… Read more »

New paper: “Risks from learned optimization”

Filed under Papers.

Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, and Scott Garrabrant have a new paper out: “Risks from learned optimization in advanced machine learning systems.” The paper’s abstract: We analyze the type of learned optimization that occurs when a learned model (such as a neural network) is itself an optimizer—a situation we refer to… Read more »
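The abstract's distinction between the base optimizer (the training procedure) and a mesa-optimizer (a learned model that itself performs search) can be made concrete with a toy sketch. The snippet below is purely illustrative and not from the paper; the environment, the objectives, and the helper names (base_objective, make_mesa_optimizer) are hypothetical.

    # Illustrative toy sketch (not from the paper): a "base optimizer" does a
    # simple random search over policies, and the best policy it finds is
    # itself an optimizer -- at run time it searches over actions for the one
    # its own internal objective scores highest.
    import random

    ACTIONS = list(range(10))

    def base_objective(policy):
        # Hypothetical base objective: reward policies whose chosen actions
        # sum to a large value across a few observations.
        return sum(policy(x) for x in range(5))

    def make_mesa_optimizer(internal_objective):
        # The "learned model": a policy that, when run, optimizes its own
        # internal objective over the available actions.
        def policy(observation):
            return max(ACTIONS, key=lambda a: internal_objective(observation, a))
        return policy

    def base_optimizer(n_candidates=50):
        # Base optimizer: random search over candidate internal objectives,
        # keeping the policy that scores best on the base objective.
        best_policy, best_score = None, float("-inf")
        for _ in range(n_candidates):
            w = random.uniform(-1, 1)
            candidate = make_mesa_optimizer(lambda obs, a, w=w: w * a - (a - obs) ** 2)
            score = base_objective(candidate)
            if score > best_score:
                best_policy, best_score = candidate, score
        return best_policy

    if __name__ == "__main__":
        learned = base_optimizer()
        print("action chosen for observation 3:", learned(3))

Here the base optimizer selects among candidate internal objectives, while the selected policy is itself an optimizer: at run time it searches over actions for whichever one its internal objective rates highest, which may or may not track the base objective outside the situations it was selected on.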

June 2019 Newsletter

Filed under Newsletters.

Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, and Scott Garrabrant have released the first two (of five) posts on “mesa-optimization”: The goal of this sequence is to analyze the type of learned optimization that occurs when a learned model (such as a neural network) is itself an optimizer—a situation we refer to as… Read more »

May 2019 Newsletter

Filed under Newsletters.

Updates: A new paper from MIRI researcher Vanessa Kosoy, presented at the ICLR SafeML workshop this week: "Delegative Reinforcement Learning: Learning to Avoid Traps with a Little Help." New research posts: Learning "Known" Information When the Information is Not Actually Known; Defeating Goodhart and the "Closest Unblocked Strategy" Problem; and Reinforcement Learning with Imperceptible Rewards. The Long-Term Future Fund has announced twenty-three new… Read more »

New paper: “Delegative reinforcement learning”

Filed under Papers.

MIRI research associate Vanessa Kosoy has written a new paper, “Delegative reinforcement learning: Learning to avoid traps with a little help.” Kosoy will be presenting the paper at the ICLR 2019 SafeML workshop in two weeks. The abstract reads: Most known regret bounds for reinforcement learning are either episodic or assume an environment without traps…. Read more »
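As a rough illustration of the delegation idea (a toy sketch under its own assumptions, not the paper's algorithm or its regret guarantees): an agent in an environment with an irreversible trap state can avoid the trap by delegating the choice of action to an advisor whenever it has not yet verified that some action is safe. The environment and all names below (advisor, known_safe, run_episode) are hypothetical.

    # Toy sketch of delegation in an environment with a trap state. The agent
    # only acts autonomously on state-action pairs it has already observed to
    # be safe; otherwise it delegates the choice to an advisor, who (by
    # assumption in this sketch) never walks into the trap.
    import random

    N_STATES, TRAP = 6, 5          # state 5 is an irreversible trap
    ACTIONS = [-1, +1]             # move left / right on a line

    def advisor(state):
        # Advisor never steps into the trap (assumption of the sketch).
        return -1 if state + 1 == TRAP else random.choice(ACTIONS)

    def step(state, action):
        nxt = max(0, min(N_STATES - 1, state + action))
        reward = 1.0 if nxt == N_STATES - 2 else 0.0
        return nxt, reward, nxt == TRAP

    def run_episode(steps=30):
        state, total, known_safe = 0, 0.0, set()
        for _ in range(steps):
            safe_here = [a for a in ACTIONS if (state, a) in known_safe]
            if safe_here:
                action = random.choice(safe_here)   # act autonomously
            else:
                action = advisor(state)             # delegate when uncertain
            nxt, reward, trapped = step(state, action)
            if not trapped:
                known_safe.add((state, action))     # record as verified safe
            total += reward
            state = nxt
        return total

    if __name__ == "__main__":
        print("return:", run_episode())

The point of the sketch is only that delegating on unverified state-action pairs keeps the agent out of the trap while it accumulates experience it can later act on itself; the paper's setting and guarantees are more general than this toy example.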