Blog

Category: Papers

AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions

We’re excited to release a new AI governance research agenda from the MIRI Technical Governance Team. With this research agenda, we have two main aims: to describe the strategic landscape of AI development and to catalog important governance research questions....

Finite Factored Sets

[mathjax] This is the edited transcript of a talk introducing finite factored sets. For most readers, it will probably be the best starting point for learning about factored sets. Video: (Lightly edited) slides: https://intelligence.org/files/Factored-Set-Slides.pdf (Part 1, Title Slides)...

New paper: “Risks from learned optimization”

Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, and Scott Garrabrant have a new paper out: “Risks from learned optimization in advanced machine learning systems.” The paper’s abstract: We analyze the type of learned optimization that occurs when a...

New paper: “Delegative reinforcement learning”

MIRI Research Associate Vanessa Kosoy has written a new paper, “Delegative reinforcement learning: Learning to avoid traps with a little help.” Kosoy will be presenting the paper at the ICLR 2019 SafeML workshop in two weeks. The abstract reads: Most...

New paper: “Forecasting using incomplete models”

MIRI Research Associate Vanessa Kosoy has a paper out on issues in naturalized induction: “Forecasting using incomplete models”. Abstract: We consider the task of forecasting an infinite sequence of future observations based on some number of past observations, where the...

New paper: “Categorizing variants of Goodhart’s Law”

Goodhart’s Law states that “any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.” However, this is not a single phenomenon. In Goodhart Taxonomy, I proposed that there are (at least) four different...

Browse