Scott Garrabrant, Author at Machine Intelligence Research Institute

Finite Factored Sets

Posted May 23, 2021 by Scott Garrabrant & filed under Papers.

This is the edited transcript of a talk introducing finite factored sets. For most readers, it will probably be the best starting point for learning about factored sets. (Lightly edited) slides: https://intelligence.org/files/Factored-Set-Slides.pdf …

Saving Time

Posted May 18, 2021 by Scott Garrabrant & filed under Analysis.

Note: This is a preamble to Finite Factored Sets, a sequence I’ll be posting over the next few weeks. This Sunday at noon Pacific time, I’ll be giving a Zoom talk introducing Finite Factored Sets, a framework which I find roughly as technically interesting as logical induction. (Update May 25: A video and blog…

Subsystem Alignment

Posted November 6, 2018 by Scott Garrabrant & filed under Analysis.

You want to figure something out, but you don’t know how to do that yet. You have to somehow break up the task into sub-computations. There is no atomic act of “thinking”; intelligence must be built up of non-intelligent parts. The agent being made of parts is part of what made counterfactuals hard, since…

Embedded World-Models

Posted November 2, 2018 by Scott Garrabrant & filed under Analysis.

An agent which is larger than its environment can: Hold an exact model of the environment in its head. Think through the consequences of every potential course of action. If it doesn’t know the environment perfectly, hold every possible way the environment could be in its head, as is the case with Bayesian…

Embedded Agents

Posted October 29, 2018 by Scott Garrabrant & filed under Analysis.

Suppose you want to build a robot to achieve some real-world goal for you, a goal that requires the robot to learn for itself and figure out a lot of things that you don’t already know. There’s a complicated engineering problem here. But there’s also a problem of figuring out what it even means to…

New paper: “Categorizing variants of Goodhart’s Law”

Posted March 27, 2018 by Scott Garrabrant & filed under Papers.

Goodhart’s Law states that “any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.” However, this is not a single phenomenon. In Goodhart Taxonomy, I proposed that there are (at least) four different mechanisms through which proxy measures break when you optimize for them: Regressional, Extremal, Causal, and…
