Embedded Agency is a series of blog posts written by Abram Demski and Scott Garrabrant, also available as a hand-drawn sequence on the AI Alignment Forum. You can find part 1 here.

We’ve included links and references below, listed in the order they come up in the relevant topic/section.




( Text Introduction  —  Illustrated Introduction  ———  MIRI Blog Afterword  —  LessWrong Afterword )


Further reading: “Security Mindset and Ordinary Paranoia”; “Agent Foundations for Aligning Machine Intelligence with Human Interests”



Decision Theory

( Text Version  —  Illustrated Version )




Embedded World-Models

( Text Version  —  Illustrated Version )


Further reading: “The Problem with AIXI”



Robust Delegation

( Text Version  —  Illustrated Version )


Further reading: “Problem of Fully Updated Deference”



Subsystem Alignment

( Text Version  —  Illustrated Version )


