Embedded Agency is a write-up by Abram Demski and Scott Garrabrant, available on the AI Alignment Forum. There’s also a shorter version of the post as a hand-drawn sequence, and a lightly rewritten version on arXiv.

Further reading: “Security Mindset and Ordinary Paranoia”; “Agent Foundations for Aligning Machine Intelligence with Human Interests”



Decision Theory

Embedded World-Models

Further reading: “The Problem with AIXI”



Robust Delegation

Further reading: “Problem of Fully Updated Deference”



Subsystem Alignment

