How could superintelligent systems be aligned with the interests of humanity? This annotated bibliography compiles some recent research relevant to that question and sorts it into six topics: (1) realistic world models; (2) idealized decision theory; (3) logical uncertainty; (4) Vingean reflection; (5) corrigibility; and (6) value learning. Within each topic, references are listed in an order amenable to learning the material. These are by no means the only six topics relevant to the study of alignment, but the bibliography should be useful to anyone who wants to understand the state of the art in one of these six areas of active research.
Today we’ve also released a page that collects the technical agenda and its supporting reports; see our Technical Agenda page.