Category: Analysis

What I mean by “alignment is in large part about making cognition aimable at all”

(Epistemic status: attempting to clear up a misunderstanding about points I have attempted to make in the past. This post is not intended as an argument for those points.) I have long said that the lion’s share of the AI...

A central AI alignment problem: capabilities generalization, and the sharp left turn

(This post was factored out of a larger post that I (Nate Soares) wrote, with help from Rob Bensinger, who also rearranged some pieces and added some text to smooth things out. I’m not terribly happy with it, but am...

AGI Ruin: A List of Lethalities

Preamble: (If you’re already familiar with all the basics and don’t want any preamble, skip ahead to Section B for technical difficulties of alignment proper.) I have several times failed to write up a well-organized list of reasons why AGI will...

Six Dimensions of Operational Adequacy in AGI Projects

Editor’s note: The following is a lightly edited copy of a document written by Eliezer Yudkowsky in November 2017. Since this is a snapshot of Eliezer’s thinking at a specific time, we’ve sprinkled reminders throughout that this is from 2017....

Shah and Yudkowsky on alignment failures

This is the final discussion log in the Late 2021 MIRI Conversations sequence, featuring Rohin Shah and Eliezer Yudkowsky, with additional comments from Rob Bensinger, Nate Soares, Richard Ngo, and Jaan Tallinn. The discussion begins with summaries and comments...

Ngo and Yudkowsky on scientific reasoning and pivotal acts

This is a transcript of a conversation between Richard Ngo and Eliezer Yudkowsky, facilitated by Nate Soares (and with some comments from Carl Shulman). This transcript continues the Late 2021 MIRI Conversations sequence, following Ngo’s view on alignment difficulty...