MIRI Updates

The basic reasons I expect AGI ruin

I’ve been citing AGI Ruin: A List of Lethalities to explain why the situation with AI looks lethally dangerous to me. But that post is relatively long, and emphasizes specific open technical problems over “the basics”. Here are 10 things...

Misgeneralization as a misnomer

Here are two different ways an AI can turn out unfriendly: You somehow build an AI that cares about “making people happy”. In training, it tells people jokes and buys people flowers and offers people an ear when they need one....

Pausing AI Developments Isn’t Enough. We Need to Shut it All Down

(Published in TIME on March 29.) An open letter published today calls for “all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4.” This 6-month moratorium would be better...

Truth and Advantage: Response to a draft of “AI safety seems hard to measure”

Status: This was a response to a draft of Holden’s cold take “AI safety seems hard to measure”. It sparked further discussion, which Holden recently posted a summary of. The follow-up discussion ended up focusing on some issues in...

Deep Deceptiveness

Meta: This post is an attempt to gesture at a class of AI notkilleveryoneism (alignment) problem that seems to me to go largely unrecognized. E.g., it isn’t discussed (or at least I don’t recognize it) in the recent plans written...

Yudkowsky on AGI risk on the Bankless podcast

Eliezer gave a very frank overview of his take on AI two weeks ago on the cryptocurrency show Bankless. I’ve posted a transcript of the show and a follow-up Q&A below. Thanks to Andrea_Miotti, remember, and vonk for help posting...
