Posts By: Nate Soares

(Epistemic status: attempting to clear up a misunderstanding about points I have attempted to make in the past. This post is not intended as an argument for those points.) I have long said that the lion’s share of the AI alignment problem seems to me to be about pointing powerful cognition at anything at all, rather… Read more »
A central AI alignment problem: capabilities generalization, and the sharp left turn
(This post was factored out of a larger post that I (Nate Soares) wrote, with help from Rob Bensinger, who also rearranged some pieces and added some text to smooth things out. I’m not terribly happy with it, but am posting it anyway (or, well, having Rob post it on my behalf while I travel)… Read more »
Visible Thoughts Project and Bounty Announcement
(Update Jan. 12, 2022: We released an FAQ last month, with more details. Last updated Jan. 7.) (Update Jan. 19, 2022: We now have an example of a successful partial run, which you can use to inform how you do your runs. Details.) (Update Mar. 14, 2023: As of now the limited $20,000 prizes are… Read more »
2018 Update: Our New Research Directions
For many years, MIRI’s goal has been to resolve enough fundamental confusions around alignment and intelligence to enable humanity to think clearly about technical AI safety risks—and to do this before this technology advances to the point of potential catastrophe. This goal has always seemed to us to be difficult, but possible. Last year, we… Read more »
Ensuring smarter-than-human intelligence has a positive outcome
I recently gave a talk at Google on the problem of aligning smarter-than-human AI with operators’ goals. The talk was inspired by “AI Alignment: Why It’s Hard, and Where to Start,” and serves as an introduction to the subfield of alignment research in AI. A modified transcript follows. Talk outline (slides): 1. Overview… Read more »
Post-fundraiser update
We concluded our 2016 fundraiser eleven days ago. Progress was slow at first, but our donors came together in a big way in the final week, nearly doubling the total raised. In the end, donors raised $589,316 over six weeks, making this our second-largest fundraiser to date. I’m heartened by this show of support, and… Read more »
MIRI’s 2016 Fundraiser
Update December 22: Our donors came together during the fundraiser to get us most of the way to our $750,000 goal. In all, 251 donors contributed $589,248, making this our second-biggest fundraiser to date. Although we fell short of our target by $160,000, we have since made up this shortfall thanks to November/December donors. I’m… Read more »
New paper: “Logical induction”
MIRI is releasing a paper introducing a new model of deductively limited reasoning: “Logical induction,” authored by Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, myself, and Jessica Taylor. Readers may wish to start with the abridged version. Consider a setting where a reasoner is observing a deductive process (such as a community of mathematicians and computer… Read more »
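To make the setting in that excerpt concrete, here is a minimal, hypothetical Python sketch: a slow deductive process settles one sentence per step, and a reasoner must assign probabilities to sentences that have not yet been settled. The names (is_prime, ToyReasoner) and the simple frequency heuristic are illustrative inventions for this sketch only; the paper’s actual construction (a market of pattern-exploiting traders) is far more general and is not implemented here.

```python
def is_prime(n: int) -> bool:
    """Ground truth that the deductive process will eventually reveal."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True


class ToyReasoner:
    """Assigns credences to unsettled sentences of the form "n is prime",
    using the base rate among sentences the deductive process has already
    settled. A crude stand-in for the paper's trader-based construction."""

    def __init__(self) -> None:
        self.settled: dict[int, bool] = {}

    def observe(self, n: int, truth: bool) -> None:
        # The deductive process has settled the sentence "n is prime".
        self.settled[n] = truth

    def credence(self, n: int) -> float:
        if n in self.settled:      # already proved or refuted
            return 1.0 if self.settled[n] else 0.0
        if not self.settled:       # no evidence yet: stay maximally uncertain
            return 0.5
        # Frequency of "prime" among settled sentences so far.
        hits = sum(self.settled.values())
        return hits / len(self.settled)


# The process settles "n is prime" on day n; the reasoner is asked about
# a sentence (n = 1000) well before the process gets around to settling it.
reasoner = ToyReasoner()
for day in range(2, 200):
    reasoner.observe(day, is_prime(day))
print(f"credence that 1000 is prime: {reasoner.credence(1000):.3f}")
```

The point of the sketch is only the shape of the problem: reasonable beliefs about logical claims must be available before proofs arrive, and should improve as the deductive process reveals more. The logical induction criterion in the paper characterizes what "reasonable" means far more stringently than this base-rate heuristic does.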