MIRI updates
- MIRI's $1.2 million Visible Thoughts Project bounty now has an FAQ, and an example of a successful partial run that you can use to inform your own runs.
- Scott Alexander reviews the first part of the Yudkowsky/Ngo debate. See also Richard Ngo's reply, and Rohin Shah's review of several posts from the Late 2021 MIRI Conversations.
- From Evan Hubinger: How do we become confident in the safety of an ML system? and A positive case for how we might succeed at prosaic AI alignment (with discussion in the comments).
- The ML Alignment Theory Scholars program, mentored by Evan Hubinger and run by SERI, has produced a series of distillations and expansions of prior alignment-relevant research.
- MIRI ran a small workshop this month on what makes some concepts better than others, motivated by the question of how revolutionary science works (i.e., science aimed at discovering new questions to ask, new ontologies, and new concepts).
News and links
- Daniel Dewey makes his version of the case that future advances in deep learning pose a "global risk".
- Buck Shlegeris of Redwood Research discusses Worst-Case Thinking in AI Alignment.
- From Paul Christiano: Why I'm excited about Redwood Research's current project.
- Paul Christiano's Alignment Research Center is hiring researchers and research interns.