In other news, MIRI researcher Buck Shlegeris has written over 12,000 words on a variety of MIRI-relevant topics in an EA Forum AMA. (Example topics: advice for software engineers; what alignment plans tend to look like; and decision theory.)
Other updates
- Abram Demski's The Parable of Predict-O-Matic is a great read: the predictor/optimizer issues it covers are deep, but I expect a fairly wide range of readers to enjoy it and get something out of it.
- Evan Hubinger's Gradient Hacking describes an important failure mode that hadn't previously been articulated.
- Vanessa Kosoy has recently discussed some especially interesting topics related to her learning-theoretic agenda on her LessWrong shortform.
- Stuart Armstrong's All I Know Is Goodhart represents nice conceptual progress on expected-value maximizers that are aware of Goodhart's law and try to avoid its pitfalls.
- Reddy, Dragan, and Levine's paper on modeling human intent cites (of all things) Harry Potter and the Methods of Rationality as inspiration.
News and links
- Artificial Intelligence Research Needs Responsible Publication Norms: Crootof provides a good review of the issue on Lawfare.
- Stuart Russell's new book is out: Human Compatible: Artificial Intelligence and the Problem of Control (excerpt). Rohin Shah's review does an excellent job of contextualizing Russell's views within the larger AI safety ecosystem, and Rohin highlights this quote from the book:
> The task is, fortunately, not the following: given a machine that possesses a high degree of intelligence, work out how to control it. If that were the task, we would be toast. A machine viewed as a black box, a fait accompli, might as well have arrived from outer space. And our chances of controlling a superintelligent entity from outer space are roughly zero. Similar arguments apply to methods of creating AI systems that guarantee we won’t understand how they work; these methods include whole-brain emulation — creating souped-up electronic copies of human brains — as well as methods based on simulated evolution of programs. I won’t say more about these proposals because they are so obviously a bad idea.
- Jacob Steinhardt releases an AI Alignment Research Overview.
- Patrick LaVictoire's AlphaStar: Impressive for RL Progress, Not for AGI Progress raises some important questions about how capable today's state-of-the-art systems are.