- MIRI’s Evan Hubinger uses a notion of optimization power to define whether AI systems are compatible with the strategy-stealing assumption.
- MIRI’s Abram Demski discusses debate approaches to AI safety that don’t rely on factored cognition.
- Evan argues that the first AGI systems are likely to be very similar to each other, and discusses implications for alignment.
- Jack Clark’s Import AI newsletter discusses the negative research results from our end-of-year update.
- Richard Ngo shares high-quality discussion of his great introductory sequence AGI Safety from First Principles, featuring Paul Christiano, Max Daniel, Ben Garfinkel, Adam Gleave, Matthew Graves, Daniel Kokotajlo, Will MacAskill, Rohin Shah, Jaan Tallinn, and MIRI’s Evan Hubinger and Buck Shlegeris.
- Tom Chivers discusses the rationality community and Rationality: From AI to Zombies. (Contrary to the headline, COVID-19 receives little discussion.)
News and links
- Alex Flint argues that growing the AI alignment research field should be a side-effect of optimizing for “research depth”, rather than a target in its own right, much as software projects shouldn’t optimize for larger teams or larger codebases. Flint also comments on strategy/policy research.
- Daniel Kokotajlo of the Center on Long-Term Risk argues that AGI may enable decisive strategic advantage before GDP accelerates. Cf. a pithier comment from Eliezer Yudkowsky.
- TAI Safety Bibliographic Database: Jess Riedel and Angelica Deibel release a database of AI safety research, and analyze recent trends in the field.
- DeepMind’s AI safety team investigates optimality properties of meta-trained RNNs and the tampering problem: “How can we design agents that pursue a given objective when all feedback mechanisms for describing that objective are influenceable by the agent?”
- Facebook launches Forecast, a new community prediction platform akin to Metaculus.
- Effective altruists have released the microCOVID calculator, a very handy tool for assessing activities’ COVID-19 infection risk. Meanwhile, Zvi Mowshowitz’s weekly updates on LessWrong continue to be a good (US-centric) resource for staying up to date on COVID-19 developments such as the B117 variant.
- Rethink Priorities researcher Linchuan Zhang summarizes things he’s learned forecasting COVID-19 in 2020: forming good outside views is often hard; effective altruists tend to overrate superforecasters; etc.