The AI Alignment Forum has left beta!
Dovetailing with the launch, MIRI researchers Scott Garrabrant and Abram Demski will be releasing a new sequence introducing our research over the coming week, beginning here: Embedded Agents. (Shorter illustrated version here.)
Other updates
- New posts to the forum: Cooperative Oracles; When Wishful Thinking Works; (A → B) → A; Towards a New Impact Measure; In Logical Time, All Games are Iterated Games; EDT Solves 5-and-10 With Conditional Oracles
- The Rocket Alignment Problem: Eliezer Yudkowsky considers a hypothetical world without knowledge of calculus or celestial mechanics to illustrate MIRI’s research and what we take to be the world’s current level of understanding of AI alignment. (Also on LessWrong.)
- More on MIRI’s AI safety angle of attack: a comment on decision theory.
News and links
- DeepMind’s safety team launches its own blog, with an inaugural post on specification, robustness, and assurance.
- Will MacAskill discusses moral uncertainty on FLI’s AI safety podcast.
- Google Brain announces the Unrestricted Adversarial Examples Challenge.
- The 80,000 Hours job board has many new postings, including head of operations for FHI, COO for BERI, and programme manager for CSER. Also taking applicants: summer internships at CHAI, and a scholarship program from FHI.