Research updates
- “Incorrigibility in the CIRL Framework”: a new paper by MIRI assistant researcher Ryan Carey responds to Hadfield-Menell et al.’s “The Off-Switch Game”.
- New at IAFF: The Three Levels of Goodhart’s Curse; Conditioning on Conditionals; Stable Pointers to Value: An Agent Embedded in Its Own Utility Function; Density Zero Exploration; Autopoietic Systems and the Difficulty of AGI Alignment
- Ryan Carey is leaving MIRI to collaborate with the Future of Humanity Institute’s Owain Evans on AI safety work.
General updates
- As part of his engineering internship at MIRI, Max Harms helped build and extend RL-Teacher, an open-source tool for training AI systems with human feedback, based on the “Deep RL from Human Preferences” research collaboration between OpenAI and DeepMind. See OpenAI’s announcement.
- MIRI COO Malo Bourgon participated in panel discussions on getting things done (video) and working in AI (video) at the Effective Altruism Global conference in San Francisco. AI Impacts researcher Katja Grace also spoke on AI safety (video). Other EAG talks on AI included Daniel Dewey’s (video) and Owen Cotton-Barratt’s (video), as well as a larger panel discussion (video).
- Announcing two winners of the Intelligence in Literature prize: Laurence Raphael Brothers’ “Houseproud” and Shane Halbach’s “Human in the Loop”.
- RAISE, a project to develop online AI alignment course material, is seeking volunteers.
News and links
- The Open Philanthropy Project is accepting applicants to an AI Fellows Program “to fully support a small group of the most promising PhD students in artificial intelligence and machine learning”. See also Open Phil’s partial list of key research topics in AI alignment.
- Call for papers: AAAI and ACM are running a new Conference on AI, Ethics, and Society, with submissions due by the end of October.
- DeepMind’s Viktoriya Krakovna argues for a portfolio approach to AI safety research.
- “Teaching AI Systems to Behave Themselves”: a solid article from the New York Times on the growing field of AI safety research. The Times also has an opening for an investigative reporter in AI.
- UC Berkeley’s Center for Long-Term Cybersecurity is hiring for several roles, including researcher, assistant to the director, and program manager.
- Life 3.0: Max Tegmark releases a new book on the future of AI (podcast discussion).