- New posts to the new AI Alignment Forum: “Buridan’s Ass in Coordination Games”; “Probability is Real, and Value is Complex”; “Safely and Usefully Spectating on AIs Optimizing Over Toy Worlds.”
- MIRI Research Associate Vadim Kosoy wins a $7,500 AI Alignment Prize for “The Learning-Theoretic AI Alignment Research Agenda.” Applications for the prize’s next round will be open through December 31.
- Interns from MIRI and the Center for Human-Compatible AI collaborated at an AI safety research workshop.
- This year’s AI Summer Fellows Program was very successful, and its one-day blogathon resulted in a number of interesting write-ups, such as “Dependent Type Theory and Zero-Shot Reasoning,” “Conceptual Problems with Utility Functions” (and follow-up), “Complete Class: Consequentialist Foundations,” and “Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet.”
- See Rohin Shah’s alignment newsletter for more discussion of recent posts to the new AI Alignment Forum.
News and links
- The Future of Humanity Institute is seeking project managers for its Research Scholars Programme and its Governance of AI Program.