Updates
- New research write-ups and discussions: Resource-Limited Reflective Oracles; Computing An Exact Quantilal Policy
- New at AI Impacts: Promising Research Projects
- MIRI research fellow Scott Garrabrant and associates Stuart Armstrong and Vanessa Kosoy are among the winners in the second round of the AI Alignment Prize. First place goes to Tom Everitt and Marcus Hutter’s “The Alignment Problem for History-Based Bayesian Reinforcement Learners.”
- Our thanks to the donors in REG’s Spring Matching Challenge and to online poker players Chappolini, donthnrmepls, FMyLife, ValueH, and xx23xx, who generously matched $47,000 in donations to MIRI, plus another $250,000 in donations to the Good Food Institute, GiveDirectly, and other charities.
News and links
- OpenAI’s charter predicts that “safety and security concerns will reduce [their] traditional publishing in the future” and emphasizes the importance of “long-term safety” and avoiding late-stage races between AGI developers.
- Matthew Rahtz recounts lessons learned while reproducing Christiano et al.’s “Deep Reinforcement Learning from Human Preferences.”