MIRI Updates

Updates: New posts to the new AI Alignment Forum: Buridan’s Ass in Coordination Games; Probability is Real, and Value is Complex; Safely and Usefully Spectating on AIs Optimizing Over Toy Worlds. MIRI Research Associate Vanessa Kosoy wins a $7500 AI...

Updates: A new paper: “Forecasting Using Incomplete Models.” New research write-ups and discussions: Prisoners’ Dilemma with Costs to Modeling; Counterfactual Mugging Poker Game; Optimization Amplifies. Eliezer Yudkowsky, Paul Christiano, Jessica Taylor, and Wei Dai discuss Alex Zhu’s FAQ for Paul’s...

Updates: New research write-ups and discussions: Logical Inductors Converge to Correlated Equilibria (Kinda). MIRI researcher Tsvi Benson-Tilsen and Alex Zhu ran an AI safety retreat for MIT students and alumni. Andrew Critch discusses what kind of advice to give to...

Updates: New research write-ups and discussions: Resource-Limited Reflective Oracles; Computing An Exact Quantilal Policy. New at AI Impacts: Promising Research Projects. MIRI research fellow Scott Garrabrant and associates Stuart Armstrong and Vanessa Kosoy are among the winners in the second...

Updates: A new paper: “Categorizing Variants of Goodhart’s Law.” New research write-ups and discussions: Distributed Cooperation; Quantilal Control for Finite Markov Decision Processes. New at AI Impacts: Transmitting Fibers in the Brain: Total Length and Distribution of Lengths. Scott Garrabrant,...

Updates: New research write-ups and discussions: Knowledge is Freedom; Stable Pointers to Value II: Environmental Goals; Toward a New Technical Explanation of Technical Explanation; Robustness to Scale. New at AI Impacts: Likelihood of Discontinuous Progress Around the Development of AGI...
