A major announcement today: the Open Philanthropy Project has granted MIRI $500,000 over the coming year to study the questions outlined in our agent foundations and machine learning research agendas, with a strong chance of renewal next year. This represents MIRI’s largest grant to date, and our second-largest single contribution.
Coming on the heels of a $300,000 donation by Blake Borgeson, this support will help us continue on the growth trajectory we outlined in our summer and winter fundraisers last year and effect another doubling of the research team. These growth plans assume continued support from other donors in line with our fundraising successes last year; we’ll be discussing our remaining funding gap in more detail in our 2016 fundraiser, which we’ll be kicking off later this month.
The Open Philanthropy Project is a joint initiative run by staff from the philanthropic foundation Good Ventures and the charity evaluator GiveWell. Open Phil has recently made it a priority to identify opportunities for researchers to address potential risks from advanced AI, and we consider their early work in this area promising. That work includes grants to Stuart Russell, Robin Hanson, and the Future of Life Institute, plus a stated interest in funding work related to “Concrete Problems in AI Safety,” a recent paper co-authored by four Open Phil technical advisers: Christopher Olah (Google Brain), Dario Amodei (OpenAI), Paul Christiano (UC Berkeley), and Jacob Steinhardt (Stanford), along with John Schulman (OpenAI) and Dan Mané (Google Brain).
Open Phil’s grant isn’t a full endorsement, and they note a number of reservations about our work in an extensive writeup detailing the thinking that went into the grant decision. Separately, Open Phil Executive Director Holden Karnofsky has written some personal thoughts about how his views of MIRI and the effective altruism community have evolved in recent years.
Open Phil’s decision was informed in part by their technical advisers’ evaluations of our recent work on logical uncertainty and Vingean reflection, together with reviews by seven anonymous computer science professors and one anonymous graduate student. The reviews, most of which are collected here, are generally negative. Reviewers felt that “Inductive coherence” and “Asymptotic convergence in online learning with unbounded delays” were not important results and that these research directions were unlikely to be productive. Open Phil’s advisers were skeptical or uncertain about the work’s relevance to aligning AI systems with human values.
It’s worth mentioning in that context that the results in “Inductive coherence” and “Asymptotic convergence…” led directly to a more significant unpublished result, logical induction, that we’ve recently discussed with Open Phil and members of the effective altruism community. The result is being written up, and we plan to put up a preprint soon. In light of this progress, we are more confident than the reviewers that Garrabrant et al.’s earlier papers represented important steps in the right direction. If this wasn’t apparent to reviewers, then it could suggest that our exposition is weak, or that the importance of our results was inherently difficult to assess from the papers alone.
In general, I think the reviewers’ criticisms are reasonable — either I agree with them, or I think it would take a longer conversation to resolve the disagreement. The level of detail and sophistication of the comments is also quite valuable.
The content of the reviews was mostly in line with my advance predictions, though my predictions were low-confidence. I’ve written up quick responses to some of the reviewers’ comments, with my predictions and some observations from Eliezer Yudkowsky included in appendices. This is likely to be the beginning of a longer discussion of our research priorities and progress, as we have yet to write up our views on a lot of these issues in any detail.
We’re very grateful for Open Phil’s support, and also for the (significant) time they and their advisers spent assessing our work. This grant follows a number of challenging and deep conversations with researchers at GiveWell and Open Phil about our organizational strategy over the years, which have helped us refine our views and arguments.
Past public exchanges between MIRI and GiveWell / Open Phil staff include:
- May/June/July 2012 – Holden Karnofsky’s critique of MIRI (then SI), Eliezer Yudkowsky’s reply, and Luke Muehlhauser’s reply.
- October 2013 – Holden, Eliezer, Luke, Jacob Steinhardt, and Dario Amodei’s discussion of MIRI’s strategy.
- January 2014 – Holden, Eliezer, and Luke’s discussion of existential risk.
- February 2014 – Holden, Eliezer, and Luke’s discussion of future-oriented philanthropy.