This post is a follow-up to Malo’s 2015 review, sketching out our new 2016-2017 plans. Briefly, our top priorities (in decreasing order of importance) are to (1) make technical progress on the research problems we’ve identified, (2) expand our team, and (3) build stronger ties to the wider research community.
As discussed in a previous blog post, the biggest update to our research plans is that we’ll be splitting our time going forward between our 2014 research agenda (the “agent foundations” agenda) and a new research agenda oriented toward machine learning work led by Jessica Taylor: “Alignment for Advanced Machine Learning Systems.”
Three additional news items:
1. I’m happy to announce that MIRI has received support from a major new donor: entrepreneur and computational biologist Blake Borgeson, who has made a $300,000 donation to MIRI. This is the second-largest donation MIRI has received in its history, beaten only by Jed McCaleb’s 2013 cryptocurrency donation. As a result, we’ve been able to execute on our growth plans with more speed, confidence, and flexibility.
2. This year, instead of running separate summer and winter fundraisers, we’re merging them into a single, more ambitious fundraiser, which will take place in September.
3. I’m also pleased to announce that Abram Demski has accepted a position as a MIRI research fellow. Additionally, Ryan Carey has accepted a position as an assistant research fellow, and we’ve hired some new administrative staff.
I’ll provide more details on these and other new developments below.
Priority 1: Make progress on open technical problems
Since 2013, MIRI’s primary goal has been to make technical progress on AI alignment. Nearly all of our other activities are aimed, directly or indirectly, at producing more high-quality alignment research, whether at MIRI or at other institutions.
As mentioned above, Jessica Taylor is now leading an “Alignment for Advanced Machine Learning Systems” program, which will occupy about half of our research efforts going forward. Our goal with this work will be to develop formal models and theoretical tools that we predict would aid in the alignment of highly capable AI systems, under the assumption that such systems will be qualitatively similar to present-day machine learning systems. Our research communications manager, Rob Bensinger, has summarized themes in our new work and its relationship to other AI safety research proposals.
Earlier in the year, I jotted down a summary of how much technical progress I thought we’d made on our research agenda in 2015 (noted by Malo in our 2015 review), relative to my expectations. In short, I expected modest progress in all of our research areas except value specification (which was low-priority for us in 2015). We made progress more quickly than expected on some problems, and more slowly than expected on others.
In naturalized induction and logical uncertainty, we exceeded my expectations, making sizable progress. In error tolerance, we undershot my expectations and made only limited progress. In our other research areas, we made about as much progress as I expected: modest progress in decision theory and Vingean reflection, and limited progress in value specification.
I also made personal predictions earlier in the year about how much progress we’d make through the end of 2016: modest progress in decision theory, error tolerance, and value specification; limited progress in Vingean reflection; and sizable progress in logical uncertainty and naturalized induction. (Starting in 2017, I’ll be making my predictions public early in the year.)
Breaking these down:
- Vingean reflection is a lower priority for us this year. This is in part because we’re less confident that there’s additional low-hanging fruit to be plucked here, absent additional progress in logical uncertainty or decision theory. Although we’ve been learning a lot about implementation snags through Benya Fallenstein, Ramana Kumar, and Jack Gallagher’s ongoing HOL-in-HOL project, we haven’t seen any major theoretical breakthroughs in this area since Benya developed model polymorphism in late 2012. Benya and Kaya Fallenstein are still studying this topic occasionally.
- In contrast, we’ve continued to make steady gains in the basic theory of logical uncertainty, naturalized induction, and decision theory over the years. Benya, Kaya, Abram, Scott Garrabrant, Vanessa Kosoy, and Tsvi Benson-Tilsen will be focusing on these areas over the coming months, and I expect to see advances in 2016 of similar importance to what we saw in 2015.
- Our machine learning agenda is primarily focused on error tolerance and value specification, making these much higher priorities for us this year. I expect to see modest progress from Jessica Taylor, Patrick LaVictoire, Andrew Critch, Stuart Armstrong, and Ryan Carey’s work on these problems. It’s harder to say whether there will be any big breakthroughs here, given how new the program is.
Eliezer Yudkowsky and I will be splitting our time between working on these problems and doing expository writing. Eliezer is writing about alignment theory, while I’ll be writing about MIRI strategy and forecasting questions.
We spent large portions of the first half of 2016 writing up existing results and research proposals, and coordinating with other researchers (such as through our visit to FHI and our Colloquium Series on Robust and Beneficial AI, CSRBAI), and we have a bit more writing ahead of us in the coming weeks. We managed to get a fair bit of research done (we’ll be announcing a sizable new logical uncertainty result once the aforementioned writing is finished), but we’re looking forward to a few months of uninterrupted research time at the end of the year, and I’m excited to see what comes of it.
Priority 2: Expand our team
Growing MIRI’s research team is a high priority. We’re also expanding our admin team, with a goal of freeing up more of my time and better positioning MIRI to positively influence the booming AI risk conversation.
After making significant contributions to our research over the past year as a research associate (e.g., “Inductive Coherence” and Structural Risk Mitigation) and participating in our CSRBAI and MIRI Summer Fellows programs, Abram Demski has signed on to join our core research team. Abram is planning to join in late 2016 or early 2017, after completing his computer science PhD at the University of Southern California. Mihály Bárász is also slated to join our core research team at a future date, and we are considering several other promising candidates for research fellowships.
In the nearer term, data scientist Ryan Carey has been collaborating with us on our machine learning agenda and will be joining us as an assistant research fellow in September.
We’ve also recently hired a new office manager, Aaron Silverbook, and a communications and development admin, Colm Ó Riain.
We have an open job ad for a type theorist, and we are more generally seeking research fellows with strong mathematical intuitions and a talent for formalizing and solving difficult problems, or for fleshing out and writing up results for publication.
We’re also seeking communications and outreach specialists (e.g., computer programmers with very strong writing skills) to help us keep pace with the lively public and academic AI risk conversation. If you’re interested, send a résumé and nonfiction writing samples to Rob.
Priority 3: Collaborate and communicate with other researchers
There have been a number of new signs in 2016 that AI alignment is going (relatively) mainstream:
- Stuart Russell and his students’ recent work on value learning and corrigibility (including a joint grant project with MIRI);
- positive reactions from Eric Schmidt and the press to a Google DeepMind / Future of Humanity Institute collaboration on corrigibility (partly supported by MIRI);
- the new “Concrete Problems in AI Safety” research proposal announced by Google Research and OpenAI (along with the Open Philanthropy Project’s declaration of interest in funding such research);
- and other, smaller developments.
MIRI’s goal is to ensure that the AI alignment problem gets solved, whether by MIRI or by some other group. As such, we’re excited by the new influx of attention directed at the alignment problem, and view this as an important time to nurture the field.
As AI safety research goes more mainstream, the pool of researchers we can dialogue with is becoming larger. At the same time, our own approach to the problem — specifically focused on the most long-term, high-stakes, and poorly understood parts of the problem, and the parts that are least concordant with academic and industry incentives — remains unusual. Absent MIRI, I think that this part of the conversation would be almost entirely neglected.
To help promote our approach and grow the field, we intend to host more workshops aimed at diverse academic audiences. We’ll be hosting a machine learning workshop in the near future, and might run more events like CSRBAI going forward. We also have a backlog of past technical results to write up, which we expect to be valuable for engaging more researchers in computer science, economics, mathematical logic, decision theory, and other areas.
We’re especially interested in finding ways to hit priorities 1 and 3 simultaneously, pursuing important research directions that also help us build stronger ties to the wider academic world. One of several reasons for our new research agenda is its potential to encourage more alignment work by the ML community.
Short version: in the medium term, our research program will have a larger focus on error tolerance and value specification research, with more emphasis on ML-inspired AI approaches, and we’re increasing the size of our research team in pursuit of that goal.
Rob, Malo, and I will be saying more about our funding situation and organizational strategy in September, when we kick off our 2016 fundraising drive. As part of that series of posts, I’ll also be writing more about how our current strategy fits into our long-term goals and priorities.
Finally, if you’re attending Effective Altruism Global this weekend, note that we’ll be running two workshops (one on Jessica’s new project, another on the aforementioned new logical uncertainty results), as well as some office hours (both with the research team and with the admin team). If you’re there, feel free to drop by, say hello, and ask more about what we’ve been up to.