This is the 3rd part of my personal and qualitative self-review of MIRI in 2013, in which I begin to review MIRI’s 2013 research activities. By “research activities” I mean to include outreach efforts primarily aimed at researchers, and also three types of research performed by MIRI:
- Expository research aims to consolidate and clarify already-completed strategic research or Friendly AI research that hasn’t yet been explained with sufficient clarity or succinctness, e.g. “Intelligence Explosion: Evidence and Import” and “Robust Cooperation: A Case Study in Friendly AI Research.” (I consider this a form of “research” because it often requires significant research work to explain ideas clearly, cite relevant sources, etc.)
- Strategic research aims to clarify how the future is likely to unfold, and what we can do now to nudge the future toward good outcomes. It involves more novel thought and modeling than expository research, though the distinction is fuzzy. See e.g. “Intelligence Explosion Microeconomics” and “How We’re Predicting AI — or Failing to.”1
- Friendly AI research aims to solve the technical sub-problems that seem most relevant to the challenge of designing a stably self-improving artificial intelligence with humane values. This often involves sharpening philosophical problems into math problems, and then developing the math problems into engineering problems. See e.g. “Tiling Agents for Self-Modifying AI” and “Robust Cooperation in the Prisoner’s Dilemma.”
I’ll review MIRI’s strategic and expository research in this post; my review of MIRI’s 2013 Friendly AI research will appear in a future post. For the rest of this post, I usually won’t try to distinguish which writings are “expository” vs. “strategic” research, since most of them are partially of both kinds.
Strategic and expository research in 2013
- In 2013, our public-facing strategic and expository research consisted of 4 papers published directly by MIRI, 4 journal-targeted papers, 4 chapters in a peer-reviewed book, 9 in-depth analysis blog posts, 14 short analysis blog posts, and 16 interviews with domain experts.
- I think these efforts largely accomplished the goals at which they were aimed, but in 2013 we learned a great deal about how to accomplish those goals more efficiently in the future. In particular…
- Expert interviews seem to be the most efficient way to accomplish some of those goals.
- Rather than conducting large strategic research projects ourselves, we should focus on writing up what is already known (“expository research”) and on describing open research questions so that others can examine them.
What we did in 2013 and why
Below I list the writings that constitute MIRI’s public-facing2 strategic and expository research in 2013.
- 4 papers published directly by MIRI: (1) Yudkowsky’s “Intelligence Explosion Microeconomics,”3 (2) Sotala & Yampolskiy’s “Responses to Catastrophic AGI Risk: A Survey,” (3) Grace’s “Algorithmic Progress in Six Domains,” and (4) Fallenstein & Mennen’s “Predicting AGI: What can we say when we know so little?”
- 4 journal-targeted papers, two of them published and two of them still being considered by target journals: (1) Shulman & Bostrom’s “Embryo Selection for Cognitive Enhancement,” (2) Armstrong et al.’s “Racing to the Precipice,” (3) Yampolskiy & Fox’s “Safety Engineering for Artificial General Intelligence,”4 and (4) Muehlhauser & Bostrom’s “Why We Need Friendly AI.”5
- 4 chapters in a peer-reviewed book from Springer called Singularity Hypotheses: A Scientific and Philosophical Assessment. Three chapters were written by MIRI staff members: “Intelligence Explosion: Evidence and Import,” “Intelligence Explosion and Machine Ethics,” and “Friendly Artificial Intelligence.”6 One additional chapter was co-authored by then-MIRI research associate Joshua Fox: “Artificial Intelligence and the Human Mental Model.” MIRI also contributed two short replies to other chapters, one reply by Yudkowsky and another by Michael Anissimov.7
- 9 in-depth “analysis” blog posts: (1) Sotala’s “A brief history of ethically concerned scientists,” (2) Kaas’ “Bayesian Adjustment Does Not Defeat Existential Risk Charity,” Yudkowsky’s (3) “The Robots, AI, and Unemployment Anti-FAQ” and (4) “Pascal’s Muggle,” and Muehlhauser’s (5) “AGI Impact Experts and Friendly AI Experts,” (6) “When Will AI Be Created?”, (7) “Transparency in Safety-Critical Systems,” (8) “How effectively can we plan for future decades? (initial findings),” and (9) “How well will policy-makers handle AGI? (initial findings).”
- 14 short analysis blog posts: Yudkowsky’s (1) “Five theses, two lemmas, and a couple of strategic implications,” (2) “After critical event W happens, they still won’t believe you,” (3) “Do Earths with slower economic growth have a better chance at FAI?,” and (4) “Being Half-Rational About Pascal’s Wager is Even Worse,” and Muehlhauser’s (5) “Friendly AI Research as Effective Altruism,” (6) “What is intelligence?”, (7) “What is AGI?”, (8) “AI Risk and the Security Mindset,” (9) “Mathematical Proofs Improve But Don’t Guarantee Security, Safety, and Friendliness,” (10) “Richard Posner on AI Dangers,” (11) “Russell and Norvig on Friendly AI,” (12) “From Philosophy to Math to Engineering,” (13) “Intelligence Amplification and Friendly AI,” and (14) “Model Combination and Adjustment.”
- 16 interviews with domain experts: (1) James Miller on unusual incentives facing AGI companies, (2) Roman Yampolskiy on AI safety engineering, (3) Nick Beckstead on the importance of the far future, (4) Benja Fallenstein on the Löbian obstacle to self-modifying systems, (5) Holden Karnofsky on transparent research analyses, (6) Stephen Hsu on Cognitive Genomics, (7) Laurent Orseau on Artificial General Intelligence, (8) Paul Rosenbloom on Cognitive Architectures, (9) Ben Goertzel on AGI as a Field, (10) Hadi Esmaeilzadeh on Dark Silicon, (11) Bas Steunebrink on Self-Reflective Programming, (12) Markus Schmidt on Risks from Novel Biotechnologies, (13) Robin Hanson on Serious Futurism, (14) Greg Morrisett on Secure and Reliable Systems, (15) Scott Aaronson on Philosophical Progress, and (16) Josef Urban on Machine Learning and Automated Reasoning.8
- One recorded and transcribed conversation about effective altruism in general, with other members of the effective altruism movement: Effective Altruism and Flow-Through Effects.
MIRI staff members have varying opinions about the value and purpose of strategic and expository research. Speaking for myself, I supported or conducted the above research activities in order to:9
1. Test our assumptions and try to understand the views of people who (might) disagree with us. Examples: “How effectively can we plan for future decades?”, “How well will policy-makers handle AGI?”, and the Greg Morrisett interview.
2. Learn new things that can inform strategic action concerning existential risk and Friendly AI. Examples: “Algorithmic Progress in Six Domains,” the Hadi Esmaeilzadeh interview, and the Josef Urban interview.
3. Make it easier for other researchers to contribute, by performing small bits of initial work on questions of strategic significance, or by explaining how an open question in superintelligence strategy could be studied in more depth. Examples: “Intelligence Explosion Microeconomics,” “Algorithmic Progress in Six Domains,” and “How effectively can we plan for future decades?”
4. Build relationships with researchers who might one day contribute to strategic, expository, or Friendly AI research. Examples: many of the interviews.
5. Explain small “pieces of the puzzle” that contribute to MIRI-typical views about existential risk and Friendly AI. Examples: “When Will AI Be Created?”, “Mathematical Proofs Improve But Don’t Guarantee…,” and the Nick Beckstead interview.
How well did these efforts achieve their goals?
We have not yet implemented quantitative methods for measuring how well our strategic and expository research efforts are meeting the goals at which they are aimed.10 For now, I can only share my subjective, qualitative impressions, based on my own reasoning and a few conversations I had with some people who follow our research closely, after showing them a near-complete draft of the previous section.
Re: goal (1). It’s difficult to locate cheap, strong tests of our assumptions. So the research we conducted toward this goal in 2013 either weakly confirmed some of our assumptions (e.g. see the Greg Morrisett interview11) or could make only small steps toward providing good tests of our assumptions (e.g. see “How effectively can we plan for future decades?” and “How well will policy-makers handle AGI?”).
Re: goal (2). Similarly, it’s difficult to locate inexpensive evidence that robustly pins down the value of an important strategic variable (e.g. AI timelines, AI takeoff speed, or the strength of the “convergent instrumental values” attractor in mind design space). Hence, research aimed at learning new things typically only provides small updates (for us, anyway), e.g. about the prospects for Moore’s law (the Hadi Esmaeilzadeh interview) and about the current state of automated mathematical reasoning (the Josef Urban interview).
My own reaction to the difficulty of obtaining additional high-likelihood-ratio evidence about long-term AI futures goes something like this:
Well, the good news is that humanity seems to have seized most of the low-hanging fruit about future machine superintelligence, which wasn’t the case 15 years ago. The bad news is that the low-hanging fruit alone doesn’t make it clear how we go about winning. But since the stakes are really high, we just have to accept that long-term forecasting is hard, and then try harder. We need to get more researchers involved so more research can be produced, and we must be prepared to accept that it might take 10 PhD theses’ worth of work before we get a 2:1 Bayesian update about a strategically relevant variable. Also, it’s probably good to “marinate” one’s brain in relevant fields even if one isn’t sure which specific updates one will be able to make as a result, because filling one’s brain with facts about relevant fields will likely improve one’s intuitions in general about those fields and adjacent fields.12
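To make the size of a “2:1 Bayesian update” concrete, here is a minimal sketch of Bayes’ rule in odds form (posterior odds = prior odds × likelihood ratio). The function name `update` and the three example priors are illustrative assumptions, not figures from MIRI’s research.

```python
from fractions import Fraction


def update(prior: Fraction, likelihood_ratio: Fraction) -> Fraction:
    """Return the posterior probability after one update in odds form.

    `update` is a hypothetical helper name used only for this illustration.
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Illustrative priors only; a 2:1 update means a likelihood ratio of 2.
for prior in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    posterior = update(prior, Fraction(2))
    print(f"prior {float(prior):.2f} -> posterior {float(posterior):.3f}")
```

Under these assumed priors, a 2:1 update moves 10% to about 18%, 50% to about 67%, and 90% to about 95%. The shift is real but modest, which is why evidence of this strength can still be expensive relative to what it buys.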
Re: goal (3). I don’t have a good sense of how useful MIRI’s 2013 strategic and expository research has been for other researchers, but such effects typically require several years to materialize.13 I’m optimistic about this work enabling further research by others simply because that’s how things typically work in other fields of research, and I don’t see much reason to think that superintelligence strategy will be any different.
Re: goal (4). Yes, many of the interviews built new relationships with helpful domain experts.
Re: goal (5). Again, I don’t have good measures of the effects here, but I do receive frequent comments from community members that “such-and-such post was really clarifying.” Some of the analyses are also linked regularly by other groups. For example, both GiveWell and 80,000 Hours have linked to our model combination post when explaining their own research strategies.
Looking ahead to 2014
As discussed above and in my operations review, we still need to find better ways to measure the impact of our research. A plausible first-try measurement technique would be to survey a subset of the people we hope to impact in various ways, and ask how our research has impacted them.
Even before we can learn from improved impact measurement, however, I think I can say a few things about what I’ve learned about doing strategic and expository research, and what we plan to do differently in 2014.
First, interviews with domain experts are a highly efficient way to achieve some of the goals I have for expository and strategic research. Each interview required only a few hours of staff time, whereas a typical “short” analysis post cost between 5 and 25 person-hours, and a typical “in-depth” analysis post cost between 10 and 60 person-hours.
In 2013 we published 16 domain expert interviews between July 1st and December 30th, an average of about 2.67 interviews per month. In 2014 I intend to publish 4 or more interviews per month on average.
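The arithmetic behind these figures can be checked with a short sketch. The counts and hour ranges come from this post; treating July through December as exactly six months, and multiplying “typical” per-post costs by the number of posts, are simplifying assumptions, so the totals are rough bounds rather than measured costs.

```python
# Figures taken from this post; variable names are illustrative.
interviews_2013 = 16
months_covered = 6          # assumption: July through December counted as 6 months
target_per_month_2014 = 4   # stated minimum average for 2014

rate_2013 = interviews_2013 / months_covered
print(f"2013 rate: {rate_2013:.2f} interviews/month")
print(f"2014 target is {target_per_month_2014 / rate_2013:.1f}x the 2013 rate")
print(f"2014 target implies at least {target_per_month_2014 * 12} interviews")

# Rough person-hour bounds for analysis posts: (count, low hours, high hours).
posts = {"short": (14, 5, 25), "in-depth": (9, 10, 60)}
for kind, (count, low, high) in posts.items():
    print(f"{kind} posts: {count * low}-{count * high} person-hours in total")
```

The interview cost is left out of the comparison because the post gives it only as “a few hours” per interview.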
Second, expository research tends to be more valuable per unit effort than new strategic research. MIRI (in conjunction with our collaborators at FHI) has an uncommonly large backlog of strategic research that has been “completed” but not explained clearly anywhere. Obviously, it takes less effort to explain already-completed strategic research than it takes to conduct original strategic research and then also explain it.
Third, we can prioritize expository (and sometimes strategic) research projects by dialoguing with intelligent critics who are representative of populations we want to influence (e.g. AI researchers, mega-philanthropists) and then preparing the writings most relevant to their concerns. We can then dialogue with them again after they’ve read the new exposition, and see whether that particular objection remains, and if so why, and if not then what other objections remain — which can in turn inform our prioritization of future writings, and also potentially reveal flaws in our models.
Fourth, students want to know which research projects they could do that would help clarify superintelligence strategy. Unfortunately, experienced professors are not yet knocking down our door to ask us which papers they could research and write to clarify superintelligence strategy, but many graduate students are. Also, I’ve had a few conversations with graduate student advisors who said they have to put lots of time into helping their students find good projects, and that it would be helpful if somebody else prepared research project proposals suitable for their students and their department.
Furthermore, there is some historical precedent for this strategy working, even within the young, narrow domain of superintelligence strategy. The clearest example is that of Nick Beckstead, who wrote a useful philosophy dissertation on the importance of shaping the far future, in part due to conversations with FHI. João Lourenço is currently writing a philosophy dissertation about the prospects for moral enhancement, in part due to conversations with FHI and MIRI. Jeremy Miller is in the early planning stages of a thesis project about universal measures of intelligence, in part due to conversations with MIRI. I think there are other examples, but I haven’t been able to confirm them yet.
So, in 2014 we plan to publish short descriptions of research projects which could inform superintelligence strategy. This will be much easier to do once Nick Bostrom’s Superintelligence book is published, so we’ll probably wait until that happens this summer.
Fifth, Nick Bostrom’s forthcoming scholarly monograph on machine superintelligence provides a unique opportunity to engage more researchers in superintelligence strategy. As such, some of our “outreach to potential strategic researchers” work in 2014 will consist in helping to promote Bostrom’s book. We also plan to release a reading guide for the book, to increase the frequency with which people finish, and benefit from, the book.
1. Note that what I call “MIRI’s strategic research” or “superintelligence strategy research” is a superintelligence-focused subset of what GiveWell would call “strategic cause selection research” and CEA would call “cause prioritization research.”
2. As usual, we also did significant strategic research in 2013 that is not public-facing (at least not yet), for example 100+ hours of feedback on various drafts of Nick Bostrom’s forthcoming book Superintelligence: Paths, Dangers, Strategies, 15+ hours of feedback on early drafts of Robin Hanson’s forthcoming book about whole brain emulation, and much work on forthcoming MIRI publications.
3. Yudkowsky labeled this as “open problem in Friendly AI #1”, but I categorize it as strategic research rather than Friendly AI research.
4. At the time of publication, Joshua Fox was a MIRI research associate.
5. “Why We Need Friendly AI” was published in an early 2014 issue of the journal Think, but it was released online in 2013.
6. The “Friendly Artificial Intelligence” chapter is merely an abridged version of Yudkowsky’s earlier “Artificial Intelligence as a Positive and Negative Factor in Global Risk.”
7. These chapters were written during 2011 and 2012, but not published in the book until 2013.
8. There were also two very short interviews with Eliezer Yudkowsky: “Yudkowsky on Logical Uncertainty” and “Yudkowsky on ‘What can we do now?’”
9. I have an additional goal for some of our outreach and research activities, which is to address difficult problems in epistemology, because they are more relevant to MIRI’s research than to (e.g.) business or the practice of “normal science” (in the Kuhnian sense). “Pascal’s Muggle” is one example. Also, some of our expository and strategic research doubles as general outreach, e.g. the popular interview with Scott Aaronson.
10. Well, we can share some basic web traffic data. According to Google Analytics, the pages (of 2013’s strategic or expository research) with the most “unique pageviews” since they were created are: “When will AI be created?” (~15.5k), the Scott Aaronson interview (~13.5k), the Hadi Esmaeilzadeh interview (~13.5k), “The Robots, AI, and Unemployment Anti-FAQ” (~12k), “What is intelligence?” (~5k), “Pascal’s Muggle” (~5k), “A brief history of ethically concerned scientists” (~4.5k), “Intelligence explosion microeconomics” (~3.5k), and “From philosophy to math to engineering” (~3.5k). Naturally, this list is biased in favor of articles published earlier. Also, Google Analytics doesn’t track PDF downloads, so we don’t have numbers for those.
11. E.g. see his statements “Yes, I completely agree with [the ‘Mathematical Proofs Improve…’ post]” and “I think re-architecting and re-coding things will almost always lead to a win in terms of security, when compared to bolt-on approaches.”
12. This last bit is part of my motivation for listening to so many nonfiction audiobooks since September 2013.
13. “Intelligence Explosion Microeconomics” enabled “Algorithmic Progress in Six Domains,” but it was still the case that MIRI had to commission “Algorithmic Progress in Six Domains.”