The [latest IPCC] report says, “If you put into place all these technologies and international agreements, we could still stop warming at [just] 2 degrees.” My own assessment is that the kinds of actions you’d need to do that are so heroic that we’re not going to see them on this planet.
—David Victor,1 professor of international relations at UCSD
A while back I attended a meeting of “movers and shakers” from science, technology, finance, and politics. We were discussing our favorite Big Ideas for improving the world. One person’s Big Idea was to copy best practices between nations. For example, when it’s shown that nations can dramatically improve organ donation rates by using opt-out rather than opt-in programs, other countries should just copy that solution.
Everyone thought this was a boring suggestion, because it was obviously a good idea, and there was no debate to be had. Of course, they all agreed it was also impossible and could never be established as standard practice. So we moved on to another Big Idea that was more tractable.
Later, at a meeting with a similar group of people, I told some economists that their recommendations on a certain issue were “straightforward econ 101,” and I didn’t have any objections to share. Instead, I asked, “But how can we get policy-makers to implement econ 101 solutions?” The economists laughed and said, “Well, yeah, we have no idea. We probably can’t.”
How do I put this? This is not a civilization that should be playing with self-improving AGIs.2
The backhoe is a powerful, labor-saving invention, but I wouldn’t put a two-year-old in the driver’s seat. That’s roughly how I feel about letting 21st century humans wield something as powerful as self-improving AGI.3 I wish we had more time to grow up first. I think the kind of actions we’d need to handle self-improving AGI successfully “are so heroic that we’re not going to see them on this planet,” at least not anytime soon.4
But I suspect we won’t all resist the temptation to build AGI for long, and neither do most top AI scientists.5 There’s just too much incentive to build AGI: a self-improving AGI could give its makers — whether Google or the NSA or China or somebody else — history’s greatest first-mover advantage. Even if the first few teams design their AGIs wisely, the passage of time will only make it easier for smaller and less-wise teams to cross the finish line. Moore’s Law of Mad Science, and all that.
Some people are less worried than I am about self-improving AGI. After all, one might have predicted in 1950 that we wouldn’t have the civilizational competence to avoid an all-out nuclear war for the next half-century, but we did avoid it (if only barely).6 So maybe we shouldn’t be so worried about AGI, either.
While I think it’s important to consider such second-guessing arguments, I generally try to take the world at face value. When I look at the kinds of things we succeed at, and the kinds of things we fail at, getting good outcomes from AGI looks much harder than the kinds of things we routinely fail at, like bothering to switch to opt-out programs for organ donation.
But I won’t pretend that this question of civilizational competence has been settled. If it’s possible to settle the issue at all, doing so would require a book-length argument if not more. (Nick Bostrom’s Superintelligence says a lot about why the AGI control problem is hard, but it doesn’t say much about whether humanity is likely to rise to that challenge.7)
What’s the point of trying to answer this question? If my view is correct, I think the upshot is that we need to re-evaluate our society’s differential investment in global challenges. If you want to succeed in the NBA but you’re only 5’3″ tall, then you’ll just have to invest more time and effort on your basketball goals than you do on other goals for which you’re more naturally suited. And if we want our civilization to survive self-improving AGI, but our civilization can’t even manage to switch to opt-out programs for organ donation, then we’ll just have to start earlier, try harder, spend more, etc. on surviving AGI than we do when pursuing other goals for which our civilization is more naturally suited, like building awesome smartphones.
But if I’m wrong, and our civilization is on course to handle AGI just like it previously handled, say, CFCs, then there may be more urgent things to be doing than advancing Friendly AI theory. (Still, it would be surprising if more Friendly AI work wasn’t good on the present margin, given that there are fewer than 5 full-time Friendly AI researchers in the world right now.)
I won’t be arguing for my own view in this post. Instead I merely want to ask: how might one study this question of civilizational competence and the arrival of AGI? I’d probably split the analysis into two parts: (1) the apparent shape and difficulty of the AGI control problem, and (2) whether we’re likely to have the civilizational competence to handle a problem of that shape and difficulty when it knocks on our front door.
Note that everything in this post is a gross simplification. Problem Difficulty and Civilizational Competence aren’t one-dimensional concepts, though to be succinct I sometimes talk as if they are. But a problem like AGI control is difficult to different degrees in different ways, some technical and others political, and different parts of our civilization are differently competent in different ways, and those different kinds of competence are undergoing different trends.
Difficulty of the problem
How hard is the AGI control problem, and in which ways is it hard? To illustrate what such an analysis could look like, I might sum up my own thoughts on this like so:
- The control problems that are novel to AGI look really hard. For example, getting good outcomes from self-improving AGI seems to require as-yet unobserved philosophical success — philosophical success that is not required merely to write safe autopilot software. More generally, there seem to be several novel problems that arise when we’re trying to control a system more generally clever and powerful than ourselves — problems we have no track record of solving for other systems, problems which seem analogous to the hopeless prospect of chimpanzees getting humans to reliably do what the chimps want. (See Superintelligence.)
- Moreover, we may get relatively few solid chances to solve these novel AGI control problems before we reach a “no turning back” point in AGI capability. In particular, (1) a destabilizing arms race between nations and/or companies, incentivizing speed of development over safety of development, seems likely, and (2) progress may be rapid right when novel control problems become relevant. As a species we are not so good at getting something exactly right on one of our first 5 tries — instead, we typically get something right by learning from dozens or hundreds of initial failures. But we may not have that luxury with novel AGI control problems.
- The AGI control challenge looks especially susceptible to a problem known as “positional externalities,” of which an arms race is but one example, along with related coordination problems. (I explain the notion of “positional externalities” further in a footnote.8)
But that’s just a rough sketch, and other thinkers might have different models of the shape and difficulty of the AGI control problem.
Second, will our civilization rise to the challenge? Will our civilizational competence at the time of AGI invention be sufficient to solve the AGI control problem?
My own pessimism on this question doesn’t follow from any conceptual argument or any simple extrapolation of current trends. Rather, it comes from the same kind of multi-faceted empirical reasoning that you probably do when you try to think about whether we’re more likely to have, within 30 years, self-driving taxis or a Mars colony. That is, I’m combining different models I have about how the world works in general: the speed of development in space travel vs. AI, trends in spending on both issues, which political and commercial incentives are at play, which kinds of coordination problems must be solved, what experts in the relevant fields seem to think about the issues, what kinds of questions experts are better or worse at answering, etc. I’m also adjusting that initial combined estimate based on some specific facts I know: about terraforming, about radiation hazards, about autonomous vehicles, etc.9
Unfortunately, because one’s predictions about AGI outcomes can’t be strongly supported by simple conceptual arguments, it’s a labor-intensive task to try to explain one’s views on the subject, which may explain why I haven’t seen a good, thorough case for either optimism or pessimism about AGI outcomes. People just have their views, based on what they anticipate about the world, and it takes a lot of work to explain in detail where those views are coming from. Nevertheless, it’d be nice to see someone try.
As a start, I’ll link to some studies which share some methodological features of the kind of investigation I’m suggesting:
- Libecap (2013) investigates why some global externalities are addressed effectively whereas others are not.
- Many scholars test theories of system accidents, such as “high reliability theory” and “normal accidents theory,” against historical data. For a summary, see Sagan (1993), ch. 1.
- Yudkowsky (2008) examines cognitive biases potentially skewing judgments on catastrophic risks specifically.
- Sztompka (1993) argues that the slowness of the post-Soviet Eastern European recovery can be blamed on a certain kind of civilizational incompetence.
- Shanteau (1992), Kahneman & Klein (2009), Tetlock (2005), Ericsson et al. (2006), Mellers et al. (2014), and many other studies discuss conditions for good expert judgment or prediction. Presumably our civilizational competence is greater where domain experts are capable of making good judgments about the likely outcomes of potential interventions.
- Sunstein (2009), ch. 2, examines why the Montreal Protocol received more universal support than the Kyoto Protocol.
- Schlager & Petroski (1994) and Chiles (2008) collect case studies of significant technological disasters of the 20th century.
If you decide to perform some small piece of this analysis project, please link to your work in the comments below.
- Quote taken from the Radiolab episode titled “In the Dust of This Planet.”
- In Superintelligence, Bostrom made the point this way (p. 259):
Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb. Such is the mismatch between the power of our plaything and the immaturity of our conduct… For a child with an undetonated bomb in its hands, a sensible thing to do would be to put it down gently, quickly back out of the room, and contact the nearest adult. Yet what we have here is not one child but many, each with access to an independent trigger mechanism. The chances that we will all find the sense to put down the dangerous stuff seem almost negligible… Nor can we attain safety by running away, for the blast of an intelligence explosion would bring down the entire firmament. Nor is there a grown-up in sight.
- By “AGI” I mean a computer system that could pass something like Nilsson’s employment test (see What is AGI?). By “self-improving AGI” I mean an AGI that improves its own capabilities via its own original computer science and robotics research (and not solely by, say, gathering more data about the world or acquiring more computational resources). By “its own capabilities” I mean to include the capabilities of successor systems that the AGI itself creates to further its goals. In this article I typically mean “AGI” and “self-improving AGI” interchangeably, not because all AGIs will necessarily be self-improving in a strong sense, but because I expect that even if the first AGIs are not self-improving for some reason, self-improving AGIs will follow in a matter of decades if not sooner. From a cosmological perspective, such a delay is but a blink.
- I purposely haven’t pinned down exactly what about our civilization seems inadequate to meet the challenge of AGI control; David Victor made the same choice when he made his comment about civilizational competence in the face of climate change. I think our civilizational competence is insufficient for the challenge for many reasons, but I also have varying degrees of uncertainty about each of those reasons and which parts of the problem they apply to, and those details are difficult to express.
- See the AI timeline predictions for the TOP100 poll in Müller & Bostrom (2014). The authors asked a sample of the top-cited living AI scientists: “For the purposes of this question, assume that human scientific activity continues without major negative disruption. By what year would you see a (10% / 50% / 90%) probability for [an AGI] to exist?” The median reply for each confidence level was 2024, 2050, and 2070, respectively.
Why trust AI scientists at all? Haven’t they been wildly optimistic about AI progress from the beginning? Yes, there are embarrassing quotes from early AI scientists about how fast AI progress would be, but there are also many now-disproven quotes from early AI skeptics about what AI wouldn’t be able to do. The earliest survey of AI scientists we have is from 1973, and the most popular response to that survey’s question about AGI timelines was the most pessimistic option, “more than 50 years.” (Which, assuming we don’t get AGI by 2023, will end up being correct.)
- An interesting sub-question: Does humanity’s competence keep up with its capability? When our capabilities jump, as they did with the invention of nuclear weapons, does our competence in controlling those capabilities also jump, out of social/moral necessity or some other forces? Einstein said “Nuclear weapons have changed everything, except our modes of thought,” suggesting that he expected us not to mature as “adults” quickly enough to manage nuclear weapons wisely. We haven’t exactly handled them “wisely,” but we’ve at least handled them wisely enough to avoid global nuclear catastrophe so far.
- What Superintelligence says on the topic can be found in chapter 14.
- Frank (1991) explains the concept this way:
In Micromotives and Macrobehavior, Thomas Schelling observes that hockey players, left to their own devices, almost never wear helmets, even though almost all of them would vote for helmet rules in secret ballots. Not wearing a helmet increases the odds of winning, perhaps by making it slightly easier to see and hear… At the same time, not wearing a helmet increases the odds of getting hurt. If players value the higher odds of winning more than they value the extra safety, it is rational not to wear helmets. The irony, Schelling observes, is that when all discard their helmets, the competitive balance is the same as if all had worn them.
The helmet problem is an example of what we may call a positional externality. The decision to wear a helmet has important effects not only for the person who wears it, but also for the frame of reference in which he and others operate. In such situations, the payoffs to individuals depend in part on their positions within the frame of reference. With hockey players, what counts is not their playing ability in any absolute sense, but how they perform relative to their opponents. Where positional externalities… are present, Schelling has taught us, individually rational behavior often adds up to a result that none would have chosen.
An arms race is one well-understood kind of positional externality. As Alexander (2014) puts it, “From a god’s-eye-view, the best solution is world peace and no country having an army at all. From within the system, no country can unilaterally enforce that, so their best option is to keep on throwing their money into missiles…”
- In other words, my reasons for AGI outcomes pessimism look like a model combination and adjustment. Or you can think of it in terms of what Holden Karnofsky calls cluster thinking. Or as one of my early draft readers called it, “normal everyday reasoning.”