It’s a good piece. Go read it and then come back here so I can make a few clarifications.
1. Smarter-than-human AI probably isn’t coming “soon.”
“Computers will soon become more intelligent than us,” the story begins, but few experts I know think this is likely.
A recent survey asked the world’s top-cited living AI scientists by what year they’d assign a 10% / 50% / 90% chance of human-level AI (aka AGI), assuming scientific progress isn’t massively disrupted. The median reply for a 10% chance of AGI was 2024, for a 50% chance of AGI it was 2050, and for a 90% chance of AGI it was 2070. So while AI scientists think it’s possible we might get AGI soon, they largely expect AGI to be an issue for the second half of this century.
Moreover, many of those who specialize in thinking about AGI safety actually think AGI is further away than the top-cited AI scientists do. For example, relative to the surveyed AI scientists, Nick Bostrom and I both think more probability should be placed on later years. We advocate more work on the AGI safety challenge today not because we think AGI is likely in the next decade or two, but because AGI safety looks to be an extremely difficult challenge — more challenging than managing climate change, for example — and one requiring several decades of careful preparation.
The greatest risks from both climate change and AI are several decades away, but thousands of smart researchers and policy-makers are already working to understand and mitigate climate change, and only a handful are working on the safety challenges of advanced AI. On the present margin, we should have much less top-flight cognitive talent going into climate change mitigation, and much more going into AGI safety research.
Today we release a new technical report from MIRI research associate Tsvi Benson-Tilsen: “UDT with known search order.” Abstract:
We consider logical agents in a predictable universe running a variant of updateless decision theory. We give an algorithm to predict the behavior of such agents in the special case where the order in which they search for proofs is simple, and where they know this order. As a corollary, “playing chicken with the universe” by diagonalizing against potential spurious proofs is the only way to guarantee optimal behavior for this class of simple agents.
Earlier today I was alerted to the existence of Singularity2014.com (archived screenshot). MIRI has nothing to do with that website and we believe it is a fake.
The website claims there is a “Singularity 2014” conference “in the Bay Area” on “November 9, 2014.” We believe that there is no such event. No venue is listed, tickets are supposedly sold out already, and there are no links to further information. The three listed speakers are unknown to us, and their supposed photos are stock photos (1, 2, 3). The website prominently features an image of Ray Kurzweil, but Ray Kurzweil’s press staff confirms that he has nothing to do with this event. The website also features childish insults and a spelling error.
The website claims the event is “staged and produced by former organizers of the Singularity Summit from the Machine Intelligence Research Institute,” and that “All profits benefit the Machine Intelligence Research Institute,” but MIRI has nothing to do with this supposed event.
The Singularity2014.com domain name was registered via eNom reseller NameCheap.com on September 15th, 2014 by someone other than us, and is associated with a P.O. Box in Panama.
MIRI is collaborating with Singularity University to have the website taken down. If you have information about who is responsible for this, please contact email@example.com.
The next Singularity Summit will be organized primarily by Singularity University; for more information see here.
Update: The website has been taken down.
Today we release a paper describing a new problem area in Friendly AI research we call corrigibility. The report (PDF) is co-authored by MIRI’s Friendly AI research team (Eliezer Yudkowsky, Benja Fallenstein, Nate Soares) and also Stuart Armstrong from the Future of Humanity Institute at Oxford University.
The abstract reads:
As artificially intelligent systems grow in intelligence and capability, some of their available options may allow them to resist intervention by their programmers. We call an AI system “corrigible” if it cooperates with what its creators regard as a corrective intervention, despite default incentives for rational agents to resist attempts to shut them down or modify their preferences. We introduce the notion of corrigibility and analyze utility functions that attempt to make an agent shut down safely if a shutdown button is pressed, while avoiding incentives to prevent the button from being pressed or cause the button to be pressed, and while ensuring propagation of the shutdown behavior as it creates new subsystems or self-modifies. While some proposals are interesting, none have yet been demonstrated to satisfy all of our intuitive desiderata, leaving this simple problem in corrigibility wide-open.
This paper was accepted to the AI & Ethics workshop at AAAI-2015.
Update: The slides for Nate Soares’ presentation at AAAI-15 are available here.
The [latest IPCC] report says, “If you put into place all these technologies and international agreements, we could still stop warming at [just] 2 degrees.” My own assessment is that the kinds of actions you’d need to do that are so heroic that we’re not going to see them on this planet.
—David Victor, professor of international relations at UCSD
A while back I attended a meeting of “movers and shakers” from science, technology, finance, and politics. We were discussing our favorite Big Ideas for improving the world. One person’s Big Idea was to copy best practices between nations. For example, when it’s shown that nations can dramatically improve organ donation rates by using opt-out rather than opt-in programs, other countries should just copy that solution.
Everyone thought this was a boring suggestion, because it was obviously a good idea, and there was no debate to be had. Of course, they all agreed it was also impossible and could never be established as standard practice. So we moved on to another Big Idea that was more tractable.
Later, at a meeting with a similar group of people, I told some economists that their recommendations on a certain issue were “straightforward econ 101,” and I didn’t have any objections to share. Instead, I asked, “But how can we get policy-makers to implement econ 101 solutions?” The economists laughed and said, “Well, yeah, we have no idea. We probably can’t.”
Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb. Such is the mismatch between the power of our plaything and the immaturity of our conduct… For a child with an undetonated bomb in its hands, a sensible thing to do would be to put it down gently, quickly back out of the room, and contact the nearest adult. Yet what we have here is not one child but many, each with access to an independent trigger mechanism. The chances that we will all find the sense to put down the dangerous stuff seem almost negligible… Nor can we attain safety by running away, for the blast of an intelligence explosion would bring down the entire firmament. Nor is there a grown-up in sight.
On September 18th, MIRI research fellow Nate Soares spoke at Purdue University’s Dawn or Doom seminar. Slides, video, and a transcript of his talk — “Why ain’t you rich? Why Our Current Understanding of ‘Rational Choice’ Isn’t Good Enough for Superintelligence” — are now available.