The need to scale MIRI’s methods

Analysis

Andrew Critch, one of the new additions to MIRI’s research team, has taken the opportunity of MIRI’s winter fundraiser to write on his personal blog about why he considers MIRI’s work important. Some excerpts:

Since a team of CFAR alumni banded together to form the Future of Life Institute (FLI), organized an AI safety conference in Puerto Rico in January of this year, co-authored the FLI research priorities proposal, and attracted $10MM of grant funding from Elon Musk, a lot of money has moved under the label “AI Safety” in the past year. Nick Bostrom’s Superintelligence was also a major factor in this amazing success story.

A lot of wonderful work is being done under these grants, including a lot of proposals for solutions to known issues with AI safety, which I find extremely heartening. However, I’m worried that if MIRI doesn’t scale at least somewhat to keep pace with all this funding, it just won’t be spent nearly as well as it would have if MIRI were there to help.

We have to remember that AI safety did not become mainstream by a spontaneous collective awakening. It was through years of effort on the part of MIRI and collaborators at FHI struggling to identify unknown unknowns about how AI might surprise us, and struggling further to learn to explain these ideas in enough technical detail that they might be adopted by mainstream research, which is finally beginning to happen.

But what about the parts we’re wrong about? What about the sub-problems we haven’t identified yet, that might end up neglected in the mainstream the same way the whole problem was neglected 5 years ago? I’m glad the AI/ML community is more aware of these issues now, but I want to make sure MIRI can grow fast enough to keep this growing field on track.

You might think that now that other people are “on the issue”, it’ll work itself out. That might be so.

But just because some of MIRI’s conclusions are now being widely adopted doesn’t mean its methodology is. The mental movement

“Someone has pointed out this safety problem to me, let me try to solve it!”

is very different from

“Someone has pointed out this safety solution to me, let me try to see how it’s broken!”

And that second mental movement is the kind that allowed MIRI to notice AI safety problems in the first place. Cybersecurity professionals seem to carry out this movement easily: security expert Bruce Schneier calls it the security mindset. The SANS Institute calls it red teaming. Whatever you call it, AI/ML people are still more in maker-mode than breaker-mode, and are not yet, to my eye, identifying any new safety problems.

I do think that different organizations should probably try different approaches to the AI safety problem, rather than perfectly copying MIRI’s approach and research agenda. But I think breaker-mode/security mindset does need to be a part of every approach to AI safety. And if MIRI doesn’t scale up to keep pace with all this new funding, I’m worried that the world is just about to copy-paste MIRI’s best-2014-impression of what’s important in AI safety, and leave behind the self-critical methodology that generated these ideas in the first place… which is a serious pitfall given all the unknown unknowns left in the field.

See our funding drive post to help contribute or to learn more about our plans. For more about AI risk and security mindset, see also Luke Muehlhauser’s post on the topic.
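As a loose illustration of the two mindsets Critch contrasts (the reward cap, invariant, and function names below are hypothetical toys, not anything from his post or from MIRI’s research agenda): maker-mode proposes a fix and moves on, while breaker-mode actively searches for inputs that defeat the fix, much as a fuzzer or property-based test would.

```python
import random

def proposed_safe_reward(raw_reward: float, cap: float = 100.0) -> float:
    """Maker-mode fix: cap the reward so a huge raw signal can't
    drive the agent to extremes."""
    return min(raw_reward, cap)

def breaker_mode_search(candidate, trials: int = 10_000) -> list:
    """Breaker-mode: rather than trusting the fix, search for raw
    rewards that still violate the intended invariant |reward| <= 100."""
    counterexamples = []
    for _ in range(trials):
        raw = random.uniform(-1e9, 1e9)
        if abs(candidate(raw)) > 100.0:
            counterexamples.append(raw)
    return counterexamples

if __name__ == "__main__":
    # min() only bounds the reward from above: a large *negative* raw
    # reward passes straight through, violating the intended invariant.
    bad = breaker_mode_search(proposed_safe_reward)
    print(f"found {len(bad)} counterexamples, e.g. {bad[:3]}")
```

Here the randomized search quickly discovers that the cap only protects against large positive rewards; that kind of overlooked failure mode is exactly what breaker-mode exists to catch.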

