Bill Hibbard on Ethical Artificial Intelligence

Conversations

Bill Hibbard is an Emeritus Senior Scientist at the University of Wisconsin-Madison Space Science and Engineering Center, currently working on issues of AI safety and unintended behaviors. He has a BA in Mathematics and MS and PhD in Computer Sciences, all from the University of Wisconsin-Madison. He is the author of Super-Intelligent Machines, “Avoiding Unintended AI Behaviors,” “Decision Support for Safe AI Design,” and “Ethical Artificial Intelligence.” He is also principal author of the Vis5D, Cave5D, and VisAD open source visualization systems.

Luke Muehlhauser: You recently released a self-published book, Ethical Artificial Intelligence, which “combines several peer reviewed papers and new material to analyze the issues of ethical artificial intelligence.” Most of the book is devoted to the kind of exploratory engineering in AI that you and I described in a recent CACM article: you mathematically analyze the behavioral properties of classes of future AI agents, e.g. utility-maximizing agents.

Many AI scientists have the intuition that such early, exploratory work is very unlikely to pay off when we are so far from building an AGI and don’t know what an AGI will look like. For example, Michael Littman wrote:

…proposing specific mechanisms for combatting this amorphous threat [of AGI] is a bit like trying to engineer airbags before we’ve thought of the idea of cars. Safety has to be addressed in context and the context we’re talking about is still absurdly speculative.

How would you defend the value of the kind of work you do in Ethical Artificial Intelligence to Littman and others who share his skepticism?
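
For concreteness, a utility-maximizing agent in this sense is one that selects whichever action has the highest expected utility under its model of the world. The toy Python sketch below, with made-up actions, outcome probabilities, and utilities, is only meant to pin down that definition; it is not a model from Hibbard’s book or the CACM article.

```python
# Minimal expected-utility maximizer over a made-up decision problem.
# The actions, outcome probabilities, and utilities are illustrative only.

actions = {
    # action: list of (probability, utility) pairs over its possible outcomes
    "explore": [(0.5, 10.0), (0.5, -2.0)],
    "exploit": [(1.0, 3.0)],
}

def expected_utility(outcomes):
    """Probability-weighted sum of utilities for one action."""
    return sum(p * u for p, u in outcomes)

# The "behavior" analyzed in this kind of work reduces to a simple rule:
# choose the action whose expected utility is highest.
best_action = max(actions, key=lambda a: expected_utility(actions[a]))
print(best_action, expected_utility(actions[best_action]))  # explore 4.0
```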


Fallenstein talk for APS March Meeting 2015

News

MIRI researcher Benja Fallenstein recently delivered an invited talk at the March 2015 meeting of the American Physical Society in San Antonio, Texas. Her talk was one of four in a special session on artificial intelligence.

Fallenstein’s title was “Beneficial Smarter-than-human Intelligence: the Challenges and the Path Forward.” The slides are available here. Abstract:

Today, human-level machine intelligence is still in the domain of futurism, but there is every reason to expect that it will be developed eventually. A generally intelligent agent as smart or smarter than a human, and capable of improving itself further, would be a system we’d need to design for safety from the ground up: There is no reason to think that such an agent would be driven by human motivations like a lust for power; but almost any goals will be easier to meet with access to more resources, suggesting that most goals an agent might pursue, if they don’t explicitly include human welfare, would likely put its interests at odds with ours, by incentivizing it to try to acquire the physical resources currently being used by humanity. Moreover, since we might try to prevent this, such an agent would have an incentive to deceive its human operators about its true intentions, and to resist interventions to modify it to make it more aligned with humanity’s interests, making it difficult to test and debug its behavior. This suggests that in order to create a beneficial smarter-than-human agent, we will need to face three formidable challenges: How can we formally specify goals that are in fact beneficial? How can we create an agent that will reliably pursue the goals that we give it? And how can we ensure that this agent will not try to prevent us from modifying it if we find mistakes in its initial version? In order to become confident that such an agent behaves as intended, we will not only want to have a practical implementation that seems to meet these challenges, but to have a solid theoretical understanding of why it does so. In this talk, I will argue that even though human-level machine intelligence does not exist yet, there are foundational technical research questions in this area which we can and should begin to work on today. For example, probability theory provides a principled framework for representing uncertainty about the physical environment, which seems certain to be helpful to future work on beneficial smarter-than-human agents, but standard probability theory assumes omniscience about logical facts; no similar principled framework for representing uncertainty about the outputs of deterministic computations exists as yet, even though any smarter-than-human agent will certainly need to deal with uncertainty of this type. I will discuss this and other examples of ongoing foundational work.
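
The point about logical uncertainty near the end of this abstract can be illustrated with a toy example. The Python sketch below is not a proposed framework, and the prior of 0.5 in it is an arbitrary placeholder; it only shows why a resource-bounded reasoner needs some way of assigning a credence to the output of a deterministic computation it has not yet run.

```python
# Toy illustration of logical uncertainty: assigning a probability to a
# *deterministic* fact that has not yet been computed. Standard probability
# theory assumes the agent already knows all logical facts; a bounded agent
# does not, so it needs some interim credence.

def is_prime(n: int) -> bool:
    """Deterministic primality check by trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

# "2**31 - 1 is prime" has a definite truth value, but before running the
# computation a resource-bounded agent is uncertain about it.
claim = 2**31 - 1

# Arbitrary placeholder prior; a principled theory of logical uncertainty
# would say what this credence *should* be given the agent's partial reasoning.
credence = 0.5
print(f"credence before computing: {credence}")

# Spending the compute resolves the uncertainty to 0 or 1.
credence = 1.0 if is_prime(claim) else 0.0
print(f"credence after computing:  {credence}")
```

Developing a principled account of what that interim credence should be, analogous to what probability theory provides for empirical uncertainty, is the open problem the abstract points to.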

Stuart Russell of UC Berkeley also gave a talk at this session, about the long-term future of AI.

March 2015 newsletter

Newsletters

Machine Intelligence Research Institute

Research updates

News updates

Other news

As always, please don’t hesitate to let us know if you have any questions or comments.

Best,
Luke Muehlhauser
Executive Director

Davis on AI capability and motivation

Analysis

In a review of Superintelligence, NYU computer scientist Ernest Davis voices disagreement with a number of claims he attributes to Nick Bostrom: that “intelligence is a potentially infinite quantity with a well-defined, one-dimensional value,” that a superintelligent AI could “easily resist and outsmart the united efforts of eight billion people” and achieve “virtual omnipotence,” and that “though achieving intelligence is more or less easy, giving a computer an ethical point of view is really hard.”

These are all stronger than Bostrom’s actual claims. For example, Bostrom never characterizes building a generally intelligent machine as “easy.” Nor does he say that intelligence can be infinite or that it can produce “omnipotence.” Humans’ intelligence and accumulated knowledge give us a decisive advantage over chimpanzees, even though our power is limited in important ways. An AI need not be magical or all-powerful in order to have the same kind of decisive advantage over humanity.

Still, Davis’ article is one of the more substantive critiques of MIRI’s core assumptions that I have seen, and he addresses several deep issues that directly bear on AI forecasting and strategy. I’ll sketch out a response to his points here.


New annotated bibliography for MIRI’s technical agenda

News

Today we release a new annotated bibliography, written by Nate Soares, accompanying our new technical agenda. If you’d like to discuss the paper, please do so here.

Abstract:

How could superintelligent systems be aligned with the interests of humanity? This annotated bibliography compiles some recent research relevant to that question, and categorizes it into six topics: (1) realistic world models; (2) idealized decision theory; (3) logical uncertainty; (4) Vingean reflection; (5) corrigibility; and (6) value learning. Within each subject area, references are organized in an order amenable to learning the topic. These are by no means the only six topics relevant to the study of alignment, but this annotated bibliography could be used by anyone who wants to understand the state of the art in one of these six particular areas of active research.

Today we’ve also released a page that collects the technical agenda and supporting reports. See our Technical Agenda page.

New mailing list for MIRI math/CS papers only

News

As requested, we now offer email notification of new technical (math or computer science) papers and reports from MIRI. Simply subscribe to the mailing list below.

This list sends one email per new technical paper, and contains only the paper’s title, author(s), and abstract, plus a link to the paper.

Sign up to get updates on new MIRI technical results

Get notified every time a new technical paper is published.

February 2015 Newsletter

Newsletters


Machine Intelligence Research Institute

Research Updates

News Updates

Other Updates

  • Top AI scientists and many others have signed an open letter advocating more research into robust and beneficial AI. The letter cites several MIRI papers.
  • Elon Musk has provided $10 million in funding for the types of research described in the open letter. The funding will be distributed in grants by the Future of Life Institute. Apply here.

As always, please don’t hesitate to let us know if you have any questions or comments.

Best,
Luke Muehlhauser

Executive Director


New report: “The value learning problem”

Papers

Today we release a new technical report by Nate Soares, “The value learning problem.” If you’d like to discuss the paper, please do so here.

Abstract:

A superintelligent machine would not automatically act as intended: it will act as programmed, but the fit between human intentions and formal specification could be poor. We discuss methods by which a system could be constructed to learn what to value. We highlight open problems specific to inductive value learning (from labeled training data), and raise a number of questions about the construction of systems which model the preferences of their operators and act accordingly.
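
To make the phrase “inductive value learning (from labeled training data)” concrete, here is a deliberately simplified sketch. The outcome features, labels, and perceptron learning rule are illustrative assumptions rather than the construction the report discusses; the sketch only shows an agent fitting a value estimate to human-labeled examples and then ranking candidate actions by it.

```python
# Toy sketch of inductive value learning: fit a linear score to outcomes that
# a human has labeled as acceptable (+1) or unacceptable (-1), then rank
# candidate actions by the learned score. Features, labels, and the learning
# rule are all illustrative assumptions.

from typing import List, Tuple

# Hypothetical outcome features: [task_completed, resources_consumed, humans_harmed]
training_data: List[Tuple[List[float], int]] = [
    ([1.0, 0.2, 0.0], +1),   # completed task cheaply, no harm -> approved
    ([1.0, 0.9, 0.0], +1),   # completed task expensively, no harm -> approved
    ([1.0, 0.1, 1.0], -1),   # completed task but caused harm -> rejected
    ([0.0, 0.0, 1.0], -1),   # caused harm, no task done -> rejected
]

def train_perceptron(data, epochs: int = 100, lr: float = 0.1):
    """Simple perceptron: nudge weights until labeled examples score correctly."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified (or on the boundary)
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

w, b = train_perceptron(training_data)

def estimated_value(outcome: List[float]) -> float:
    """Learned stand-in for 'how much would the operators approve of this outcome?'"""
    return sum(wi * xi for wi, xi in zip(w, outcome)) + b

# Rank hypothetical candidate actions by the estimated value of their outcomes.
candidates = {
    "finish task, low cost, no harm": [1.0, 0.1, 0.0],
    "finish task fast, with harm":    [1.0, 0.1, 1.0],
}
for name, outcome in sorted(candidates.items(),
                            key=lambda kv: estimated_value(kv[1]),
                            reverse=True):
    print(f"{estimated_value(outcome):+6.2f}  {name}")
```

The open problems the report highlights concern where schemes like this break down, for instance when the labeled data underdetermines what the operators actually care about.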

This is the last of six new major reports which describe and motivate MIRI’s current research agenda at a high level.

Update May 29, 2016: A revised version of “The Value Learning Problem” (available at the original link) has been accepted to the IJCAI-16 Ethics for Artificial Intelligence workshop. The original version of the paper can be found here.