John Fox on AI safety

John Fox is an interdisciplinary scientist with theoretical interests in AI and computer science, and an applied focus in medicine and medical software engineering. After training in experimental psychology at Durham and Cambridge Universities and holding post-doctoral fellowships in the USA (CMU and Cornell) and the UK (MRC), he joined the Imperial Cancer Research Fund (now Cancer Research UK) in 1981 as a researcher in medical AI. The group’s research was explicitly multidisciplinary; it subsequently made significant contributions in basic computer science, AI and medical informatics, and developed a number of successful technologies which have been commercialised.

In 1996 he and his team were awarded the 20th Anniversary Gold Medal of the European Federation of Medical Informatics for the development of PROforma, arguably the first formal computer language for modeling clinical decisions and processes. Fox has published widely in computer science, cognitive science and biomedical engineering, and was the founding editor of the Knowledge Engineering Review (Cambridge University Press). Recent publications include a research monograph, Safe and Sound: Artificial Intelligence in Hazardous Applications (MIT Press, 2000), which deals with the use of AI in safety-critical fields such as medicine.

Luke Muehlhauser: You’ve spent many years studying AI safety issues, in particular in medical contexts, e.g. in your 2000 book with Subrata Das, Safe and Sound: Artificial Intelligence in Hazardous Applications. What kinds of AI safety challenges have you focused on in the past decade or so?

John Fox: From my first research job, as a post-doc with AI founders Allen Newell and Herb Simon at CMU, I have been interested in computational theories of high level cognition. As a cognitive scientist I have been interested in theories that subsume a range of cognitive functions, from perception and reasoning to the uses of knowledge in autonomous decision-making. After I came back to the UK in 1975 I began to combine my theoretical interests with the practical goals of designing and deploying AI systems in medicine.

Since our book was published in 2000 I have been committed to testing the ideas in it by designing and deploying many kinds of clinical systems, and demonstrating that AI techniques can significantly improve the quality and safety of clinical decision-making and process management. Patient safety is fundamental to clinical practice so, alongside the goals of building systems that can improve on human performance, safety and ethics have always been near the top of my research agenda.

Luke Muehlhauser: Was it straightforward to address issues like safety and ethics in practice?

John Fox: While our concepts and technologies have proved to be clinically successful, we have not achieved everything we hoped for. Our attempts to ensure, for example, that practical and commercial deployments of AI technologies should explicitly honor ethical principles and carry out active safety management have not yet gained the traction they need. I regard this as a serious cause for concern, and unfinished business in both scientific and engineering terms.

The next generation of large-scale knowledge based systems and software agents that we are now working on will be more intelligent and will have far more autonomous capabilities than current systems. The challenges for human safety and ethical use of AI that this implies are beginning to mirror those raised by the singularity hypothesis. We have much to learn from singularity researchers, and perhaps our experience in deploying autonomous agents in human healthcare will offer opportunities to ground some of the singularity debates as well.

Daniel Roy on probabilistic programming and AI

Daniel Roy is an Assistant Professor of Statistics at the University of Toronto. Roy earned an S.B. and M.Eng. in Electrical Engineering and Computer Science, and a Ph.D. in Computer Science, from MIT. His dissertation on probabilistic programming received the department’s George M. Sprowls Thesis Award. Subsequently, he held a Newton International Fellowship of the Royal Society, hosted by the Machine Learning Group at the University of Cambridge, and then held a Research Fellowship at Emmanuel College. Roy’s research focuses on theoretical questions that mix computer science, statistics, and probability.

Luke Muehlhauser: The abstract of Ackerman, Freer, and Roy (2010) begins:

As inductive inference and machine learning methods in computer science see continued success, researchers are aiming to describe even more complex probabilistic models and inference algorithms. What are the limits of mechanizing probabilistic inference? We investigate the computability of conditional probability… and show that there are computable joint distributions with noncomputable conditional distributions, ruling out the prospect of general inference algorithms.

In what sense does your result (with Ackerman & Freer) rule out the prospect of general inference algorithms?

Daniel Roy: First, it’s important to note that when we say “probabilistic inference” we are referring to the problem of computing conditional probabilities, a choice that highlights the role of conditioning in Bayesian statistical analysis.

Bayesian inference centers around so-called posterior distributions. From a subjectivist standpoint, the posterior represents one’s updated beliefs after seeing (i.e., conditioning on) the data. Mathematically, a posterior distribution is simply a conditional distribution (and every conditional distribution can be interpreted as a posterior distribution in some statistical model), and so our study of the computability of conditioning also bears on the problem of computing posterior distributions, which is arguably one of the core computational problems in Bayesian analyses.

Second, it’s important to clarify what we mean by “general inference”. In machine learning and artificial intelligence (AI), there is a long tradition of defining formal languages in which one can specify probabilistic models over a collection of variables. Defining distributions can be difficult, but these languages can make it much more straightforward.

The goal is then to design algorithms that can use these representations to support important operations, like computing conditional distributions. Bayesian networks can be thought of as such a language: You specify a distribution over a collection of variables by specifying a graph over these variables, which breaks down the entire distribution into “local” conditional distributions corresponding with each node, which are themselves often represented as tables of probabilities (at least in the case where all variables take on only a finite set of values). Together, the graph and the local conditional distributions determine a unique distribution over all the variables.
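As a rough illustration of the point about graphs and local tables, here is a minimal sketch of a two-node network. The variables (Rain, WetGrass), the probabilities, and the function names are all hypothetical and chosen only for the example; they do not come from the paper.

```python
# Hypothetical two-node Bayesian network: Rain -> WetGrass.
# Each node carries a "local" conditional table; the numbers are made up.
p_rain = {True: 0.2, False: 0.8}
p_wet_given_rain = {
    True: {True: 0.9, False: 0.1},   # P(WetGrass | Rain=True)
    False: {True: 0.1, False: 0.9},  # P(WetGrass | Rain=False)
}

def joint(rain, wet):
    """Joint probability as the product of the local conditional tables."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]

def posterior_rain_given_wet(wet):
    """Condition on WetGrass by enumerating Rain and normalizing."""
    weights = {rain: joint(rain, wet) for rain in (True, False)}
    total = sum(weights.values())
    return {rain: w / total for rain, w in weights.items()}
```

Calling `posterior_rain_given_wet(True)` returns the conditional distribution of Rain given wet grass. Enumeration like this only works because every variable takes finitely many values.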

An inference algorithm that supports the entire class of finite, discrete Bayesian networks might be called general, but as a class of distributions, those representable by finite, discrete Bayesian networks form a rather small one.

In this work, we are interested in the prospect of algorithms that work on very large classes of distributions. Namely, we are considering the class of samplable distributions, i.e., the class of distributions for which there exists a probabilistic program that can generate a sample using, e.g., uniformly distributed random numbers or independent coin flips as a source of randomness. The class of samplable distributions is a natural one: indeed it is equivalent to the class of computable distributions, i.e., those for which we can devise algorithms to compute lower bounds on probabilities from descriptions of open sets. The class of samplable distributions is also equivalent to the class of distributions for which we can compute expectations from descriptions of bounded continuous functions.

The class of samplable distributions is, in a sense, the richest class you might hope to deal with. The question we asked was: is there an algorithm that, given a samplable distribution on two variables X and Y, represented by a program that samples values for both variables, can compute the conditional distribution of, say, Y given X=x, for almost all values for X? When X takes values in a finite, discrete set, e.g., when X is binary valued, there is a general algorithm, although it is inefficient. But when X is continuous, e.g., when it can take on every value in the unit interval [0,1], then problems can arise. In particular, there exists a distribution on a pair of numbers in [0,1] from which one can generate perfect samples, but for which it is impossible to compute conditional probabilities for one of the variables given the other. As one might expect, the proof reduces the halting problem to that of conditioning a specially crafted distribution.
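One simple way to see why the discrete case is tractable but inefficient is rejection sampling: run the sampling program repeatedly and keep only the runs in which X takes the observed value. The sketch below assumes this scheme for illustration; it is not claimed to be the algorithm in the paper, and `sample_pair` is a hypothetical probabilistic program invented for the example.

```python
import random

def sample_pair(rng):
    """Hypothetical probabilistic program returning one (x, y) sample."""
    x = rng.random() < 0.3                   # X is binary valued
    y = rng.gauss(1.0 if x else -1.0, 1.0)   # Y depends on X
    return x, y

def condition_on_x(sampler, x_value, n_accept, rng):
    """Draw n_accept samples of Y given X == x_value by rejection.

    Assumes P(X == x_value) > 0; otherwise the loop never terminates.
    """
    accepted = []
    while len(accepted) < n_accept:
        x, y = sampler(rng)
        if x == x_value:          # keep only runs that match the observation
            accepted.append(y)
    return accepted
```

The expected number of runs per accepted sample is 1 / P(X = x), which is why the approach is inefficient. When X is continuous, the event X = x has probability zero and the loop above never accepts anything. That shows only why the naive scheme breaks down; the noncomputability result is a much stronger statement about every possible algorithm.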

This pathological distribution rules out the possibility of a general algorithm for conditioning (equivalently, for probabilistic inference). The paper ends by giving some further conditions that, when present, allow one to devise general inference algorithms. Those familiar with computing conditional distributions for finite-dimensional statistical models will not be surprised that conditions necessary for Bayes’ theorem are one example.
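
For reference, the familiar density case can be written out as follows. This is the standard textbook form of Bayes’ theorem, not a restatement of the paper’s exact conditions; the symbols are notation introduced here, assuming the pair (X, Y) has a joint density.

```latex
p_{Y \mid X}(y \mid x) = \frac{p_{X,Y}(x, y)}{p_X(x)},
\qquad
p_X(x) = \int p_{X,Y}(x, y')\, dy' > 0 .
```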

MIRI’s September Newsletter



Machine Intelligence Research Institute

Thanks to the generosity of 100+ donors, we successfully completed our 2014 summer matching challenge on August 15th, raising more than $400,000 total for our research program. Our deepest thanks to all our supporters!

News updates

  • MIRI is running an online reading group for Nick Bostrom’s Superintelligence. Join the discussion here!
  • MIRI participated in the 2014 Effective Altruism Summit. Slides from our talks are available here.

Other updates

As always, please don’t hesitate to let us know if you have any questions or comments.

Luke Muehlhauser
Executive Director



Superintelligence reading group

Nick Bostrom’s eagerly awaited Superintelligence comes out in the US this week. To help you get the most out of it, MIRI is running an online reading group where you can join with others to ask questions, discuss ideas, and probe the arguments more deeply.

The reading group will “meet” on a weekly post on the LessWrong discussion forum. For each “meeting”, we will read about half a chapter of Superintelligence, then come together virtually to discuss. I’ll summarize the chapter, and offer a few relevant notes, thoughts, and ideas for further investigation. (My notes will also be used as the source material for the final reading guide for the book.)

Discussion will take place in the comments. I’ll offer some questions, and invite you to bring your own, as well as thoughts, criticisms and suggestions for interesting related material. Your contributions to the reading group might also (with permission) be used in our final reading guide for the book.

We welcome both newcomers and veterans on the topic. Content will aim to be intelligible to a wide audience, and topics will range from novice to expert level. All levels of time commitment are welcome. We especially encourage AI researchers and practitioners to participate. Just use a pseudonym if you don’t want your questions and comments publicly linked to your identity.

We will follow this preliminary reading guide, produced by MIRI, reading one section per week.

If you have already read the book, don’t worry! To the extent you remember what it says, your superior expertise will only be a bonus. To the extent you don’t remember what it says, now is a good time for a review! If you don’t have time to read the book, but still want to participate, you are also welcome to join in. I will provide summaries, and many things will have page numbers, in case you want to skip to the relevant parts.

If this sounds good to you, first grab a copy of Superintelligence. You may also want to sign up here to be emailed when the discussion begins each week. The first virtual meeting (forum post) will go live at 6pm Pacific on Monday, September 15th. Subsequent meetings will start at 6pm every Monday, so if you’d like to coordinate for quick-fire discussion with others, put that into your calendar. If you prefer flexibility, come by any time! And remember that if there are any people you would especially enjoy discussing Superintelligence with, link them to this post!

Topics for the first week will include impressive displays of artificial intelligence, why computers play board games so well, and what a reasonable person should infer from the agricultural and industrial revolutions.


New paper: “Exploratory engineering in artificial intelligence”

Luke Muehlhauser and Bill Hibbard have a new paper in the September 2014 issue of Communications of the ACM, the world’s most-read peer-reviewed computer science publication. The title is “Exploratory Engineering in Artificial Intelligence.”


We regularly see examples of new artificial intelligence (AI) capabilities… No doubt such automation will produce tremendous economic value, but will we be able to trust these advanced autonomous systems with so much capability?

Today, AI safety engineering mostly consists in a combination of formal methods and testing. Though powerful, these methods lack foresight: they can be applied only to particular extant systems. We describe a third, complementary approach that aims to predict the (potentially hazardous) properties and behaviors of broad classes of future AI agents, based on their mathematical structure (for example, reinforcement learning)… We call this approach “exploratory engineering in AI.”

In this Viewpoint, we focus on theoretical AI models inspired by Marcus Hutter’s AIXI, an optimal agent model for maximizing an environmental reward signal…

Autonomous intelligent machines have the potential for large impacts on our civilization. Exploratory engineering gives us the capacity to have some foresight into what these impacts might be, by analyzing the properties of agent designs based on their mathematical form. Exploratory engineering also enables us to identify lines of research — such as the study of Dewey’s value-learning agents — that may be important for anticipating and avoiding unwanted AI behaviors. This kind of foresight will be increasingly valuable as machine intelligence comes to play an ever-larger role in our world.

2014 Summer Matching Challenge Completed!

Thanks to the generosity of 100+ donors, today we successfully completed our 2014 summer matching challenge, raising more than $400,000 total for our research program.

Our deepest thanks to all our supporters!

Also, Jed McCaleb’s new crypto-currency Stellar was launched during MIRI’s fundraiser, and we decided to accept donated stellars. These donations weren’t counted toward the matching drive, and their market value is unstable at this early stage, but as of today we’ve received 850,000+ donated stellars from 3000+ different stellar accounts. Our thanks to everyone who donated in stellar!

MIRI’s recent effective altruism talks

MIRI recently participated in the 2014 Effective Altruism Retreat and Effective Altruism Summit organized by Leverage Research. We gave four talks, participated in a panel, and held “office hours” during which people could stop by and ask us questions.

The slides for our talks are available below:

If videos of these talks become available, we’ll link them from here as well.

See also our earlier posts Friendly AI Research as Effective Altruism and Why MIRI?

Groundwork for AGI safety engineering

Improvements in AI are resulting in the automation of increasingly complex and creative human behaviors. Given enough time, we should expect artificial reasoners to begin to rival humans in arbitrary domains, culminating in artificial general intelligence (AGI).

A machine would qualify as an ‘AGI’, in the intended sense, if it could adapt to a very wide range of situations to consistently achieve some goal or goals. Such a machine would behave intelligently when supplied with arbitrary physical and computational environments, in the same sense that Deep Blue behaves intelligently when supplied with arbitrary chess board configurations — consistently hitting its victory condition within that narrower domain.

Since generally intelligent software could help automate the process of thinking up and testing hypotheses in the sciences, AGI would be uniquely valuable for speeding technological growth. However, this wide-ranging productivity also makes AGI a unique challenge from a safety perspective. Knowing very little about the architecture of future AGIs, we can nonetheless make a few safety-relevant generalizations:

  • Because AGIs are intelligent, they will tend to be complex, adaptive, and capable of autonomous action, and they will have a large impact where employed.
  • Because AGIs are general, their users will have incentives to employ them in an increasingly wide range of environments. This makes it hard to construct valid sandbox tests and requirements specifications.
  • Because AGIs are artificial, they will deviate from human agents, causing them to violate many of our natural intuitions and expectations about intelligent behavior.

Today’s AI software is already tough to verify and validate, thanks to its complexity and its uncertain behavior in the face of state space explosions. Menzies & Pecheur (2005) give a good overview of AI verification and validation (V&V) methods, noting that AI, and especially adaptive AI, will often yield undesired and unexpected behaviors.

An adaptive AI that acts autonomously, like a Mars rover that can’t be directly piloted from Earth, represents an additional large increase in difficulty. Autonomous safety-critical AI agents need to make irreversible decisions in dynamic environments with very low failure rates. The state of the art in safety research for autonomous systems is improving, but continues to lag behind system capabilities work. Hinchman et al. (2012) write:

As autonomous systems become more complex, the notion that systems can be fully tested and all problems will be found is becoming an impossible task. This is especially true in unmanned/autonomous systems. Full test is becoming increasingly challenging on complex system. As these systems react to more environmental [stimuli] and have larger decision spaces, testing all possible states and all ranges of the inputs to the system is becoming impossible. [...] As systems become more complex, safety is really risk hazard analysis, i.e. given x amount of testing, the system appears to be safe. A fundamental change is needed. This change was highlighted in the 2010 Air Force Technology Horizon report, “It is possible to develop systems having high levels of autonomy, but it is the lack of suitable V&V methods that prevents all but relatively low levels of autonomy from being certified for use.” [...]

The move towards more autonomous systems has lifted this need [for advanced verification and validation techniques and methodologies] to a national level.

AI acting autonomously in arbitrary domains, then, looks particularly difficult to verify. If AI methods continue to see rapid gains in efficiency and versatility, and especially if these gains further increase the opacity of AI algorithms to human inspection, AI safety engineering will become much more difficult in the future. In the absence of any reason to expect a development in the lead-up to AGI that would make high-assurance AGI easy (or AGI itself unlikely), we should be worried about the safety challenges of AGI, and that worry should inform our research priorities today.

Below, I’ll give reasons to doubt that AGI safety challenges are just an extension of narrow-AI safety challenges, and I’ll list some research avenues people at MIRI expect to be fruitful.
