January 2015 Newsletter


 

Machine Intelligence Research Institute

Thanks to the generosity of 80+ donors, we completed our winter 2014 matching challenge, raising $200,000 for our research program. Many, many thanks to all who contributed!

Research Updates

News Updates

Other Updates

  • Eric Horvitz has provided initial funding for a 100-year Stanford program to study the social impacts of artificial intelligence. The white paper lists 18 example research areas, two of which amount to what Nick Bostrom calls the superintelligence control problem, MIRI’s research focus. No word yet on how soon anyone funded through this program will study open questions relevant to superintelligence control.

As always, please don’t hesitate to let us know if you have any questions or comments.

Best,
Luke Muehlhauser
Executive Director

 

Our new technical research agenda overview


Today we release a new overview of MIRI’s technical research agenda, “Aligning Superintelligence with Human Interests: A Technical Research Agenda,” by Nate Soares and Benja Fallenstein. The preferred place to discuss this report is here.

The report begins:

The characteristic that has enabled humanity to shape the world is not strength, not speed, but intelligence. Barring catastrophe, it seems clear that progress in AI will one day lead to the creation of agents meeting or exceeding human-level general intelligence, and this will likely lead to the eventual development of systems which are “superintelligent” in the sense of being “smarter than the best human brains in practically every field” (Bostrom 2014)…

…In order to ensure that the development of smarter-than-human intelligence has a positive impact on humanity, we must meet three formidable challenges: How can we create an agent that will reliably pursue the goals it is given? How can we formally specify beneficial goals? And how can we ensure that this agent will assist and cooperate with its programmers as they improve its design, given that mistakes in the initial version are inevitable?

This agenda discusses technical research that is tractable today, which the authors think will make it easier to confront these three challenges in the future. Sections 2 through 4 motivate and discuss six research topics that we think are relevant to these challenges. Section 5 discusses our reasons for selecting these six areas in particular.

We call a smarter-than-human system that reliably pursues beneficial goals “aligned with human interests” or simply “aligned.” To become confident that an agent is aligned in this way, it is not enough to produce a practical implementation that merely seems to meet the challenges outlined above. It is also necessary to gain a solid theoretical understanding of why that confidence is justified. This technical agenda argues that there is foundational research approachable today that will make it easier to develop aligned systems in the future, and describes ongoing work on some of these problems.

This report also refers to six key supporting papers which go into more detail for each major research problem area:

  1. Corrigibility
  2. Toward idealized decision theory
  3. Questions of reasoning under logical uncertainty
  4. Vingean reflection: reliable reasoning for self-improving agents
  5. Formalizing two problems of realistic world-models
  6. The value learning problem

Update July 15, 2016: Our overview paper is scheduled to be released in the Springer anthology The Technological Singularity: Managing the Journey in 2017, under the new title “Agent Foundations for Aligning Machine Intelligence with Human Interests.” The new title is intended to help distinguish this agenda from another research agenda we’ll be working on in parallel with the agent foundations agenda: “Value Alignment for Advanced Machine Learning Systems.”

New report: “Computable probability distributions which converge…”


Back in July 2013, Will Sawin (Princeton) and Abram Demski (USC) wrote a technical report describing a result from that month’s MIRI research workshop. We are finally releasing that report today. It is titled “Computable probability distributions which converge on believing true Π1 sentences will disbelieve true Π2 sentences.”

Abstract:

It might seem reasonable that, after seeing unboundedly many examples of a true Π1 statement, a rational agent ought to be able to become increasingly confident, converging toward probability 1, that this statement is true. However, we have proven that this, together with some plausible coherence properties, necessarily implies arbitrarily low limiting probabilities assigned to some short true Π2 statements.
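For readers who don’t have the arithmetical hierarchy at their fingertips, the following is standard textbook background (not notation introduced in the report): Π1 and Π2 sentences are those expressible in the normal forms below, where φ and ψ are decidable predicates.

```latex
% Standard arithmetical-hierarchy normal forms; \varphi and \psi are decidable
% predicates. This is textbook background, not notation from the report.
\begin{align*}
  \Pi_1:&\quad \forall n.\ \varphi(n)
      && \text{e.g.\ ``program $M$ never halts''}\\
  \Pi_2:&\quad \forall n.\ \exists m.\ \psi(n,m)
      && \text{e.g.\ ``program $M$ halts on every input''}
\end{align*}
```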

New report: “Toward Idealized Decision Theory”


Today we release a new technical report by Nate Soares and Benja Fallenstein, “Toward idealized decision theory.” If you’d like to discuss the paper, please do so here.

Abstract:

This paper motivates the study of decision theory as necessary for aligning smarter-than-human artificial systems with human interests. We discuss the shortcomings of two standard formulations of decision theory, and demonstrate that they cannot be used to describe an idealized decision procedure suitable for approximation by artificial systems. We then explore the notions of strategy selection and logical counterfactuals, two recent insights into decision theory that point the way toward promising paths for future research.
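As a rough illustration (and not code from the paper), the sketch below contrasts evidential and causal expected-value calculations on Newcomb’s problem, a standard example where such formulations come apart; the predictor accuracy and payoff values are illustrative assumptions.

```python
# A rough illustration (not code from the paper): evidential vs. causal
# expected-value calculations on Newcomb's problem. The predictor accuracy
# and payoffs below are illustrative assumptions, not figures from the report.

ACCURACY = 0.99          # assumed probability that the predictor is correct
SMALL, LARGE = 1_000, 1_000_000

def payoff(action, predicted_one_box):
    """The opaque box holds LARGE iff one-boxing was predicted."""
    big = LARGE if predicted_one_box else 0
    return big if action == "one-box" else big + SMALL

def edt_value(action):
    """Evidential: condition the prediction on the action actually taken."""
    p_predicted_one_box = ACCURACY if action == "one-box" else 1 - ACCURACY
    return (p_predicted_one_box * payoff(action, True)
            + (1 - p_predicted_one_box) * payoff(action, False))

def cdt_value(action, prior_predicted_one_box=0.5):
    """Causal: the prediction is already fixed, so use an action-independent prior."""
    return (prior_predicted_one_box * payoff(action, True)
            + (1 - prior_predicted_one_box) * payoff(action, False))

for act in ("one-box", "two-box"):
    print(f"{act:8s}  EDT: {edt_value(act):12,.0f}   CDT: {cdt_value(act):12,.0f}")
# EDT recommends one-boxing while CDT recommends two-boxing; divergences like
# this are what motivate the search for an idealized decision procedure.
```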

This is the second of six new major reports that describe and motivate MIRI’s current research agenda at a high level. The first was our Corrigibility paper, which was accepted to the AI & Ethics workshop at AAAI-2015. We will also soon be releasing a technical agenda overview document and an annotated bibliography for this emerging field of research.

New report: “Tiling agents in causal graphs”


Today we release a new technical report by Nate Soares, “Tiling agents in causal graphs.”

The report begins:

Fallenstein and Soares [2014] demonstrates that it’s possible for certain types of proof-based agents to “tile” (license the construction of successor agents similar to themselves while avoiding Gödelian diagonalization issues) in environments about which the agent can prove some basic nice properties. In this technical report, we show via a similar proof that causal graphs (with a specific structure) are one such environment. We translate the proof given by Fallenstein and Soares [2014] into the language of causal graphs, and we do this in such a way as to simplify the conditions under which a tiling meliorizer can be constructed.

New paper: “Concept learning for safe autonomous AI”


MIRI research associate Kaj Sotala has released a new paper, accepted to the AI & Ethics workshop at AAAI-2015, titled “Concept learning for safe autonomous AI.”

The abstract reads:

Sophisticated autonomous AI may need to base its behavior on fuzzy concepts such as well-being or rights. These concepts cannot be given an explicit formal definition, but obtaining desired behavior still requires a way to instill the concepts in an AI system. To solve the problem, we review evidence suggesting that the human brain generates its concepts using a relatively limited set of rules and mechanisms. This suggests that it might be feasible to build AI systems that use similar criteria for generating their own concepts, and could thus learn similar concepts as humans do. Major challenges to this approach include the embodied nature of human thought, evolutionary vestiges in cognition, the social nature of concepts, and the need to compare conceptual representations between humans and AI systems.

December newsletter


 

Machine Intelligence Research Institute

MIRI’s winter fundraising challenge has begun! Every donation made to MIRI between now and January 10th will be matched dollar-for-dollar, up to a total of $100,000!

 

Donate now to double your impact while helping us raise up to $200,000 (with matching) to fund our research program.

Research Updates

News Updates

Other Updates

  • Our friends at the Center for Effective Altruism will pay you $1,000 if you introduce them to someone new whom they end up hiring for one of their five open positions.

As always, please don’t hesitate to let us know if you have any questions or comments.

Best,
Luke Muehlhauser
Executive Director