New report: "Computable probability distributions which converge…"

Back in July 2013, Will Sawin (Princeton) and Abram Demski (USC) wrote a technical report describing a result from that month’s MIRI research workshop. We are finally releasing that report today. It is titled “Computable probability distributions which converge on believing true Π1 sentences will disbelieve true Π2 sentences.”


It might seem reasonable that, after seeing unboundedly many examples of a true Π1 statement, a rational agent ought to be able to become increasingly confident, converging toward probability 1, that this statement is true. However, we have proven that this property, together with some plausible coherence properties, necessarily implies arbitrarily low limiting probabilities assigned to some short true Π2 statements.
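For readers unfamiliar with the notation, the sentence classes involved can be sketched roughly as follows. This is an informal illustration, not the report's own formalization: the symbols φ, ψ, and P are assumed here, and the last line is only one possible reading of "converging toward probability 1."

```latex
% Assumed notation: \varphi and \psi are decidable predicates on natural numbers,
% and P is the agent's probability distribution over sentences.

% A \Pi_1 sentence: a universal claim over a decidable predicate.
\Pi_1:\quad \forall n\; \varphi(n)

% A \Pi_2 sentence: a universal claim followed by an existential one.
\Pi_2:\quad \forall n\; \exists m\; \psi(n, m)

% One informal reading (an assumption) of the convergence property:
% confidence in a true \Pi_1 sentence approaches 1 as more instances are observed.
P\bigl(\forall n\, \varphi(n) \,\big|\, \varphi(0), \ldots, \varphi(k)\bigr) \to 1
\quad \text{as } k \to \infty
```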

New report: "Toward Idealized Decision Theory"

Today we release a new technical report by Nate Soares and Benja Fallenstein, “Toward idealized decision theory.” If you’d like to discuss the paper, please do so here.


This paper motivates the study of decision theory as necessary for aligning smarter-than-human artificial systems with human interests. We discuss the shortcomings of two standard formulations of decision theory, and demonstrate that they cannot be used to describe an idealized decision procedure suitable for approximation by artificial systems. We then explore the notions of strategy selection and logical counterfactuals, two recent insights into decision theory that point the way toward promising paths for future research.

This is the second of six new major reports which describe and motivate MIRI’s current research agenda at a high level. The first was our Corrigibility paper, which was accepted to the AI & Ethics workshop at AAAI-2015. We will also soon be releasing a technical agenda overview document and an annotated bibliography for this emerging field of research.

New report: "Tiling agents in causal graphs"

Today we release a new technical report by Nate Soares, “Tiling agents in causal graphs.”

The report begins:

Fallenstein and Soares [2014] demonstrates that it’s possible for certain types of proof-based agents to “tile” (license the construction of successor agents similar to themselves while avoiding Gödelian diagonalization issues) in environments about which the agent can prove some basic nice properties. In this technical report, we show via a similar proof that causal graphs (with a specific structure) are one such environment. We translate the proof given by Fallenstein and Soares [2014] into the language of causal graphs, and we do this in such a way as to simplify the conditions under which a tiling meliorizer can be constructed.

New paper: "Concept learning for safe autonomous AI"

MIRI research associate Kaj Sotala has released a new paper, accepted to the AI & Ethics workshop at AAAI-2015, titled “Concept learning for safe autonomous AI.”

The abstract reads:

Sophisticated autonomous AI may need to base its behavior on fuzzy concepts such as well-being or rights. These concepts cannot be given an explicit formal definition, but obtaining desired behavior still requires a way to instill the concepts in an AI system. To solve the problem, we review evidence suggesting that the human brain generates its concepts using a relatively limited set of rules and mechanisms. This suggests that it might be feasible to build AI systems that use similar criteria for generating their own concepts, and could thus learn similar concepts as humans do. Major challenges to this approach include the embodied nature of human thought, evolutionary vestiges in cognition, the social nature of concepts, and the need to compare conceptual representations between humans and AI systems.

December newsletter

Machine Intelligence Research Institute

MIRI’s winter fundraising challenge has begun! Every donation made to MIRI between now and January 10th will be matched dollar-for-dollar, up to a total of $100,000!


Donate now to double your impact while helping us raise up to $200,000 (with matching) to fund our research program.

Research Updates

News Updates

Other Updates

  • Our friends at the Center for Effective Altruism will pay you $1,000 if you introduce them to somebody new that they end up hiring for one of their five open positions.

As always, please don’t hesitate to let us know if you have any questions or comments.

Luke Muehlhauser
Executive Director


Three misconceptions in the conversation on "The Myth of AI"

Analysis

A recent conversation, “The Myth of AI,” is framed in part as a discussion of points raised in Bostrom’s Superintelligence and as a response to much-repeated comments by Elon Musk and Stephen Hawking that seem to have been heavily informed by Superintelligence.

Unfortunately, some of the participants fall prey to common misconceptions about the standard case for AI as an existential risk, and they probably haven’t had time to read Superintelligence yet.

Of course, some of the participants may be responding to arguments they’ve heard from others, even if they’re not part of the arguments typically made by FHI and MIRI. Still, for simplicity I’ll reply from the perspective of the typical arguments made by FHI and MIRI.1


1. We don’t think AI progress is “exponential,” nor that human-level AI is likely ~20 years away.

Lee Smolin writes:

I am puzzled by the arguments put forward by those who say we should worry about a coming AI, singularity, because all they seem to offer is a prediction based on Moore’s law.

That’s not the argument made by FHI, MIRI, or Superintelligence.

Some IT hardware and software domains have shown exponential progress, and some have not. Likewise, some AI subdomains have shown rapid progress of late, and some have not. And unlike computer chess, most AI subdomains don’t lend themselves to easy measures of progress, so for most AI subdomains we don’t even have meaningful subdomain-wide performance data through which one might draw an exponential curve (or some other curve).

Thus, our confidence intervals for the arrival of human-equivalent AI tend to be very wide, and the arguments we make for our AI timelines are fox-ish (in Tetlock’s sense).

  1. I could have also objected to claims and arguments made in the conversation, for example Lanier’s claim that “The AI component would be only ambiguously there and of little importance [relative to the actuators component].” To me, this is like saying that humans rule the planet because of our actuators, not because of our superior intelligence. Or in response to Kevin Kelly’s claim that “So far as I can tell, AIs have not yet made a decision that its human creators have regretted,” I can for example point to the automated trading algorithms that nearly bankrupted Knight Capital faster than any human could react. But in this piece I will focus instead on claims that seem to be misunderstandings of the positive case that’s being made for AI as an existential risk.