End-of-the-year matching challenge!

News

Update 2017-12-27: We’ve blown past our 3rd and final target, and reached the matching cap of $300,000 for the Matching Challenge! Thanks so much to everyone who supported us!

All donations made before 23:59 PST on Dec 31st will continue to be counted towards our fundraiser total. The fundraiser total includes projected matching funds from the Challenge.


Professional poker players Martin Crowley, Tom Crowley, and Dan Smith, in partnership with Raising for Effective Giving, have just announced a $1 million Matching Challenge and included MIRI among the 10 organizations they are supporting!

Give to any of the organizations involved before noon (PST) on December 31 for your donation to be eligible for a dollar-for-dollar match, up to the $1 million limit!

The eligible organizations for matching are:

  • Animal welfare — Effective Altruism Funds’ animal welfare fund, The Good Food Institute
  • Global health and development — Against Malaria Foundation, Schistosomiasis Control Initiative, Helen Keller International’s vitamin A supplementation program, GiveDirectly
  • Global catastrophic risk — MIRI
  • Criminal justice reform — Brooklyn Community Bail Fund, Massachusetts Bail Fund, Just City Memphis

The Matching Challenge’s website lists two options for MIRI donors to get matched: (1) donating on 2017charitydrive.com, or (2) donating directly on MIRI’s website and sending the receipt to receiptsforcharity@gmail.com. We recommend option 2, particularly for US tax residents (because MIRI is a 501(c)(3) organization) and those looking for a wider array of payment methods.


In other news, we’ve hit our first fundraising target ($625,000)!

We’re also happy to announce that we’ve received a $368k bitcoin donation from Christian Calderon, a cryptocurrency enthusiast, and also a donation worth $59k from early bitcoin investor Marius van Voorden.

In total, so far, we’ve received donations valued at $697,638 from 137 distinct donors, 76% of it in the form of cryptocurrency (48% if we exclude Christian’s donation). Thanks as well to Jacob Falkovich for his fundraiser/matching post whose opinion distribution curves plausibly raised over $27k for MIRI this week, including his match.

Our funding drive will be continuing through the end of December, along with the Matching Challenge.



Correction December 17: I previously listed GiveWell as one of the eligible organizations for matching, which is not correct.

ML Living Library Opening

News

The Machine Intelligence Research Institute is looking for a very specialized autodidact to keep us up to date on developments in machine learning—a “living library” of new results.

ML is a fast-moving and diverse field, making it a challenge for any group to stay updated on all the latest and greatest developments. To support our AI alignment research efforts, we want to hire someone to read every interesting-looking paper about AI and machine learning, and keep us abreast of noteworthy developments, including new techniques and insights.

We expect that this will sound like a very fun job to a lot of people! However, this role is important to us, and we need to be appropriately discerning—we do not recommend applying if you do not already have a proven ability in this or neighboring domains.

Our goal is to hire full-time, ideally someone who would be capable of making a multi-year commitment. We intend to pay you to become an expert on the cutting edge of machine learning, and don’t want to make the human capital investment unless you’re interested in working with us long-term.



A reply to Francois Chollet on intelligence explosion

Analysis

This is a reply to Francois Chollet, the inventor of the Keras wrapper for the TensorFlow and Theano deep learning systems, on his essay “The impossibility of intelligence explosion.”

In response to critics of his essay, Chollet tweeted:


If you post an argument online, and the only opposition you get is braindead arguments and insults, does it confirm you were right? Or is it just self-selection of those who argue online?

And he earlier tweeted:


Don’t be overly attached to your views; some of them are probably incorrect. An intellectual superpower is the ability to consider every new idea as if it might be true, rather than merely checking whether it confirms/contradicts your current views.

Chollet’s essay seemed mostly on-point and kept to the object-level arguments. I am led to hope that Chollet is perhaps somebody who believes in abiding by the rules of a debate process, a fan of what I’d consider Civilization; and if his entry into this conversation has been met only with braindead arguments and insults, he deserves a better reply. I’ve tried here to walk through some of what I’d consider the standard arguments in this debate as they bear on Chollet’s statements.

As a meta-level point, I hope everyone agrees that an invalid argument for a true conclusion is still a bad argument. To arrive at the correct belief state we want to sum all the valid support, and only the valid support. To tally up that support, we need to have a notion of judging arguments on their own terms, based on their local structure and validity, and not excusing fallacies if they support a side we agree with for other reasons.

My reply to Chollet doesn’t try to carry the entire case for the intelligence explosion as such. I am only going to discuss my take on the validity of Chollet’s particular arguments. Even if the statement “an intelligence explosion is impossible” happens to be true, we still don’t want to accept any invalid arguments in favor of that conclusion.

Without further ado, here are my thoughts in response to Chollet.


December 2017 Newsletter

Newsletters


Our annual fundraiser is live. Discussed in the fundraiser post:

  • News  — What MIRI’s researchers have been working on lately, and more.
  • Goals — We plan to grow our research team 2x in 2018–2019. If we raise $850k this month, we think we can do that without dipping below a 1.5-year runway.
  • Actual goals — A bigger-picture outline of what we think is the likeliest sequence of events that could lead to good global outcomes.

Our funding drive will be running until December 31st.


Research updates


General updates


MIRI’s 2017 Fundraiser

MIRI Strategy, News

Update 2017-12-27: We’ve blown past our 3rd and final target, and reached the matching cap of $300,000 for the $2 million Matching Challenge! Thanks so much to everyone who supported us!

All donations made before 23:59 PST on Dec 31st will continue to be counted towards our fundraiser total. The fundraiser total includes projected matching funds from the Challenge.


MIRI’s 2017 fundraiser is live through the end of December! Our progress so far (updated live):


$2,504,625 raised in total!

358 donors contributed


MIRI is a research nonprofit based in Berkeley, California with a mission of ensuring that smarter-than-human AI technology has a positive impact on the world. You can learn more about our work at “Why AI Safety?” or via MIRI Executive Director Nate Soares’ Google talk on AI alignment.

In 2015, we discussed our interest in potentially branching out to explore multiple research programs simultaneously once we could support a larger team. Following recent changes to our overall picture of the strategic landscape, we’re now moving ahead on that goal and starting to explore new research directions while also continuing to push on our agent foundations agenda. For more on our new views, see “There’s No Fire Alarm for Artificial General Intelligence” and our 2017 strategic update. We plan to expand on our relevant strategic thinking more in the coming weeks.

Our expanded research focus means that our research team can potentially grow big, and grow fast. Our current goal is to hire around ten new research staff over the next two years, mostly software engineers. If we succeed, our point estimate is that our 2018 budget will be $2.8M and our 2019 budget will be $3.5M, up from roughly $1.9M in 2017.[1]

We’ve set our fundraiser targets by estimating how quickly we could grow while maintaining a 1.5-year runway, on the simplifying assumption that about 1/3 of the donations we receive between now and the beginning of 2019 will come during our current fundraiser.[2]

Hitting Target 1 ($625k) then lets us act on our growth plans in 2018 (but not in 2019); Target 2 ($850k) lets us act on our full two-year growth plan; and in the case where our hiring goes better than expected, Target 3 ($1.25M) would allow us to add new members to our team about twice as quickly, or pay higher salaries for new research staff as needed.

We discuss more details below, both in terms of our current organizational activities and how we see our work fitting into the larger strategy space.



  1. Note that this $1.9M is significantly below the $2.1–2.5M we predicted for the year in April. Personnel costs are MIRI’s most significant expense, and higher research staff turnover in 2017 meant that we had fewer net additions to the team this year than we’d budgeted for. We went under budget by a relatively small margin in 2016, spending $1.73M versus a predicted $1.83M.

    Our 2018–2019 budget estimates are highly uncertain, mostly because we don’t know how quickly we’ll be able to take on new research staff.

  2. This is roughly in line with our experience in previous years, when excluding expected grants and large surprise one-time donations. We’ve accounted for the former in our targets but not the latter, since we think it unwise to bank on unpredictable windfalls.

    Note that in previous years, we’ve set targets based on maintaining a 1-year runway. Given the increase in our size, I now think that a 1.5-year runway is more appropriate. 

Security Mindset and the Logistic Success Curve

Analysis

Follow-up to: Security Mindset and Ordinary Paranoia


(Two days later, Amber returns with another question.)


AMBER:  Uh, say, Coral. How important is security mindset when you’re building a whole new kind of system—say, one subject to potentially adverse optimization pressures, where you want it to have some sort of robustness property?

CORAL:  How novel is the system?

AMBER:  Very novel.

CORAL:  Novel enough that you’d have to invent your own new best practices instead of looking them up?

AMBER:  Right.

CORAL:  That’s serious business. If you’re building a very simple Internet-connected system, maybe a smart ordinary paranoid could look up how we usually guard against adversaries, use as much off-the-shelf software as possible that was checked over by real security professionals, and not do too horribly. But if you’re doing something qualitatively new and complicated that has to be robust against adverse optimization, well… mostly I’d think you were operating in almost impossibly dangerous territory, and I’d advise you to figure out what to do after your first try failed. But if you wanted to actually succeed, ordinary paranoia absolutely would not do it.

AMBER:  In other words, projects to build novel mission-critical systems ought to have advisors with the full security mindset, so that the advisor can say what the system builders really need to do to ensure security.

CORAL:  (laughs sadly)  No.



Security Mindset and Ordinary Paranoia

Analysis

The following is a fictional dialogue building off of AI Alignment: Why It’s Hard, and Where to Start.


(AMBER, a philanthropist interested in a more reliable Internet, and CORAL, a computer security professional, are at a conference hotel together discussing what Coral insists is a difficult and important issue: the difficulty of building “secure” software.)


AMBER:  So, Coral, I understand that you believe it is very important, when creating software, to make that software be what you call “secure”.

CORAL:  Especially if it’s connected to the Internet, or if it controls money or other valuables. But yes, that’s right.

AMBER:  I find it hard to believe that this needs to be a separate topic in computer science. In general, programmers need to figure out how to make computers do what they want. The people building operating systems surely won’t want them to give access to unauthorized users, just like they won’t want those computers to crash. Why is one problem so much more difficult than the other?

CORAL:  That’s a deep question, but to give a partial deep answer: When you expose a device to the Internet, you’re potentially exposing it to intelligent adversaries who can find special, weird interactions with the system that make the pieces behave in weird ways that the programmers did not think of. When you’re dealing with that kind of problem, you’ll use a different set of methods and tools.

AMBER:  Any system that crashes is behaving in a way the programmer didn’t expect, and programmers already need to stop that from happening. How is this case different?

CORAL:  Okay, so… imagine that your system is going to take in one kilobyte of input per session. (Although that itself is the sort of assumption we’d question and ask what happens if it gets a megabyte of input instead, but never mind.) If the input is one kilobyte, then there are 2^8,000 possible inputs, or about 10^2,400 or so. Again, for the sake of extending the simple visualization, imagine that a computer gets a billion inputs per second. Suppose that only a googol, 10^100, out of the 10^2,400 possible inputs, cause the system to behave a certain way the original designer didn’t intend.

If the system is getting inputs in a way that’s uncorrelated with whether the input is a misbehaving one, it won’t hit on a misbehaving state before the end of the universe. If there’s an intelligent adversary who understands the system, on the other hand, they may be able to find one of the very rare inputs that makes the system misbehave. So a piece of the system that would literally never in a million years misbehave on random inputs, may break when an intelligent adversary tries deliberately to break it.
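
Coral’s numbers can be checked with a few lines of arithmetic. The sketch below is not part of the original dialogue. It assumes that a kilobyte means 1,000 bytes of 8 bits, that random inputs are drawn uniformly and independently, and that a year is 365.25 days; the function names are made up for illustration. It works in base-10 logarithms so the huge numbers never have to be written out.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: a 365.25-day year


def log10_input_space(num_bits: int) -> float:
    """Return log10 of the number of distinct inputs that are num_bits long."""
    return num_bits * math.log10(2)


def log10_expected_years(log10_total: float, log10_bad: float,
                         inputs_per_second: float) -> float:
    """Return log10 of the expected years until a random input misbehaves.

    If a fraction p = bad / total of inputs misbehave, the expected number
    of independent uniform tries before the first hit is 1 / p.
    """
    log10_tries = log10_total - log10_bad
    log10_seconds = log10_tries - math.log10(inputs_per_second)
    return log10_seconds - math.log10(SECONDS_PER_YEAR)


# One kilobyte of input, taken here as 8,000 bits.
space = log10_input_space(8_000)

# Coral's rounded figures: 10^100 bad inputs out of 10^2,400,
# arriving at a billion (10^9) inputs per second.
years = log10_expected_years(2_400, 100, 1e9)

print(f"input space is about 10^{math.floor(space)}")
print(f"expected wait is on the order of 10^{math.floor(years)} years")
```

Under these assumptions the input space comes out near 10^2,408, which Coral rounds to 10^2,400, and the expected wait for a random hit is on the order of 10^2,283 years, against a universe that is roughly 10^10 years old. An adversary who understands the system does not pay that cost, because they are not sampling at random.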

AMBER:  So you’re saying that it’s more difficult because the programmer is pitting their wits against an adversary who may be more intelligent than themselves.

CORAL:  That’s an almost-right way of putting it. What matters isn’t so much the “adversary” part as the optimization part. There are systematic, nonrandom forces strongly selecting for particular outcomes, causing pieces of the system to go down weird execution paths and occupy unexpected states. If your system literally has no misbehavior modes at all, it doesn’t matter if you have IQ 140 and the enemy has IQ 160—it’s not an arm-wrestling contest. It’s just very much harder to build a system that doesn’t enter weird states when the weird states are being selected-for in a correlated way, rather than happening only by accident. The weirdness-selecting forces can search through parts of the larger state space that you yourself failed to imagine. Beating that does indeed require new skills and a different mode of thinking, what Bruce Schneier called “security mindset”.

AMBER:  Ah, and what is this security mindset?

CORAL:  I can say one or two things about it, but keep in mind we are dealing with a quality of thinking that is not entirely effable. If I could give you a handful of platitudes about security mindset, and that would actually cause you to be able to design secure software, the Internet would look very different from how it presently does. That said, it seems to me that what has been called “security mindset” can be divided into two components, one of which is much less difficult than the other. And this can fool people into overestimating their own safety, because they can get the easier half of security mindset and overlook the other half. The less difficult component, I will call by the term “ordinary paranoia”.

AMBER:  Ordinary paranoia?

CORAL:  Lots of programmers have the ability to imagine adversaries trying to threaten them. They imagine how likely it is that the adversaries are able to attack them a particular way, and then they try to block off the adversaries from threatening that way. Imagining attacks, including weird or clever attacks, and parrying them with measures you imagine will stop the attack; that is ordinary paranoia.

AMBER:  Isn’t that what security is all about? What do you claim is the other half?

CORAL:  To put it as a platitude, I might say… defending against mistakes in your own assumptions rather than against external adversaries.

Announcing “Inadequate Equilibria”

News

MIRI Senior Research Fellow Eliezer Yudkowsky has a new book out today: Inadequate Equilibria: Where and How Civilizations Get Stuck, a discussion of societal dysfunction, exploitability, and self-evaluation. From the preface:

Inadequate Equilibria is a book about a generalized notion of efficient markets, and how we can use this notion to guess where society will or won’t be effective at pursuing some widely desired goal.

An efficient market is one where smart individuals should generally doubt that they can spot overpriced or underpriced assets. We can ask an analogous question, however, about the “efficiency” of other human endeavors.

Suppose, for example, that someone thinks they can easily build a much better and more profitable social network than Facebook, or easily come up with a new treatment for a widespread medical condition. Should they question whatever clever reasoning led them to that conclusion, in the same way that most smart individuals should question any clever reasoning that causes them to think AAPL stock is underpriced? Should they question whether they can “beat the market” in these areas, or whether they can even spot major in-principle improvements to the status quo? How “efficient,” or adequate, should we expect civilization to be at various tasks?

There will be, as always, good ways and bad ways to reason about these questions; this book is about both.

The book is available from Amazon (in print and Kindle), on iBooks, as a pay-what-you-want digital download, and as a web book at equilibriabook.com. The book has also been posted to Less Wrong 2.0.

The book’s contents are:

1.  Inadequacy and Modesty

A comparison of two “wildly different, nearly cognitively nonoverlapping” approaches to thinking about outperformance: modest epistemology, and inadequacy analysis.

2.  An Equilibrium of No Free Energy

How, in principle, can society end up neglecting obvious low-hanging fruit?

3.  Moloch’s Toolbox

Why does our civilization actually end up neglecting low-hanging fruit?

4.  Living in an Inadequate World

How can we best take into account civilizational inadequacy in our decision-making?

5.  Blind Empiricism

Three examples of modesty in practical settings.

6.  Against Modest Epistemology

An argument against the “epistemological core” of modesty: that we shouldn’t take our own reasoning and meta-reasoning at face value in the face of disagreements or novelties.

7.  Status Regulation and Anxious Underconfidence

On causal accounts of modesty.

Although Inadequate Equilibria isn’t about AI, I consider it one of MIRI’s most important nontechnical publications to date, as it helps explain some of the most basic tools and background models we use when we evaluate how promising a potential project, research program, or general strategy is.