January 2015 Newsletter


 

Machine Intelligence Research Institute

Thanks to the generosity of 80+ donors, we completed our winter 2014 matching challenge, raising $200,000 for our research program. Many, many thanks to all who contributed!

Research Updates

News Updates

Other Updates

  • Eric Horvitz has provided initial funding for a 100-year Stanford program to study the social impacts of artificial intelligence. The white paper lists 18 example research areas, two of which amount to what Nick Bostrom calls the superintelligence control problem, MIRI’s research focus. No word yet on how soon anyone funded through this program will study open questions relevant to superintelligence control.

As always, please don’t hesitate to let us know if you have any questions or comments.

Best,
Luke Muehlhauser
Executive Director

 

Our new technical research agenda overview


Today we release a new overview of MIRI’s technical research agenda, “Aligning Superintelligence with Human Interests: A Technical Research Agenda,” by Nate Soares and Benja Fallenstein. The preferred place to discuss this report is here.

The report begins:

The characteristic that has enabled humanity to shape the world is not strength, not speed, but intelligence. Barring catastrophe, it seems clear that progress in AI will one day lead to the creation of agents meeting or exceeding human-level general intelligence, and this will likely lead to the eventual development of systems which are “superintelligent” in the sense of being “smarter than the best human brains in practically every field” (Bostrom 2014)…

…In order to ensure that the development of smarter-than-human intelligence has a positive impact on humanity, we must meet three formidable challenges: How can we create an agent that will reliably pursue the goals it is given? How can we formally specify beneficial goals? And how can we ensure that this agent will assist and cooperate with its programmers as they improve its design, given that mistakes in the initial version are inevitable?

This agenda discusses technical research that is tractable today, which the authors think will make it easier to confront these three challenges in the future. Sections 2 through 4 motivate and discuss six research topics that we think are relevant to these challenges. Section 5 discusses our reasons for selecting these six areas in particular.

We call a smarter-than-human system that reliably pursues beneficial goals “aligned with human interests” or simply “aligned.” To become confident that an agent is aligned in this way, it is not enough to produce a practical implementation that merely seems to meet the challenges outlined above. It is also necessary to gain a solid theoretical understanding of why that confidence is justified. This technical agenda argues that there is foundational research approachable today that will make it easier to develop aligned systems in the future, and describes ongoing work on some of these problems.

This report also refers to six key supporting papers which go into more detail for each major research problem area:

  1. Corrigibility
  2. Toward idealized decision theory
  3. Questions of reasoning under logical uncertainty
  4. Vingean reflection: reliable reasoning for self-improving agents
  5. Formalizing two problems of realistic world-models
  6. The value learning problem

Update July 15, 2016: Our overview paper is scheduled to be released in the Springer anthology The Technological Singularity: Managing the Journey in 2017, under the new title “Agent Foundations for Aligning Machine Intelligence with Human Interests.” The new title is intended to help distinguish this agenda from another research agenda we’ll be working on in parallel with the agent foundations agenda: “Value Alignment for Advanced Machine Learning Systems.”

New report: “Computable probability distributions which converge…”


Back in July 2013, Will Sawin (Princeton) and Abram Demski (USC) wrote a technical report describing a result from that month’s MIRI research workshop. We are finally releasing that report today. It is titled “Computable probability distributions which converge on believing true Π1 sentences will disbelieve true Π2 sentences.”

Abstract:

It might seem reasonable that, after seeing unboundedly many examples of a true Π1 statement, a rational agent ought to be able to become increasingly confident, converging toward probability 1, that this statement is true. However, we have proven that this, together with some plausible coherence properties, necessarily implies arbitrarily low limiting probabilities assigned to some short true Π2 statements.
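For readers who don’t have the arithmetical hierarchy at their fingertips, the following is standard textbook background (not notation introduced in the report): Π1 and Π2 sentences are those expressible in the normal forms below, where φ and ψ are decidable predicates.

```latex
% Standard arithmetical-hierarchy normal forms; \varphi and \psi are decidable
% predicates. This is textbook background, not notation from the report.
\begin{align*}
  \Pi_1:&\quad \forall n.\ \varphi(n)
      && \text{e.g.\ ``program $M$ never halts''}\\
  \Pi_2:&\quad \forall n.\ \exists m.\ \psi(n,m)
      && \text{e.g.\ ``program $M$ halts on every input''}
\end{align*}
```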

New report: “Toward Idealized Decision Theory”


Today we release a new technical report by Nate Soares and Benja Fallenstein, “Toward idealized decision theory.” If you’d like to discuss the paper, please do so here.

Abstract:

This paper motivates the study of decision theory as necessary for aligning smarter-than-human artificial systems with human interests. We discuss the shortcomings of two standard formulations of decision theory, and demonstrate that they cannot be used to describe an idealized decision procedure suitable for approximation by artificial systems. We then explore the notions of strategy selection and logical counterfactuals, two recent insights into decision theory that point the way toward promising paths for future research.
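As a rough illustration (and not code from the paper), the sketch below contrasts evidential and causal expected-value calculations on Newcomb’s problem, a standard example where such formulations come apart; the predictor accuracy and payoff values are illustrative assumptions.

```python
# A rough illustration (not code from the paper): evidential vs. causal
# expected-value calculations on Newcomb's problem. The predictor accuracy
# and payoffs below are illustrative assumptions, not figures from the report.

ACCURACY = 0.99          # assumed probability that the predictor is correct
SMALL, LARGE = 1_000, 1_000_000

def payoff(action, predicted_one_box):
    """The opaque box holds LARGE iff one-boxing was predicted."""
    big = LARGE if predicted_one_box else 0
    return big if action == "one-box" else big + SMALL

def edt_value(action):
    """Evidential: condition the prediction on the action actually taken."""
    p_predicted_one_box = ACCURACY if action == "one-box" else 1 - ACCURACY
    return (p_predicted_one_box * payoff(action, True)
            + (1 - p_predicted_one_box) * payoff(action, False))

def cdt_value(action, prior_predicted_one_box=0.5):
    """Causal: the prediction is already fixed, so use an action-independent prior."""
    return (prior_predicted_one_box * payoff(action, True)
            + (1 - prior_predicted_one_box) * payoff(action, False))

for act in ("one-box", "two-box"):
    print(f"{act:8s}  EDT: {edt_value(act):12,.0f}   CDT: {cdt_value(act):12,.0f}")
# EDT recommends one-boxing while CDT recommends two-boxing; divergences like
# this are what motivate the search for an idealized decision procedure.
```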

This is the second of six new major reports that describe and motivate MIRI’s current research agenda at a high level. The first was our Corrigibility paper, which was accepted to the AI & Ethics workshop at AAAI-2015. We will also soon be releasing a technical agenda overview document and an annotated bibliography for this emerging field of research.

New report: “Tiling agents in causal graphs”


Today we release a new technical report by Nate Soares, “Tiling agents in causal graphs.”

The report begins:

Fallenstein and Soares [2014] demonstrates that it’s possible for certain types of proof-based agents to “tile” (license the construction of successor agents similar to themselves while avoiding Gödelian diagonalization issues) in environments about which the agent can prove some basic nice properties. In this technical report, we show via a similar proof that causal graphs (with a specific structure) are one such environment. We translate the proof given by Fallenstein and Soares [2014] into the language of causal graphs, and we do this in such a way as to simplify the conditions under which a tiling meliorizer can be constructed.

New paper: “Concept learning for safe autonomous AI”


MIRI research associate Kaj Sotala has released a new paper, accepted to the AI & Ethics workshop at AAAI-2015, titled “Concept learning for safe autonomous AI.”

The abstract reads:

Sophisticated autonomous AI may need to base its behavior on fuzzy concepts such as well-being or rights. These concepts cannot be given an explicit formal definition, but obtaining desired behavior still requires a way to instill the concepts in an AI system. To solve the problem, we review evidence suggesting that the human brain generates its concepts using a relatively limited set of rules and mechanisms. This suggests that it might be feasible to build AI systems that use similar criteria for generating their own concepts, and could thus learn similar concepts as humans do. Major challenges to this approach include the embodied nature of human thought, evolutionary vestiges in cognition, the social nature of concepts, and the need to compare conceptual representations between humans and AI systems.

December newsletter


 

Machine Intelligence Research Institute

MIRI’s winter fundraising challenge has begun! Every donation made to MIRI between now and January 10th will be matched dollar-for-dollar, up to a total of $100,000!

 

Donate now to double your impact while helping us raise up to $200,000 (with matching) to fund our research program.

Research Updates

News Updates

Other Updates

  • Our friends at the Center for Effective Altruism will pay you $1,000 if you introduce them to someone new whom they end up hiring for one of their five open positions.

As always, please don’t hesitate to let us know if you have any questions or comments.

Best,
Luke Muehlhauser
Executive Director