Why MIRI?

MIRI Strategy

I wrote a short profile of MIRI for a forthcoming book on effective altruism. It leaves out many important details, but hits many of the key points pretty succinctly:

The Machine Intelligence Research Institute (MIRI) was founded in 2000 on the premise that creating smarter-than-human artificial intelligence with a positive impact — “Friendly AI” — might be a particularly efficient way to do as much good as possible.

First, because future people vastly outnumber presently existing people, we think that “From a global perspective, what matters most (in expectation) is that we do what is best (in expectation) for the general trajectory along which our descendants develop over the coming millions, billions, and trillions of years.” (See Nick Beckstead’s On the Overwhelming Importance of Shaping the Far Future.)

Second, as an empirical matter, we think that smarter-than-human AI is humanity’s most significant point of leverage on that “general trajectory along which our descendants develop.” If we handle advanced AI wisely, it could produce tremendous goods which endure for billions of years. If we handle advanced AI poorly, it could render humanity extinct. No other future development has more upside or downside. (See Nick Bostrom’s Superintelligence: Paths, Dangers, Strategies.)

Third, we think that Friendly AI research is tractable, urgent, and uncrowded.

Tractable: Our staff researchers and visiting workshop participants tackle open problems in Friendly AI theory, such as: How can we get an AI to preserve its original goals even as it learns new things and modifies its own code? How do we load desirable goals into a self-modifying AI? How do we ensure that advanced AIs will cooperate with each other and with modified versions of themselves? This work is currently at a theoretical stage, but we are making clear conceptual progress, and growing a new community of researchers devoted to solving these problems.

Urgent: Surveys of AI scientists, as well as our own estimates, suggest that smarter-than-human AI will likely be created in the second half of the 21st century, if not sooner. Unfortunately, mathematical challenges such as those we need to solve to build Friendly AI often require several decades of research to overcome, with each new result building on the advances that came before. Moreover, because the invention of smarter-than-human AI is so difficult to predict, it may arrive with surprising swiftness, leaving us with little time to prepare.

Uncrowded: Very few researchers, perhaps fewer than five worldwide, are explicitly devoted to full-time Friendly AI research.

The overwhelming power of machine superintelligence will reshape our world, dominating other causal factors. Our intended altruistic effects on the vast majority of beings who will ever live must largely reach them via the technical design of the first self-improving smarter-than-human AIs. Many ongoing efforts — on behalf of better altruism, better reasoning, better global coordination, etc. — will play a role in this story, but we think it is crucial to also directly address the core challenge: the design of stably self-improving AIs with desirable goals. Failing to solve that problem will render humanity’s other efforts moot.

If our mission appeals to you, you can either fund our research or get involved in other ways.


Thomas Bolander on self-reference and agent introspection

Conversations

Thomas Bolander, Ph.D., is an associate professor at DTU Compute, Technical University of Denmark. His research is in logic and artificial intelligence, with a primary focus on using logic to model human-like planning, reasoning, and problem solving. He is especially interested in modelling social phenomena and social intelligence, with the aim of creating computer systems that can interact intelligently with humans and other computer systems.

Luke Muehlhauser: Bolander (2003) and some of your subsequent work study paradoxes of self-reference in the context of logical/computational agents, as does e.g. Weaver (2013). Do you think your work on these paradoxes will have practical import for AI researchers who are designing computational agents, or are you merely using the agent framework to explore the more philosophical aspects of self-reference?

Read more »

Jonathan Millen on covert channel communication

Conversations

Jonathan Millen started work at the MITRE Corporation in 1969, after graduation from Rensselaer Polytechnic Institute with a Ph.D. in Mathematics. He retired from MITRE in 2012 as a Senior Principal in the Information Security Division. From 1997 to 2004 he enjoyed an interlude as a Senior Computer Scientist in the SRI International Computer Science Laboratory. He has given short courses at RPI Hartford, University of Bologna Summer School, ETH Zurich, and Taiwan University of Science and Technology. He organized the IEEE Computer Security Foundations Symposium (initially a workshop) in 1988, and co-founded (with S. Jajodia) the Journal of Computer Security in 1992. He has held positions as General and Program Chair of the IEEE Security and Privacy Symposium, Chair of the IEEE Computer Society Technical Committee on Security and Privacy, and associate editor of the ACM Transactions on Information and System Security.

The common theme of his computer security interests is the verification of formal specifications of security kernels and cryptographic protocols. At MITRE he supported the DoD Trusted Product Evaluation Program, and later worked on the application of Trusted Platform Modules. He wrote several papers on information flow as applied to covert channel detection and measurement. His 2001 paper (with V. Shmatikov) on the Constraint Solver for protocol analysis received the SIGSAC Test of Time award in 2011. He received the ACM SIGSAC Outstanding Innovation award in 2009.

Luke Muehlhauser: Since you were a relatively early researcher in the field of covert channel communication, I’d like to ask you about the field’s early days, which are usually said to have begun with Lampson (1973). Do you know when the first covert channel attack was uncovered “in the wild”? My impression is that Lampson identified the general problem a couple decades before it was noticed being exploited in the wild; is that right?


Jonathan Millen: We might never know when real covert channel attacks were first noticed, or when they first occurred. When information is stolen by covert channel, the original data is still in place, so the theft can go unnoticed. Even if an attack is discovered, the victims are as reluctant as the perpetrators to acknowledge it. This is certainly the case with classified information, since a known attack is often classified higher than the information it compromises. The only evidence I have of real attacks before 1999 is from Robert Morris (senior), a pioneer in UNIX security, and for a while the Chief Scientist of the National Computer Security Center, which was organizationally within NSA. He stated at a security workshop that there had been real attacks. He wouldn’t say anything more; it was probably difficult enough to get clearance for that much.
Read more »

Wolf Kohn on hybrid systems control

Conversations

Dr. Wolf Kohn is the Chief Scientist at Atigeo, LLC, and a Research Professor in Industrial and Systems Engineering at the University of Washington. He is the founder and co-founder of two successful start-up companies: Clearsight Systems, Corp., and Kohn-Nerode, Inc. Both companies explore applications in the areas of advanced optimal control, rule-based optimization, and quantum hybrid control applied to enterprise problems and nano-material shaping control. Prof. Kohn, with Prof. Nerode of Cornell, established theories and algorithms that initiated the field of hybrid systems. Prof. Kohn has a Ph.D. in Electrical Engineering and Computer Science from MIT, earned at the Laboratory for Information and Decision Systems. Dr. Kohn is the author or coauthor of over 100 refereed papers and 6 book chapters, and, with Nerode and Zabinsky, has written a book on distributed cooperative inferencing. Dr. Kohn holds 10 US and international patents.

Luke Muehlhauser: You co-founded the field of hybrid systems control with Anil Nerode. Anil gave his impressions of the seminal 1990 Pacifica meeting here. What were your own impressions of how that meeting developed? Is there anything in particular you’d like to add to Anil’s account?


Wolf Kohn: The discussion on the first day of the conference centered on the problem of how to incorporate heterogeneous descriptions of complex dynamical systems into a common representation for designing large-scale automation. Almost immediately, Colonel Mettala and others observed that the goal should be to find alternatives to the classic approaches based on combining expert systems with conventional control and system identification techniques.

These approaches did not lead to robust designs. More important, they did not lead to a theory for the systematic treatment of the systems the DoD was deploying at the time. I was working on control architectures based on constraints defined by rules, so after intense discussions among the participants, Nerode and I moved to a corner and came up with a proposal to amalgamate models by extending the concepts of automata theory and optimal control to characterize the evolution of complex dynamical systems in a manifold in which the topology was defined by rules of operation, and behavior constraints and trajectories were generated by variational methods. This was the beginning of what would later be defined as “hybrid systems.”
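For readers unfamiliar with the term, the sketch below illustrates the kind of model that “hybrid system” came to denote: discrete modes governed by switching rules, each mode with its own continuous dynamics. This is only the standard thermostat toy example, not the Kohn-Nerode formalism.

```haskell
-- A minimal sketch of a hybrid system: discrete modes (automata-theoretic)
-- whose switching is governed by rules, combined with continuous dynamics
-- (differential equations) in each mode. Toy thermostat example only.

data Mode = Heating | Cooling deriving (Show, Eq)

type Temp = Double

-- Continuous dynamics in each mode: dT/dt as a function of temperature.
flow :: Mode -> Temp -> Double
flow Heating t = 5.0 - 0.1 * t   -- heater on
flow Cooling t = -0.1 * t        -- heater off, heat loss

-- Discrete transition rules ("guards"): switch mode at thresholds.
switch :: Mode -> Temp -> Mode
switch Heating t | t >= 22.0 = Cooling
switch Cooling t | t <= 18.0 = Heating
switch m _ = m

-- Simulate with a crude forward-Euler step of size dt.
stepHybrid :: Double -> (Mode, Temp) -> (Mode, Temp)
stepHybrid dt (m, t) =
  let t' = t + dt * flow m t
  in (switch m t', t')

main :: IO ()
main = mapM_ print (take 10 (iterate (stepHybrid 0.5) (Heating, 17.0)))
```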
Read more »

MIRI’s April 2014 Newsletter

Newsletters


Research Updates

News Updates

  • We are actively hiring for three positions: math researcher, science writer, and executive assistant. Compensation is competitive and visas are possible.
  • We’re also accepting applications on a rolling basis for future MIRI research workshops. Please apply if you’re a curious academic looking for exposure to our material. If you’re accepted, we’ll contact you about potential workshops over time, as we schedule them.
  • We published our 2013 in Review: Fundraising post.
  • Louie was interviewed at io9 about the unsuitability of Asimov’s Three Laws of Robotics for machine ethics.
  • Luke was interviewed at The Information about MIRI’s research. Unfortunately, the interviewer’s editing introduced several errors: see here.

Other Updates

  • April 11th is your last chance to vote for Nick Bostrom and others in Prospect’s World Thinkers 2014 poll.
  • Oxford University is hosting an effective altruism conference in July. Speakers include Nick Bostrom and Derek Parfit. Register here.
  • Registration is also open for Effective Altruism Summit 2014, in August in the Bay Area.
  • Our frequent collaborators at the Future of Humanity Institute are currently hiring a research fellow in philosophy, with a focus on population ethics.

As always, please don’t hesitate to let us know if you have any questions or comments.

Best,
Luke Muehlhauser
Executive Director

New Report: Botworld

Papers


Today MIRI releases a new technical report: “Botworld 1.0” (pdf) by recent hires Nate Soares and Benja Fallenstein. The report is a “literate” Haskell file, available from MIRI’s Github page.

Soares explains the report on his accompanying Less Wrong post, which is also the preferred place for discussion of the report:

This report introduces Botworld, a cellular automaton that provides a toy environment for studying self-modifying agents.

The traditional agent framework, used for example in Markov Decision Processes and in Marcus Hutter’s universal agent AIXI, splits the universe into an agent and an environment, which interact only via discrete input and output channels.
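To make the contrast concrete, here is a minimal sketch of this traditional “Cartesian” framework (written in Haskell in the spirit of the report, but not code from Botworld or AIXI itself): the agent and the environment are separate programs exchanging discrete observations and actions, and nothing in the environment can reach inside the agent's own computation.

```haskell
-- A minimal sketch (not MIRI's code) of the traditional agent framework:
-- agent and environment are separate processes that interact only through
-- discrete observation and action channels.

type Observation = Int
type Action      = Int

-- The agent maps its history of observations to an action.
type Agent = [Observation] -> Action

-- The environment maps the history of actions to the next observation.
type Environment = [Action] -> Observation

-- Run the interaction loop for a fixed number of steps, returning the
-- sequence of (observation, action) pairs. Nothing the environment does
-- can reach "inside" the agent: the agent's computation sits outside the
-- modelled world.
interact' :: Int -> Agent -> Environment -> [(Observation, Action)]
interact' steps agent env = go steps [] []
  where
    go 0 _ _ = []
    go n obsHist actHist =
      let o = env (reverse actHist)
          a = agent (reverse (o : obsHist))
      in (o, a) : go (n - 1) (o : obsHist) (a : actHist)

main :: IO ()
main = print (interact' 5 (\obs -> sum obs `mod` 3) (\acts -> length acts))
```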

Such formalisms are perhaps ill-suited for real self-modifying agents, which are embedded within their environments. Indeed, the agent/environment separation is somewhat reminiscent of Cartesian dualism: any agent using this framework to reason about the world does not model itself as part of its environment. For example, such an agent would be unable to understand the concept of the environment interfering with its internal computations, e.g. by inducing errors in the agent’s RAM through heat.

Intuitively, this separation does not seem to be a fatal flaw, but merely a tool for simplifying the discussion. We should be able to remove this “Cartesian” assumption from formal models of intelligence. However, the concrete non-Cartesian models that have been proposed (such as Orseau and Ring’s formalism for space-time embedded intelligence, Vladimir Slepnev’s models of updateless decision theory, and Yudkowsky and Herreshoff’s tiling agents) depart significantly from their Cartesian counterparts.

Botworld is a toy example of the type of universe that these formalisms are designed to reason about: it provides a concrete world containing agents (“robots”) whose internal computations are a part of the environment, and allows us to study what happens when the Cartesian barrier between an agent and its environment breaks down. Botworld allows us to write decision problems where the Cartesian barrier is relevant, program actual agents, and run the system.
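As a rough illustration only (this is not the actual Botworld specification, which is given in the report), the toy sketch below shows what it means for an agent to be embedded in its environment: the robot's policy is ordinary data stored in the world state, so the environment can read, and even overwrite, the agent itself.

```haskell
-- A toy illustration (not the actual Botworld specification) of an
-- "embedded" world: the robot's program is ordinary data in the world
-- state, so the environment can modify it.

data Move = MoveLeft | MoveRight | Stay deriving (Show, Eq)

data Robot = Robot
  { position :: Int
  , program  :: Int -> Move   -- the robot's policy, stored in the world state
  }

data World = World
  { cells :: [Int]   -- contents of each grid cell
  , robot :: Robot
  }

-- One step: the robot senses its cell and acts, but the world may also
-- corrupt the robot's program (here: a cell value of 99 wipes the policy),
-- something a Cartesian agent could not even represent happening to itself.
step :: World -> World
step (World cs r) =
  let sensed = cs !! position r
      r'     = if sensed == 99
                 then r { program = const Stay }  -- environment rewrites the agent
                 else r
      newPos = case program r' sensed of
                 MoveLeft  -> max 0 (position r' - 1)
                 MoveRight -> min (length cs - 1) (position r' + 1)
                 Stay      -> position r'
  in World cs (r' { position = newPos })

main :: IO ()
main =
  let w0 = World [0, 99, 0, 0] (Robot 0 (\_ -> MoveRight))
      ws = take 4 (iterate step w0)
  in mapM_ (print . position . robot) ws
```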

As it turns out, many interesting problems arise when agents are embedded in their environment. For example, agents whose source code is readable may be subjected to Newcomb-like problems by entities that simulate the agent and choose their actions accordingly.

Furthermore, certain obstacles to self-reference arise when non-Cartesian agents attempt to achieve confidence in their future actions. Some of these issues are raised by Yudkowsky and Herreshoff; Botworld gives us a concrete environment in which we can examine them.

One of the primary benefits of Botworld is concreteness: when working with abstract problems of self-reference, it is often very useful to see a concrete decision problem (“game”) in a fully specified world that directly exhibits the obstacle under consideration. Botworld makes it easier to visualize these obstacles.

Conversely, Botworld also makes it easier to visualize suggested agent architectures, which in turn makes it easier to visualize potential problems and probe the architecture for edge cases.

Finally, Botworld is a tool for communicating. It is our hope that Botworld will help others understand the varying formalisms for self-modifying agents by giving them a concrete way to visualize such architectures being implemented. Furthermore, Botworld gives us a concrete way to illustrate various obstacles, by implementing Botworld games in which the obstacles arise.

Botworld has helped us gain a deeper understanding of varying formalisms for self-modifying agents and the obstacles they face. It is our hope that Botworld will help others more concretely understand these issues as well.

Paulo Tabuada on program synthesis for cyber-physical systems

Conversations

Paulo Tabuada was born in Lisbon, Portugal, one year after the Carnation Revolution. He received his “Licenciatura” degree in Aerospace Engineering from Instituto Superior Tecnico, Lisbon, Portugal in 1998 and his Ph.D. degree in Electrical and Computer Engineering in 2002 from the Institute for Systems and Robotics, a private research institute associated with Instituto Superior Tecnico. Between January 2002 and July 2003 he was a postdoctoral researcher at the University of Pennsylvania. After spending three years at the University of Notre Dame, as an Assistant Professor, he joined the Electrical Engineering Department at the University of California, Los Angeles, where he established and directs the Cyber-Physical Systems Laboratory.

Paulo Tabuada’s contributions to cyber-physical systems have been recognized by multiple awards including the NSF CAREER award in 2005, the Donald P. Eckman award in 2009 and the George S. Axelby award in 2011. In 2009 he co-chaired the International Conference Hybrid Systems: Computation and Control (HSCC’09) and in 2012 he was program co-chair for the 3rd IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys’12). He also served on the editorial board of the IEEE Embedded Systems Letters and the IEEE Transactions on Automatic Control. His latest book, on verification and control of hybrid systems, was published by Springer in 2009.

Luke Muehlhauser: In “Abstracting and Refining Robustness for Cyber-Physical Systems,” you and your co-author write that:

…we present a design methodology for robust cyber-physical systems (CPS) [which]… captures two intuitive aims of a robust design: bounded disturbances have bounded consequences and the effect of sporadic disturbances disappears as time progresses.

You use an “abstraction and refinement” procedure for this. How does an abstraction and refinement procedure work, in this context?


Paulo Tabuada: Cyber-physical systems are notoriously difficult to design and verify because of the complex interactions between the cyber and the physical components. Although control theorists have developed powerful techniques for designing and analyzing physical components, say described by differential equations, and computer scientists have developed powerful techniques for designing and analyzing cyber components, say described by finite-state models, these techniques are for the most part incompatible. The latter rely on discrete mathematics while the former rely on continuous mathematics. Our approach is based on replacing all the physical components by cyber abstractions so that all the remaining design and verification tasks can be done in the cyber world. The construction of these abstractions is based on rigorous numerical simulation combined with an analysis of the differential equation models to guarantee that the original physical components and their abstractions are equivalent up to a desired precision. Technically, “equivalent up to a desired precision” means approximately bisimilar; intuitively, this means that both models generate the same set of behaviors up to a desired precision.
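As a drastically simplified illustration of the abstraction step (this is only a sketch of the idea, not the symbolic-models machinery from Tabuada's work), one can grid the continuous state space and record, for each grid cell and each quantized input, the cell reached by one simulated step of the dynamics:

```haskell
-- A drastically simplified sketch of abstraction: replace a
-- continuous-state system by a finite "symbolic" model by gridding the
-- state space and recording, for each grid cell and quantized input,
-- which cell one simulated step lands in.

import qualified Data.Map as Map

type State = Double   -- continuous state
type Cell  = Int      -- abstract (symbolic) state: a grid cell index
type Input = Double

-- The physical component: one sampled step of a simple stable system.
dynamics :: State -> Input -> State
dynamics x u = 0.8 * x + u

-- Quantization: map a continuous state to a grid cell of width `eta`.
-- The precision of the abstraction is governed by `eta`.
eta :: Double
eta = 0.5

toCell :: State -> Cell
toCell x = round (x / eta)

cellCenter :: Cell -> State
cellCenter c = fromIntegral c * eta

-- Build the finite transition map: for every cell in a bounded range and
-- every quantized input, simulate from the cell's center and record the
-- successor cell.
abstraction :: [Cell] -> [Input] -> Map.Map (Cell, Input) Cell
abstraction cellRange inputs =
  Map.fromList
    [ ((c, u), toCell (dynamics (cellCenter c) u))
    | c <- cellRange, u <- inputs ]

main :: IO ()
main = mapM_ print (Map.toList (abstraction [-4 .. 4] [-1.0, 0.0, 1.0]))
```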
Read more »

Diana Spears on the safety of adaptive agents

Conversations

Diana Spears is an Owner and Research Scientist at Swarmotics, LLC. Previously, she worked at US government laboratories (Goddard, NIST, NRL) and afterwards she was an Associate Professor of Computer Science at the University of Wyoming. She received both the MS and PhD (1990) degrees in Computer Science from the University of Maryland, College Park.

Dr. Spears’s research interests include machine learning, adaptive swarm robotic sensing networks/grids, computational fluid dynamics based algorithms for multi-robot chemical/biological plume tracing and plume mapping, and behaviorally-assured adaptive and machine learning systems. Dr. Spears pioneered the field of safe adaptive agents with her award-winning (2001 NRL Alan Berman Research Publication Award) publication entitled “Asimovian adaptive agents.” Most recently she and her husband co-edited the book Physicomimetics: Physics-Based Swarm Intelligence, published by Springer-Verlag in 2012.

Luke Muehlhauser: In Spears (2006) and other publications, you’ve discussed the challenge of ensuring the safety of adaptive (learning) agents:

a designer cannot possibly foresee all circumstances that will be encountered by the agent. Therefore, in addition to supplying an agent with a plan, it is essential to also enable the agent to learn and modify its plan to adapt to unforeseen circumstances. The introduction of learning, however, often makes the agent’s behavior significantly harder to predict. The goal of this research is to verify the behavior of adaptive agents. In particular, our objective is to develop efficient methods for determining whether the behavior of learning agents remains within the bounds of prespecified constraints… after learning…

…Our results include proofs that… with respect to important classes of properties… if the [safety] property holds for the agent’s plan prior to learning, then it is guaranteed to still hold after learning. If an agent uses these “safe” learning operators, it will be guaranteed to preserve the properties with no reverification required. This is the best one could hope for in an online situation where rapid response time is critical. For other learning operators and property classes our a priori results are negative. However, for these cases we have developed incremental reverification algorithms that can save time over total reverification from scratch.

What do you mean by “incremental” reverification algorithms, as in the last sentence I quoted?


Diana Spears: Verification (model checking, in particular) consists of proving that a computer program satisfies a desirable property/constraint and, if it does not, a counterexample is provided. In my work, I assume that this program is a (multi)agent plan for action.  In most real-world applications, plans are typically enormous and therefore verification may be quite time-consuming. Suppose the safety property/constraint that agent A must always obey is that “agent A should always be at least M units away from agent B” (to prevent collisions). Let’s assume that initial verification proved that the entire plan (consisting of all action sequences that agent A could ever possibly take) is guaranteed to obey this property in all circumstances. Furthermore, let’s assume that adaptation is required after the agent has been fielded, where adaptation consists of applying a machine learning operator to modify the plan. For example, suppose a specific part of the plan states that agent A should “move from location 3 to location 4 if there is a clear path between 3 and 4, the ground is fairly level (e.g., nowhere higher than X or lower than Y between locations 3 and 4), and if the schedule permits such movement at this time.” Then an example machine learning operator might change the “4” to “6” based on new information about the task being performed.

Note that this learning operator only modifies one condition in one minuscule portion of the entire plan. Therefore, why re-verify that the entire plan still satisfies the desired property after learning? Why not re-verify only the specific portion of the plan that was modified, as well as any parts of the plan that depend on the modified portion? That is what “incremental re-verification” does. It localizes post-adaptation verification to only those parts of the plan that are essential to re-verify. In doing so, it improves the time complexity of re-verification. Time complexity is a very important and practical consideration for online systems – especially those that operate in real-time or in time-critical situations. In my research I ran numerous experiments comparing the CPU time of re-verifying an entire plan versus localized incremental re-verification of the plan after learning. My results showed as much as a 1/2-billion-fold speedup using incremental re-verification! And that’s using a plan that’s tiny compared to what agents would typically use in the real world.
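As a toy illustration of that contrast (this is not Dr. Spears's actual algorithms, just a sketch of the idea), the code below re-checks, after a learning operator edits one plan step, only that step and the steps that depend on it, rather than the whole plan:

```haskell
-- Toy contrast between total re-verification and incremental
-- re-verification of a plan after a learning operator edits one step.

import qualified Data.Map as Map
import qualified Data.Set as Set

type StepId = Int

data PlanStep = PlanStep
  { target    :: Int        -- e.g. "move to location `target`"
  , dependsOn :: [StepId]   -- steps whose outcome this step relies on
  } deriving (Show)

type Plan = Map.Map StepId PlanStep

-- The safety property, checked one step at a time (a stand-in for a
-- model checker): here, "never move to a location beyond 5".
stepIsSafe :: PlanStep -> Bool
stepIsSafe s = target s <= 5

-- Total re-verification: check every step of the plan.
verifyAll :: Plan -> Bool
verifyAll plan = all stepIsSafe (Map.elems plan)

-- Incremental re-verification: check only the modified step and the
-- steps that (transitively) depend on it.
affected :: Plan -> StepId -> Set.Set StepId
affected plan changed = go (Set.singleton changed)
  where
    go frontier =
      let dependents = Set.fromList
            [ sid | (sid, s) <- Map.toList plan
                  , any (`Set.member` frontier) (dependsOn s) ]
          next = Set.union frontier dependents
      in if next == frontier then frontier else go next

verifyIncremental :: Plan -> StepId -> Bool
verifyIncremental plan changed =
  all stepIsSafe [ plan Map.! sid | sid <- Set.toList (affected plan changed) ]

main :: IO ()
main = do
  let plan  = Map.fromList [ (1, PlanStep 3 []), (2, PlanStep 4 [1]), (3, PlanStep 2 []) ]
      -- a learning operator changes step 2's target from 4 to 6
      plan' = Map.insert 2 (PlanStep 6 [1]) plan
  print (verifyAll plan')            -- re-checks all three steps: False
  print (verifyIncremental plan' 2)  -- re-checks only step 2 and its dependents: False
```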
Read more »