MIRI’s March 2014 Newsletter

Machine Intelligence Research Institute

Research Updates

News Updates

Other Updates

  • Video of the inaugural lectures of the Center for the Study of Existential Risk at Cambridge University.

As always, please don’t hesitate to let us know if you have any questions or comments.

Best,
Luke Muehlhauser
Executive Director

Recent Hires at MIRI

MIRI is proud to announce several new team members (see our Team page for more details):

Benja Fallenstein attended four of MIRI’s past workshops, and has contributed to several novel results in Friendly AI theory, including Löbian cooperation, parametric polymorphism, and “Fallenstein’s monster.” Her research focus is Friendly AI theory.

Nate Soares worked through much of MIRI’s course list in time to attend MIRI’s December 2013 workshop, where he demonstrated his ability to contribute to the research program in a variety of ways, including writing. He and Fallenstein are currently collaborating on several papers in Friendly AI theory.

Robby Bensinger works part-time for MIRI, describing open problems in Friendly AI in collaboration with Eliezer Yudkowsky. His current project is to explain the open problem of naturalized induction.

Katja Grace has also been hired in a part-time role to study questions related to the forecasting part of MIRI’s research program. She previously researched and wrote Algorithmic Progress in Six Domains for MIRI.

MIRI continues to collaborate on a smaller scale with many other valued researchers, including Jonah Sinick, Vipul Naik, and our many research associates.

If you’re interested in joining our growing team, apply to attend a future MIRI research workshop. We’re also still looking to fill several non-researcher positions.

Toby Walsh on computational social choice

Toby Walsh is a professor of artificial intelligence at NICTA and the University of New South Wales. He has served as Scientific Director of NICTA, Australia’s centre of excellence for ICT research. He has also held research positions in England, Scotland, Ireland, France, Italy, Sweden and Australia. He has been Editor-in-Chief of the Journal of Artificial Intelligence Research, and of AI Communications. He is Editor of the Handbook of Constraint Programming, and of the Handbook of Satisfiability.

Luke Muehlhauser: In Rossi et al. (2011), you and your co-authors quickly survey a variety of methods in computational social choice, including methods for preference aggregation, e.g. voting rules. In Narodytska et al. (2012), you and your co-authors examine the issue of combining voting rules to perform a run-off between the different winners of each voting rule. What do you think are some plausible practical applications of this work — either soon or after further theoretical development?


Toby Walsh: As humans, we’re all used to voting: voting for our politicians, or voting for where to go out. In the near future, we’ll hand over some of that responsibility to computational agents that will help organize our lives. Think Siri on steroids. In such situations, we often have many choices as there can be a combinatorial number of options. This means we need to consider computational questions: How do we get computer(s) to work with such rich decision spaces? How do we efficiently collect and represent users’ preferences?

I should note that computer systems are already voting. The SCATS system for controlling traffic lights has the controllers of different intersections vote for what should be the common cycle time for the lights. Similarly, the Space Shuttle had 5 control computers which voted on whose actions to follow.

Computational social choice is, however, more than just voting. It covers many other uses of preferences. Preferences are used to allocate scarce resources. I prefer, for example, a viewing slot on this expensive telescope when the moon is high from the horizon. Preferences are also used to allocate people to positions. I prefer, for example, to be matched to a hospital with a good pediatrics department. Lloyd Shapley won the Nobel Prize in Economics recently for looking at such allocation problems. There are many appealing applications in areas like kidney transplantation and school choice.

One interesting thing we’ve learnt from machine learning is that you often make better decisions when you combine the opinions of several methods. It’s therefore likely that we’ll get better results by combining together voting methods. For this reason, we’ve been looking at how voting rules combine together.
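To make the idea of combining voting rules concrete, here is a minimal Python sketch of a run-off between the winners of two rules. It is an illustration only, not the construction analysed in Narodytska et al. (2012): the choice of plurality and Borda as the two rules, the alphabetical tie-breaking, the function names, and the example ballots are all assumptions made for this example.

```python
from collections import Counter
from typing import Callable, List, Sequence

Ballot = Sequence[str]  # a complete ranking, most preferred candidate first


def plurality_winner(ballots: List[Ballot]) -> str:
    """Candidate with the most first-place votes (ties broken alphabetically)."""
    counts = Counter(ballot[0] for ballot in ballots)
    return min(counts, key=lambda c: (-counts[c], c))


def borda_winner(ballots: List[Ballot]) -> str:
    """Candidate with the highest Borda score (ties broken alphabetically)."""
    scores: Counter = Counter()
    for ballot in ballots:
        m = len(ballot)
        for position, candidate in enumerate(ballot):
            # Top position earns m - 1 points, last position earns 0.
            scores[candidate] += m - 1 - position
    return min(scores, key=lambda c: (-scores[c], c))


def runoff_between_winners(
    ballots: List[Ballot],
    rules: Sequence[Callable[[List[Ballot]], str]],
) -> str:
    """Run each rule, then hold a plurality run-off among the distinct winners."""
    finalists = sorted({rule(ballots) for rule in rules})
    if len(finalists) == 1:
        return finalists[0]
    # Each ballot counts for the finalist it ranks highest
    # (assumes every ballot ranks every candidate).
    counts = Counter(
        next(c for c in ballot if c in finalists) for ballot in ballots
    )
    return min(finalists, key=lambda c: (-counts[c], c))


# Hypothetical profile of seven voters over candidates A, B and C.
example_ballots: List[Ballot] = (
    [["A", "B", "C"]] * 3 + [["B", "C", "A"]] * 2 + [["C", "B", "A"]] * 2
)
```

On this made-up profile, plurality picks A (three first-place votes) while Borda picks B (9 points against 6 each for A and C), and the run-off between the two goes to B by four ballots to three.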

Read more »

Randall Larsen and Lynne Kidder on USA bio-response

Colonel Randall Larsen, USAF (Ret), is the National Security Advisor at the UPMC Center for Health Security, and a Senior Fellow at the Homeland Security Policy Institute, George Washington University. He previously served as the Executive Director of the Commission on the Prevention of Weapons of Mass Destruction Proliferation and Terrorism (2009-2010); the Founding Director and CEO of the Bipartisan WMD Terrorism Research Center (2010-2012); the Founding Director of the ANSER Institute for Homeland Security (2000-2003); and the chairman of the Department of Military Strategy and Operations at the National War College (1998-2000).

Lynne Kidder is the former President of the Bipartisan WMD Terrorism Research Center (the WMD Center) and was the principal investigator for the Center’s Bio-Response Report Card. She is currently a Boulder, CO-based consultant, a research affiliate with the University of Colorado’s Natural Hazards Center, and also serves as the co-chair of the Institute of Medicine’s Forum on Medical and Public Health Preparedness for Catastrophic Events. Her previous positions include Sr. Vice President at Business Executives for National Security, Senior Advisor to the Center for Excellence in Disaster Management and Humanitarian Assistance (US Pacific Command), and eight years as professional staff in the U.S. Senate.

Luke Muehlhauser: Your Bio-Response Report Card assesses the USA’s bio-response capabilities. Before we explore your findings, could you say a bit about how the report was produced, and what motivated its creation?


Randall Larsen: The 9/11 Commission recommended that a separate commission examine the terrorism threat from weapons of mass destruction (WMD). The bipartisan leadership in the Senate and House asked former US Senators Bob Graham (D-FL) and Jim Talent (R-MO) to head the Congressional Commission on the Prevention of Weapons of Mass Destruction Proliferation and Terrorism (WMD Commission). The WMD Commission completed its work in December 2008 and published a report, World at Risk. In March 2009, the bipartisan leadership of Congress asked Senators Graham and Talent to re-establish the Commission to continue its work and provide a report card on progress. This was the first Congressional Commission to be extended for a second year.

I became the Executive Director for the WMD Commission’s second year, and in January 2010, the Commission released a WMD Report Card assessing 37 aspects of the WMD threat. The grades ranged from A’s to F’s. The failing grade that received the most attention, both on Capitol Hill and in the press, was the F grade for “preparedness to respond to a biological attack.”

At the commissioners’ final meeting in December 2009, they encouraged Senators Graham and Talent to continue their work with a focus on the biological threat. To do so, a not-for-profit organization (501c3) was created in March 2010, The Bipartisan WMD Terrorism Research Center (WMD Center). Senators Graham and Talent agreed to serve on the board of advisors, and I became the CEO and recruited Lynne Kidder to serve as the President.

Launching the project was a bit of a challenge, since many of the traditional national security organizations that support such work were solely focused on the nuclear threat—a reflection of the Congressional perspective. The legislation that created the WMD Commission had not contained the words bioterrorism or biology—ironic since World at Risk clearly identified bioterrorism as the most likely WMD threat.

We began work on the Bio-Response Report Card in January 2011 by recruiting a world-class team of senior advisors. They included a former Deputy Administrator of the Food and Drug Administration, a former Special Assistant to the President for Biodefense, the Director of Disaster Response at the American Medical Association, the VP and Director of RAND Health, the Founding President of the Sabin Vaccine Institute, and experts in the fields of public health, emergency medicine, and environmental remediation.

The Board of Advisors helped inform the methodology of the project, helped define the categories of bio-response, and then proposed metrics, in the form of questions, by which to assess capabilities in each category.

Read more »

John Ridgway on safety-critical systems

John Ridgway studied physics at the University of Newcastle Upon Tyne and Sussex University before embarking upon a career in software engineering. As part of that career he worked for 28 years in the field of Intelligent Transport Systems (ITS), undertaking software quality management and systems safety engineering roles on behalf of his employer, Serco Transportation Systems. In particular, John provided design assurance for Serco’s development of the Stockholm Ring Road Central Technical System (CTS) for the Swedish National Roads Administration (SNRA), safety analysis and safety case development for Serco’s M42 Active Traffic Management (ATM) Computer Control System for the UK Highways Agency (HA), and safety analysis for the National Traffic Control Centre (NTCC) for the HA.

John is a regular contributor to the Safety Critical Systems Club (SCSC) Newsletter, in which he encourages fellow practitioners to share his interest in the deeper issues associated with the conceptual framework encapsulated by the terms ‘uncertainty’, ‘chance’ and ‘risk’. Although now retired, John recently received the honour of providing the after-banquet speech for the SCSC 2014 Annual Symposium.

Luke Muehlhauser: What is the nature of your expertise and interest in safety engineering?


John Ridgway: I am not an expert and I would not wish to pass myself off as one. I am, instead, a humble practitioner, and a retired one at that. Having been educated as a physicist, I started my career as a software engineer, rising eventually to a senior position within Serco Transportation Systems, UK, in which I was responsible for ensuring the establishment and implementation of processes designed to foster and demonstrate the integrity of computerised systems. The systems concerned (road traffic management systems) were not, initially, considered to be safety-related, and so lack of integrity in the delivered product was held to have little more than a commercial or political significance. However, following a change of safety policy within the procurement departments of the UK Highways Agency, I recognised that a change of culture would be required within my organisation, if it were to continue as an approved supplier.

If there is any legitimacy in my contributing to this forum, it is this: Even before safety had become an issue, I had always felt that the average practitioner’s track record in the management of risk would benefit greatly from taking a closer interest in (what some may deem to be) philosophical issues. Indeed, over the years, I became convinced that many of the factors that have hampered software engineering’s development into a mature engineering discipline (let’s say on a par with civil or mechanical engineering) have, at their root, a failure to openly address such issues. I believe the same could also be said with regard to functional safety engineering. The heart of the problem lies in the conceptual framework encapsulated by the terms ‘uncertainty’, ‘chance’ and ‘risk’, all of which appear to be treated by practitioners as intuitive when, in fact, none of them are. This is not an academic concern, since failure to properly apprehend the deeper significance of this conceptual framework can, and does, lead practitioners towards errors of judgement. If I were to add to this the accusation that practitioners habitually fail to appreciate the extent to which their rationality is undermined by cognitive biases, then I feel there is more than enough justification for insisting that they pay more attention to what is going on in the world of academia and research organisations, particularly in the fields of cognitive science, decision theory and, indeed, neuroscience. This, at least, became my working precept.

Read more »

David Cook on the VV&A process

Dr. David A. Cook is Associate Professor of Computer Science at Stephen F. Austin State University, where he teaches Software Engineering, Modeling and Simulation, and Enterprise Security. Prior to this, he was Senior Research Scientist and Principal Member of the Technical Staff at AEgis Technologies, working as a Verification, Validation, and Accreditation agent supporting the Airborne Laser. Dr. Cook has over 40 years’ experience in software development and management. He was an associate professor and department research director at USAF Academy and former deputy department head of Software Professional Development Program at AFIT. He has been a consultant for the Software Technology Support Center, Hill AFB, UT for 19 years.

Dr. Cook has a Ph.D. in Computer Science from Texas A&M University, is a Team Chair for ABET, Past President for the Society for Computer Simulation, International, and Chair of ACM SIGAda.

Luke Muehlhauser: In various articles and talks (e.g. Cook 2006), you’ve discussed the software verification, validation, and accreditation (VV&A) process. Though the general process is used widely, the VV&A term is often used when discussing projects governed by DoD 5000.61. Can you explain to whom DoD 5000.61 applies, and how it is used in practice?


David A Cook: DOD 5000.61 applies to all Department of Defense activities involving modeling and simulation. For all practical purposes, it applies to all models and simulations that are used by the DOD. This implies that it also applies to all models and simulations created by civilian contractors that are used for DOD purposes.

The purpose of the directive, aside from specifying who is the “accreditation authority” (more on this later), is to require Verification and Validation for all models and simulations, and then also to require that each model and simulation be accredited for its intended use. This is the critical part, as verification and validation has almost universally been a part of software development within the DOD. Verification asks the question “Are we building the system in a quality manner?”, or “are we building the system right?”. Verification, in a model (and the resulting execution of the model providing a simulation), goes a bit further and asks the question “Does the model build and the results of the simulation actually represent the conceptual design and specifications of the system we built?” The difference is that in a model and simulation, you have to show that your design and specifications of the system you envision are correctly translated into code, and that the data provided to the code also matches specification.

Validation asks the question “are we building a system that meets the users’ actual needs?”, or “are we building the right system?” Again, the validation of a model and resulting simulation is a bit more complex than non-M&S “validation”. In modeling and simulation, validation has to show that the model and the simulation both accurately represent the “real world” from the perspective of the intended use.

These two activities are extremely difficult when you are building models and providing simulation results for notional systems that might not actually exist in the real world.  For example, it would be difficult to provide V&V for a manned Mars mission, because, in the real world, there is not a manned Mars lander yet!   Therefore, for notional systems, V&V might require estimation and guesswork.  However, guesswork and estimation might be the best you can do!

5000.61 further requires that there be an ultimate authority, the “accreditation authority”, that is willing to say “based on the Verification and Validation performed on this model, I certify that it provides answers that are acceptable for its intended use”. Again, if you are building a notional system, this requires experts to say “These are guesses, but they are the best guesses available, and the system is as close a model to the real world as possible. We accredit this system to provide simulation results that are acceptable.” If, for example, an accredited simulation shows that a new proposed airplane would be able to carry 100,000 pounds of payload, but the resulting airplane, once built, can only carry 5,000 pounds, the accreditation authority would certainly bear some of the blame for the problem.

In practice, there are processes for providing VV&A. Military Standard 3022 provides a standard template for recording VV&A activities, and many DOD agencies have their own VV&A repository where common models and simulation VV&A artifacts (and associated documentation) are kept.

There are literally hundreds of ways to verify and validate a model (and its associated simulation execution). The V&V “agents” (who have been tasked with performing V&V) provide a recommendation to the Accreditation Authority, listing what are acceptable uses, and (the critical part) the limits of the model and simulation. For example, a model and simulation might provide an accurate representation of the propagation of a laser beam (in the upper atmosphere) during daylight hours, but not be a valid simulation at night, due to temperature-related atmospheric propagation. The same model and simulation might be a valid predictor of a laser bouncing off of a “flat surface”, but not bouncing off of uneven terrain.

Read more »

Robert Constable on correct-by-construction programming

Robert L. Constable heads the Nuprl research group in automated reasoning and software verification, and joined the Cornell faculty in 1968. He has supervised over forty PhD students in computer science, including the very first graduate of the CS department. He is known for his work connecting programs and mathematical proofs, which has led to new ways of automating the production of reliable software. He has written three books on this topic as well as numerous research articles. Professor Constable is a graduate of Princeton University where he worked with Alonzo Church, one of the pioneers of computer science.

Luke Muehlhauser: In some of your work, e.g. Bickford & Constable (2008), you discuss “correct-by-construction” and “secure-by-construction” methods for designing programs. Could you explain what such methods are, and why they are used?

Read more »

Armando Tacchella on Safety in Future AI Systems

Armando Tacchella is Associate Professor of Information Systems at the Faculty of Engineering, at the University of Genoa. He obtained his Ph.D in Electrical and Computer Engineering from the University of Genoa in 2001 and his “Laurea” (M.Sc equivalent) in Computer Engineering in 1997. His teaching activities include graduate courses in AI, formal languages, compilers, and machine learning as well as undergraduate courses in design and analysis of algorithms. His research interests are mainly in the field of AI, with a focus on systems and techniques for automated reasoning and machine learning, and applications to modeling, verification and monitoring of cyber-physical systems. His recent publications focus on improving the dependability of complex control architectures using formal methods, from the design stage to the operational stage of the system. He has published more than forty papers in international conferences and journals including AAAI, IJCAI, CAV, IJCAR, JAI, JAIR, IEEE-TCAD. In 2007 the Italian Association of Artificial Intelligence (AI*IA) awarded him the “Marco Somalvico” prize for the best young Italian researcher in AI.

Luke Muehlhauser: My summary of Transparency in Safety-Critical Systems was:

Black box testing can provide some confidence that a system will behave as intended, but if a system is built such that it is transparent to human inspection, then additional methods of reliability verification are available. Unfortunately, many of AI’s most useful methods are among its least transparent. Logic-based systems are typically more transparent than statistical methods, but statistical methods are more widely used. There are exceptions to this general rule, and some people are working to make statistical methods more transparent.

The last sentence applies to a 2009 paper you co-authored with Luca Pulina, in which you show formal guarantees about the behavior of a trained multi-layer perceptron (MLP). Could you explain roughly how that works, and what kind of guarantees you were able to prove?

Read more »