John Ridgway on safety-critical systems

 |   |  Conversations

John Ridgway portraitJohn Ridgway studied physics at the University of Newcastle Upon Tyne and Sussex University before embarking upon a career in software engineering. As part of that career he worked for 28 years in the field of Intelligent Transport Systems (ITS), undertaking software quality management and systems safety engineering roles on behalf of his employer, Serco Transportation Systems. In particular, John provided design assurance for Serco’s development of the Stockholm Ring Road Central Technical System (CTS) for the Swedish National Roads Administration (SNRA), safety analysis and safety case development for Serco’s M42 Active Traffic Management (ATM) Computer Control System for the UK Highways Agency (HA), and safety analysis for the National Traffic Control Centre (NTCC) for the HA.

John is a regular contributor to the Safety Critical Systems Club (SCSC) Newsletter, in which he encourages fellow practitioners to share his interest in the deeper issues associated with the conceptual framework encapsulated by the terms ‘uncertainty’, ‘chance’ and ‘risk’. Although now retired, John recently received the honour of providing the after-banquet speech for the SCSC 2014 Annual Symposium.

Luke Muehlhauser: What is the nature of your expertise and interest in safety engineering?


John Ridgway: I am not an expert and I would not wish to pass myself off as one. I am, instead, a humble practitioner, and a retired one at that. Having been educated as a physicist, I started my career as a software engineer, rising eventually to a senior position within Serco Transportation Systems, UK, in which I was responsible for ensuring the establishment and implementation of processes designed to foster and demonstrate the integrity of computerised systems. The systems concerned (road traffic management systems) were not, initially, considered to be safety-related, and so lack of integrity in the delivered product was held to have little more than a commercial or political significance. However, following a change of safety policy within the procurement departments of the UK Highways Agency, I recognised that a change of culture would be required within my organisation, if it were to continue as an approved supplier.

If there is any legitimacy in my contributing to this forum, it is this: Even before safety had become an issue, I had always felt that the average practitioner’s track record in the management of risk would benefit greatly from taking a closer interest in (what some may deem to be) philosophical issues. Indeed, over the years, I became convinced that many of the factors that have hampered software engineering’s development into a mature engineering discipline (let’s say on a par with civil or mechanical engineering) have at their root, a failure to openly address such issues. I believe the same could also be said with regard to functional safety engineering. The heart of the problem lies in the conceptual framework encapsulated by the terms ‘uncertainty’, ‘chance’ and ‘risk’, all of which appear to be treated by practitioners as intuitive when, in fact, none of them are. This is not an academic concern, since failure to properly apprehend the deeper significance of this conceptual framework can, and does, lead practitioners towards errors of judgement. If I were to add to this the accusation that practitioners habitually fail to appreciate the extent to which their rationality is undermined by cognitive biases, then I feel there is more than enough justification for insisting that they pay more attention to what is going in the world of academia and research organisations, particularly in the fields of cognitive science, decision theory and, indeed, neuroscience. This, at least, became my working precept.

Read more »

David Cook on the VV&A process

 |   |  Conversations

Emil Vassev portraitDr. David A. Cook is Associate Professor of Computer Science at Stephen F. Austin State University, where he teaches Software Engineering, Modeling and Simulation, and Enterprise Security. Prior to this, he was Senior Research Scientist and Principal Member of the Technical Staff at AEgis Technologies, working as a Verification, Validation, and Accreditation agent supporting the Airborne Laser. Dr. Cook has over 40 years’ experience in software development and management. He was an associate professor and department research director at USAF Academy and former deputy department head of Software Professional Development Program at AFIT. He has been a consultant for the Software Technology Support Center, Hill AFB, UT for 19 years.

Dr. Cook has a Ph.D. in Computer Science from Texas A&M University, is a Team Chair for ABET, Past President for the Society for Computer Simulation, International, and Chair of ACM SIGAda.

Luke Muehlhauser: In various articles and talks (e.g. Cook 2006), you’ve discussed the software verification, validation, and accreditation (VV&A) process. Though the general process is used widely, the VV&A term is often used when discussing projects governed by DoD 5000.61. Can you explain to whom DoD 5000.61 applies, and how it is used in practice?


David A Cook: DOD 5000.81 applies to all Department of Defense activities involving modeling and simulation.  For all practical purposes, it applies to all models and simulations that are used by the DOD.  This implies that it also applies to all models and simulations created by civilian contractors that are used for DOD purposes.

The purpose of the directive, aside from specifying who is the “accreditation authority” (more on this later) is to require Verification and Validation for all models and simulation, and then also to require that each model and simulation by accredited for its intended use.  This is the critical part, as verification and validation has almost universally been a part of software development within the DOD.  Verification asks the question “Are we building the system in a quality manner?”, or “are we building the system right?”.  Verification, in a model (and the resulting execution of the model providing a simulation) goes a bit further – and asks the question “Does the model build and the results of the simulation actually represent the conceptual design and specifications of the system we built?”  The difference is that in a model and simulation, you have to show that your design and specifications of the system you envision are correctly translated into code, and that the data provided to the code also matches specification.

Validation asks the question “are we building a system that meets the users’ actual needs?”, or “are we building the right system?”  Again, the verification of a model and resulting simulation is a bit more complex than non-M&S ”verification”.  In modeling and simulation, verification has to show that the model and the simulation both accurately represent the “real world” from the perspective of the intended use.

These two activities are extremely difficult when you are building models and providing simulation results for notional systems that might not actually exist in the real world.  For example, it would be difficult to provide V&V for a manned Mars mission, because, in the real world, there is not a manned Mars lander yet!   Therefore, for notional systems, V&V might require estimation and guesswork.  However, guesswork and estimation might be the best you can do!

5000.61 further requires that there be an ultimate authority, the “accreditation authority”, that is willing to say “based on the Verification and Validation performed on this model, I certify that it provides answers that are acceptable for its intended use”.  Again, if you are building a notional system, this requires experts to say “These are guesses, but they are the best guesses available, and the system is as close a model to the real world as possible.  We accredit this system to provide simulation results that are acceptable.”   If, for example, an accredited  simulation shows that a new proposed airplane would be able to carry 100,000 pounds of payload – but the result airplane, once built, can only carry 5,000 pounds – the accreditation authority would certainly bear some of the blame for the problem.

In practice, there are process for providing VV&A.  Military Standard  3022 provides a standard template for recording VV&A activities, and many DOD agencies have their own VV&A repository where common models and simulation VV&A artifacts (and associated documentation) are kept.

There are literally hundreds of ways to verify and validate a model (and it’s associated simulation execution).  The V&V “agents” (who have been tasked with performing V&V) provide a recommendation to the Accreditation Authority, listing what are acceptable uses, and (the critical part) the limits of the model and simulation.  For example, a model and simulation might provide an accurate representation of the propagation of a laser beam (in the upper atmosphere) during daylight hours, but not be a valid simulation at night, due to temperature-related atmospheric propagation.  The same model and simulation might be a valid predictor of a laser bouncing off of a “flat surface”, but not bouncing off of uneven terrain.

Read more »

Robert Constable on correct-by-construction programming

 |   |  Conversations

Robert L. Constable heads the Nuprl research group in automated reasoning and software verification, and joined the Cornell faculty in 1968. He has supervised over forty PhD students in computer science, including the very first graduate of the CS department. He is known for his work connecting programs and mathematical proofs, which has led to new ways of automating the production of reliable software. He has written three books on this topic as well as numerous research articles. Professor Constable is a graduate of Princeton University where he worked with Alonzo Church, one of the pioneers of computer science.

Luke Muehlhauser: In some of your work, e.g. Bickford & Constable (2008), you discuss “correct-by-construction” and “secure-by-construction” methods for designing programs. Could you explain what such methods are, and why they are used?

Read more »

Armando Tacchella on Safety in Future AI Systems

 |   |  Conversations

Armando Tacchella is Associate Professor of Information Systems at the Faculty of Engineering, at the University of Genoa. He obtained his Ph.D in Electrical and Computer Engineering from the University of Genoa in 2001 and his “Laurea” (M.Sc equivalent) in Computer Engineering in 1997. His teaching activities include graduate courses in AI, formal languages, compilers, and machine learning as well as undergraduate courses in design and analysis of algorithms. His research interest are mainly in the field of AI, with a focus on systems and techniques for automated reasoning and machine learning, and applications to modeling, verification and monitoring of cyber-physical systems. His recent publications focus on improving the dependability of complex control architectures using formal methods, from the design stage till the operational stage of the system. He has published more than forty papers in international conferences and journals including AAAI, IJCAI, CAV, IJCAR, JAI, JAIR, IEEE-TCAD. In 2007 he was awarded by the Italian Association of Artificial Intelligence (AI*IA) the prize “Marco Somalvico” for the best young Italian researcher in AI.

Luke Muehlhauser: My summary of Transparency in Safety-Critical Systems was:

Black box testing can provide some confidence that a system will behave as intended, but if a system is built such that it is transparent to human inspection, then additional methods of reliability verification are available. Unfortunately, many of AI’s most useful methods are among its least transparent. Logic-based systems are typically more transparent than statistical methods, but statistical methods are more widely used. There are exceptions to this general rule, and some people are working to make statistical methods more transparent.

The last sentence applies to a 2009 paper you co-authored with Luca Pulina, in which you show formal guarantees about the behavior of a trained multi-layer perceptron (MLP). Could you explain roughly how that works, and what kind of guarantees you were able to prove?

Read more »

Anders Sandberg on Space Colonization

 |   |  Conversations

Anders Sandberg works at the Future of Humanity Institute, a part of the Oxford Martin School and the Oxford University philosophy faculty. Anders’ research at the FHI centres on societal and ethical issues surrounding human enhancement, estimating the capabilities and underlying science of future technologies, and issues of global catastrophic risk. In particular he has worked on cognitive enhancement, whole brain emulation and risk model uncertainty. He is senior researcher for the FHI-Amlin Research Collaboration on Systemic Risk of Modelling, a unique industry collaboration investigating how insurance modelling contributes to or can mitigate systemic risks.

Anders has a background in computer science and neuroscience. He obtained his Ph.D in computational neuroscience from Stockholm University, Sweden, for work on neural network modelling of human memory. He is co-founder and writer for the think tank Eudoxa, and a regular participant in international public debates about emerging technology.

Luke Muehlhauser: In your paper with Stuart Armstrong, “Eternity in Six Hours,” you run through a variety of calculations based on known physics, and show that “Given certain technological assumptions, such as improved automation, the task of constructing Dyson spheres, designing replicating probes, and launching them at distant galaxies, become quite feasible. We extensively analyze the dynamics of such a project, including issues of deceleration and collision with particles in space.”

You frame the issue in terms of the Fermi paradox, but I’d like to ask about your paper from the perspective of “How hard would it be for an AGI-empowered, Earth-based civilization to colonize the stars?”

In section 6.3. you comment on the robustness of the result:

In the estimation of the authors, the assumptions on intergalactic dust and on the energy efficiency of the rockets represent the most vulnerable part of the whole design; small changes to these assumptions result in huge increases in energy and material required (though not to a scale unfeasible on cosmic timelines). If large particle dust density were an order of magnitude larger, reaching outside the local group would become problematic without shielding methods.

What about the density of intragalactic dust? Given your technological assumptions, do you think it would be fairly straightforward to colonize most of the Milky Way from Earth?

Read more »

The world’s distribution of computation (initial findings)

 |   |  Analysis

supercomputer

What is the world’s current distribution of computation, and what will it be in the future?

This question is relevant to several issues in AGI safety strategy. To name just three examples:

  • If a large government or corporation wanted to quickly and massively upgrade its computing capacity so as to make a push for AGI or WBE, how quickly could it do so?
  • If a government thought that AGI or WBE posed a national security threat or global risk, how much computation could it restrict, how quickly?
  • How much extra computing is “immediately” available to a successful botnet or government, simply by running existing computers near 100% capacity rather than at current capacity?

To investigate these questions, MIRI recently contracted Vipul Naik to gather data on the world’s current distribution of computation, including current trends. This blog post summarizes our initial findings by briefly responding to a few questions. Naik’s complete research notes are available here (22 pages). This work is meant to provide a “quick and dirty” launchpad for future, more thorough research into the topic.

Read more »

Nik Weaver on Paradoxes of Rational Agency

 |   |  Conversations

Nik Weaver portraitNik Weaver is a professor of mathematics at Washington University in St. Louis. He did his graduate work at Harvard and Berkeley and received his Ph.D. in 1994. His main interests are functional analysis, quantization, and the foundations of mathematics. He is best known for his work on independence results in C*-algebra and his role in the recent solution of the Kadison-Singer problem. His most recent book is Forcing for Mathematicians.

Luke Muehlhauser: In Weaver (2013) you discuss some paradoxes of rational agency. Can you explain roughly what these “paradoxes” are, for someone who might not be all that familiar with provability logic?


Nik Weaver: Sure. First of all, these are “paradoxes” in the sense of being highly counterintuitive — they’re not outright contradictions.

They all relate to the basic Löbian difficulty that if you reason within a fixed axiomatic system S, and you know that some statement A is provable within S, you’re generally not able to deduce that A is true. This may be an inference that you and I would be willing to make, but if you try to build it into a formal system then the system becomes inconsistent. So, for a rational agent who reasons within a specified axiomatic system, knowing that a proof exists is not as good as actually having a proof.

This leads to some very frustrating consequences. Let’s say I want to build a spaceship, but first I need to be sure that it’s not going to blow up. I have an idea about how to prove this, but it’s extremely tedious, so I write a program to work out the details of the proof and verify that it’s correct. The good news is that when I run the program, it informs me that it was able to fill in the details successfully. The bad news is that I now know that there is a proof that the ship won’t blow up, but I still don’t know that the ship won’t blow up! I’m going to have to check the proof myself, line by line. It’s a complete waste of time, because I know that the program functions correctly (we can assume I’ve proven this), so I know that the line by line verification is going to check out, but I still have to do it.

You may say that I have been programmed badly. Whoever wrote my source code ought to have allowed me to accept “there is a proof that the ship won’t blow up” as sufficient justification for building the ship. This can be a general rule: for any statement A, let “there exists a proof of A” license all the actions that A licenses. We’re not contradicting Löb’s theorem — we still can’t deduce A from knowing there is a proof of A — but we’re finessing it by stipulating that knowing there’s a proof of A is good enough. But there are still problems. Imagine that I can prove that if the Riemann hypothesis is true, then the ship won’t blow up, and if it’s false then there exists a proof that the ship won’t blow up. Then I’m in a situation where I know that either A is true or there is a proof that A is true, but I don’t know which one. So even with the more liberal licensing condition, I still can’t build my lovely spaceship.
Read more »

MIRI’s May 2014 Workshop

 |   |  News

december-2013-workshop-header-721px

From May 3–11, MIRI will host its 7th Workshop of Logic, Probability, and Reflection. This workshop will focus on decision theory and tiling agents.

The participants — all veterans of past workshops — are:

If you have a strong mathematics background and might like to attend a future workshop, apply today! Even if there are no upcoming workshops that fit your schedule, please still apply, so that we can notify you of other workshops (long before they are announced publicly).

As featured in:     CQ   Forbes   NY Times   Scientific American   Technology Review