Kathleen Fisher on High-Assurance Systems

 |   |  Conversations

Kathleen Fisher portraitDr. Kathleen Fisher joined DARPA as a program manager in 2011. Her research and development interests relate to programming languages and high assurance systems. Dr. Fisher joined DARPA from Tufts University. Previously, she worked as a Principal Member of the technical staff at AT&T Labs. Dr. Fisher received her Doctor of Philosophy in computer science and her Bachelor of Science in math and computational science from Stanford University.

Luke Muehlhauser: Kathleen, you’re the program manager at DARPA for HACMS program, which aims to construct cyber-physical systems which satisfy “appropriate safety and security properties” using a “clean-slate, formal methods-based approach.” My first question is similar to one I asked Greg Morrisett about the SAFE program, which aims for a “clean slate design for resilient and secure systems”: In the case of HACMS, why was it so important to take the “clean slate” approach, and design the system from the ground up for safety and security (along with functional correctness)?

Kathleen Fisher: Researchers have been trying to prove programs correct for decades, with very little success until recently (successful examples include NICTA’s seL4 microKernel and INRIA’s compCert verifying C compiler). One of the lessons learned in this process is that verifying existing code is much harder than verifying code written with verification in mind.

There are a couple of reasons for this. First, many of the invariants that have to hold for a program to be correct often exist only in the head of the programmer. When trying to verify a program after the fact, these invariants have been lost and can take a very long time for the verifier to rediscover. Second, sometimes the code can be written in multiple ways, some of which are much easier to verify than others. If programmers know they have to verify the code, they’ll choose the way that is easy to verify.

Luke: Harper (2000) quotes an old message by Ken Frith from the Safety Critical Mailing List:

The thought of having to apply formal proofs to intelligent systems leaves me cold. How do you provide satisfactory assurance for something that has the ability to change itself during a continuous learning process?

Obviously we can’t write a formal specification of the entire universe and formally verify a system against it, so what kinds of (formal) things can we do to ensure desirable behavior from increasingly capable autonomous systems?

Kathleen: Verifying adaptive or intelligent systems is a difficult challenge. It is the subject of active research, particularly in the avionics industry. One approach is to use a Simplex Architecture, in which the intelligent system is monitored to ensure it stays within a safety envelop. If the intelligent system looks likely to leave the envelop, a monitor diverts control to a basic system. In this approach, the monitor, the switching code, and the basic system have to be verified directly, but not the intelligent system. Verifying these smaller pieces is within the reach of current technology.

Luke: You said earlier that “verifying existing code is much harder than verifying code written with verification in mind.” This suggests that when a very high level of system safety is needed, e.g. for autopilot software, a great deal of extra work is required, because one must essentially start from scratch to design a system “from the ground up” with safety and verification in mind.

Moreover, I have the impression that the work required to design a verifiably safe intelligent system tends to be far more “serial” in nature than the work required to design an intelligent system without such a high degree of verifiable safety. When designing an intelligent system that doesn’t need to be particularly “safe”, e.g. a system for intelligently choosing which ads to show website visitors, one can cobble together bits of code from here and there and test the result on some data sets and then run the system “in the real world.” But a verifiably safe system needs to be built in a very precise way, with many algorithm design choices depending on previous choices, and with many research insights dependent on the shape of previous insights, etc. In this way, it seems that the project of designing a not-verifiably-safe intelligent system parallelizes more easily than the project of designing a verifiably safe intelligent system.

Is this also your impression? How would you modify my account?

Kathleen: Your assessment is accurate. Basically, if your code has a wide range of acceptable behaviors and if there aren’t particularly bad consequences if it misbehaves, it is a lot easier to write code.

I’d be careful with the term parallelize. It is usually used to refer to programs where parts run at the same time as other parts, say on multiple computers or chips. It can also refer to programmers themselves working in parallel. The way you are using it, it reads like you mean the code is running in parallel when I think you mean to say the code is written in parallel. Both are easier for non-high-assurance code, but the argument you are making lends itself to the “written in parallel” interpretation.

Luke: What qualitative or quantitative trends do you observe or anticipate with regard to verifiably safe cyber-physical systems? For example: Do you think research on the capability and intelligence of cyber-physical systems is outpacing research on their safety and verifiability? How much funding and quality-adjusted researcher hours are going into one (capability and intelligence) relative to the other (safety and verifiability), with respect to cyber-physical systems specifically? How much of the low-hanging fruit seems to have been plucked with respect to each, such that additional units of progress require more funding and researcher time than previous units?

Kathleen: In general, research into capabilities outpaces the corresponding research into how to make those capabilities secure. The question of security for a given capability isn’t interesting until that capability has been shown to be possible, so initially researchers and inventors are naturally more focused on the new capability rather than on its associated security. Consequently, security often has to catch up once a new capability has been invented and shown to be useful.

In addition, by definition, new capabilities add interesting and useful new capabilities, which often increase productivity, quality of life, or profits. Security adds nothing beyond ensuring something works the way it is supposed to, so it is a cost center rather than a profit center, which tends to suppress investment.

I don’t know how one would go about collecting the data to show levels of investment in research for capabilities vs. security. I’m not sure the dollar totals would be that revealing. Research in security could still be lagging research in capabilities even if more money were being spent on security, if research in security were more costly than research into capabilities.

Luke: It is one challenge to ensure the safe behavior of a system with very limited behaviors and very limited decision-making capacity, and quite another to ensure the safe behavior of a system with much greater autonomous capability. What do you think are the prospects for ensuring the safe behavior of cyber-physical systems as they are endowed with sufficient intelligence and autonomy to e.g. replace factory workers, replace taxi drivers, replace battlefield troops, etc. over the next several decades?

Kathleen: As you say, it is much easier to ensure the safe and secure behavior of simple systems with well-defined behavior. That said, it is clear that we will be fielding more and more complex cyber-physical systems for all kinds of activities and we will need to be able to ensure that those systems are sufficiently safe and secure. I think a combination of techniques will enable us to reach this goal, although substantial additional research is needed. These techniques include: 1) model-based design, 2) program synthesis, 3) security- and safety-aware composition, and 4) simplex-based architectures. Basically, we’ll need to start with simple components and simple architectures, and then leverage them to build more and more complex systems.

Luke: Thanks!