Benjamin Pierce on clean-slate security architectures

Benjamin C. Pierce is Henry Salvatori Professor of Computer and Information Science at the University of Pennsylvania and a Fellow of the ACM. His research interests include programming languages, type systems, language-based security, computer-assisted formal verification, differential privacy, and synchronization technologies. He is the author of the widely used graduate textbooks Types and Programming Languages and Software Foundations.

He has served as co-Editor-in-Chief of the Journal of Functional Programming, as Managing Editor for Logical Methods in Computer Science, and as an editorial board member of Mathematical Structures in Computer Science, Formal Aspects of Computing, and ACM Transactions on Programming Languages and Systems. He is also the lead designer of the popular Unison file synchronizer.

Luke Muehlhauser: I previously interviewed Greg Morrisett about the SAFE project, and about computer security in general. You’ve also contributed to the SAFE project and gave an “early retrospective” talk on it (slides), and I’d like to ask you some more detailed questions about it.

In particular, I’d like to ask about the “verified information-flow architecture” developed for SAFE. Can you give us an overview of the kinds of information flow security properties you were able to prove about the system?


Benjamin C. Pierce: Sure. First, to remind your readers: SAFE is a clean-slate design of a new hardware / software stack whose goal is to build a network host that is highly resilient to cyber-attack. One pillar of the design is pervasive mechanisms for tracking information flow. The SAFE hardware offers fine-grained tagging and efficient propagation and combination of tags on each instruction dispatch. The operating system virtualizes these generic facilities to provide an “information-flow abstract machine,” on which user programs run.
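
To give a flavor of what fine-grained tag propagation means, you can picture something like the following much-simplified sketch in Haskell. It is purely illustrative: the two-point Public/Secret lattice and the names are my own, not the actual SAFE tag or rule encoding.

    data Tag = Public | Secret            -- a two-point information-flow lattice
      deriving (Eq, Ord, Show)

    -- The join (least upper bound) of two tags: Secret "wins".
    joinTag :: Tag -> Tag -> Tag
    joinTag = max

    -- A machine word together with its tag.
    data Atom = Atom { value :: Int, tag :: Tag }
      deriving Show

    -- The rule consulted when an ADD is dispatched: information flows from
    -- both operands into the result, so the result tag is their join.
    addRule :: Atom -> Atom -> Atom
    addRule (Atom v1 t1) (Atom v2 t2) = Atom (v1 + v2) (joinTag t1 t2)

    main :: IO ()
    main = print (addRule (Atom 3 Public) (Atom 4 Secret))
    -- Atom {value = 7, tag = Secret}: the secret operand taints the result.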

Formal verification has been part of the SAFE design process right from the beginning. We’d originally hoped to be able to verify the actual running code of the OS in the style of seL4, but we found that the codebase was too big and moving too fast for this to be practical with a small verification team. Instead, we’ve developed a methodology that combines full formal verification of models of the system’s key features with “property-based random testing” (à la QuickCheck) of richer subsets of the system’s functionality.

Our most interesting formal proof so far shows that the key security property of the information-flow abstract machine — the fact that a program’s secret inputs cannot influence its public outputs — is correctly preserved by our implementation on (a simplified version of) the SAFE hardware. This is interesting because the behavior of the abstract machine is achieved by a fairly intricate interplay between a hardware “rule cache” and a software layer that fills the cache as needed by consulting a symbolic representation of the current security policy. Since this mechanism lies at the very core of the SAFE architecture’s security guarantees, we wanted to be completely sure it is correct (and it is!).
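
The shape of that noninterference property can be sketched roughly as follows. Again, this is only an illustration with made-up types, not the statement of the real theorem, which is about the abstract machine and its rule-cache implementation rather than this toy model.

    data Tag  = Public | Secret deriving (Eq, Show)
    data Atom = Atom { value :: Int, tag :: Tag } deriving (Eq, Show)

    -- What an observer without clearance can see: the public values only.
    publicView :: [Atom] -> [Maybe Int]
    publicView = map (\a -> if tag a == Public then Just (value a) else Nothing)

    -- Noninterference for a machine modelled as a function from inputs to
    -- outputs: inputs that look the same publicly must produce outputs that
    -- look the same publicly.
    noninterferent :: ([Atom] -> [Atom]) -> [Atom] -> [Atom] -> Bool
    noninterferent run in1 in2 =
      publicView in1 /= publicView in2
        || publicView (run in1) == publicView (run in2)

    -- A deliberately leaky "machine" that copies a secret to a public output
    -- violates the property.
    main :: IO ()
    main = do
      let leak inputs = [Atom (value (head inputs)) Public]
      print (noninterferent leak [Atom 0 Secret] [Atom 1 Secret])   -- False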


Luke: You mention that since you aren’t able to formally verify the entire OS with your available resources, you’re instead formally verifying models of the system’s key features and also using property-based random testing of some of the rest of the system’s functionality. What level of subjective confidence (betting odds) does that approach give you for the security of the system, compared to what you could get if you had the resources to successfully verify the entire OS? (A rough “gut guess” is acceptable!)


Benjamin: Let’s say that an unverified but well-engineered and road-tested system gets a 1 on the security scale and a fully verified OS gets an 8. And let’s assume that we’re talking about a finished and polished version of SAFE (recognizing that what we have now is an incomplete prototype), together with correctness proofs for models of core mechanisms and random testing of some larger components. My gut would put this system at maybe a 4 on this scale. The proofs we’ve done are limited in scope, but they’ve been extremely useful in deepening our understanding of the core design and guiding us to a clean implementation.

Why is the top of my scale 8 and not 10? Because even a full-blown machine-checked proof of correctness doesn’t mean that your OS is truly invulnerable. Any proof is going to be based on assumptions — e.g., that the hardware behaves correctly, which might not be the case if the attacker has a hair dryer and physical access to the CPU! So, to get higher than 8 requires some additional assurance that the degree of potential damage is limited even if your formal assumptions are broken.

This observation suggests a defense-in-depth strategy. Our goal in designing the SAFE software has been to make the “omniprivileged” part of the code (i.e., the part that could cause the machine to do absolutely anything, if the attacker could somehow get it to misbehave) as small as possible — vastly smaller than conventional operating systems and even significantly smaller than microkernels. The code that checks security policy and fills the hardware rule cache is highly privileged, as is the garbage collector in our current design, but we believe that the rest of the OS can be written as a set of “compartments” with more limited privilege, using the tagging hardware to enforce separation between components. (Demonstrating this claim is ongoing work.)
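
To give the rough idea of tag-enforced compartments, here is a generic illustration (my own toy sketch, not the actual SAFE OS design): every piece of data is tagged with the compartment that owns it, and a check on each access keeps other compartments out.

    newtype Compartment = Compartment Int deriving (Eq, Show)

    -- A memory cell tagged with the compartment that owns it.
    data Cell = Cell { owner :: Compartment, contents :: Int } deriving Show

    -- A reference-monitor check of the sort the tagging hardware can enforce
    -- on each access: code in one compartment cannot read another's data.
    load :: Compartment -> Cell -> Either String Int
    load current cell
      | owner cell == current = Right (contents cell)
      | otherwise             = Left "access violation: wrong compartment"

    main :: IO ()
    main = do
      let secretCell = Cell (Compartment 1) 42
      print (load (Compartment 1) secretCell)   -- Right 42
      print (load (Compartment 2) secretCell)   -- Left "access violation: ..."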


Luke: I’ve spoken to some people who are skeptical that formal verification adds much of a security boost. Their argument is that most security problems come from not understanding the vulnerabilities well enough to capture them in a system’s formal requirements in the first place, rather than from a system failing to match its design specification in ways that formal verification could have caught. Yet you seem to rank a verified system as far more secure than a non-verified system. What’s your perspective on this?


Benjamin: Yes, other things being equal I would put quite a bit of money on a verified system being more secure than an unverified one — not because the verification process eliminates the possibility of vulnerabilities that were not considered in the specification, but because the process of writing the formal specification requires someone to sit down and think carefully about at least some of the (security and other) properties that the system is trying to achieve. A clear understanding of at least some of the issues is a lot better than nothing!

Actually carrying out a proof that the system matches the specification adds significant further value, because specifications themselves are complex artifacts that require debugging just as much as the system being specified does! But property-based random testing — i.e., checking that the system satisfies its specification for a large number of well-distributed random inputs — can be a very effective intermediate step, giving a large part of the bang for relatively small buck.
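
As a tiny, generic illustration of the technique (nothing to do with SAFE’s actual test harness), here is what property-based random testing looks like with QuickCheck: we write down part of a specification and let the tool try it on many random inputs.

    import Data.List (insert, sort)
    import Test.QuickCheck (quickCheck)

    -- The "system" under test: a toy insertion sort.
    insertionSort :: [Int] -> [Int]
    insertionSort = foldr insert []

    -- Part of its specification: it agrees with the reference sort.
    prop_matchesSpec :: [Int] -> Bool
    prop_matchesSpec xs = insertionSort xs == sort xs

    main :: IO ()
    main = quickCheck prop_matchesSpec   -- "+++ OK, passed 100 tests."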


Luke: In general, do you think that highly complex systems for which “high assurance” of safety or security (as relevant) is required need to be designed “from the ground up” for safety or security in order to obtain such high assurance, or are there many cases in which a highly complex but high-assurance system can be successfully developed by optimizing development for general performance and then “patching” the system for security and safety?

And, do you have any guesses as to what the most common reply to this question might be in your own subfield(s) and other adjacent subfields?


Benjamin: Of course it depends a bit on what you mean by “high assurance”! Quite a few people define the term so that it only applies to software developed using methods that consider safety and/or security from day 1. But even if we take a broader definition of HA — “offering very good resistance to malicious penetration and catastrophic failure,” or some such — my feeling is that this is extremely difficult to retrofit.

I personally saw this effect very clearly while building Unison, a cross-platform file synchronization tool that I built with Trevor Jim and Jérôme Vouillon a few years ago. I had built several synchronizers before Unison, but I kept discovering edge cases where the behavior seemed wrong, so we began the Unison design by trying to write down a very precise (though dramatically simplified) specification of its core behavior.

Beyond helping us understand edge cases, this exercise had a huge effect on the way we wrote the code. For example, the specification says that, at the end of a run of the synchronizer, every part of the filesystems being synchronized should be either synchronized (brought up to date with the other copy) or unchanged from when the synchronizer started running (for example, if there was a conflict that the user chose not to resolve). But since the synchronizer can terminate in all sorts of ways (finishing normally, getting killed by the user, the network connection getting dropped, one machine getting rebooted, …), this requirement means that the filesystems must be in one of these two states at every moment during execution.

This is actually quite hard to achieve: many parts of the code have to do extra work to buffer things in temporary locations until they can be moved atomically into place, use two-phase commits to make sure internal data structures don’t get corrupted, and so on, and the reasons why the extra work is needed are often quite subtle. Having the spec beforehand forced us to do this work as we went. Going back later and rediscovering all the places where we should have paid extra attention would have been well nigh impossible.
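
The simplest instance of that extra work is the familiar “stage in a temporary file, then rename into place” pattern. Here is a bare-bones sketch of the idea (not Unison’s actual code; the temp-file name is made up):

    import System.Directory (renameFile)

    -- Write the new contents next to the target, then move them into place in
    -- one step. A crash before the rename leaves the target untouched; after
    -- it, the target is fully updated. (rename is atomic on POSIX filesystems
    -- when source and destination live on the same filesystem.)
    updateAtomically :: FilePath -> String -> IO ()
    updateAtomically target newContents = do
      let tmp = target ++ ".unison.tmp"   -- hypothetical naming scheme
      writeFile tmp newContents           -- a crash here leaves target unchanged
      renameFile tmp target

    main :: IO ()
    main = updateAtomically "example.txt" "new contents\n"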

As for what others in my research area might say to your question, I decided to poll some friends instead of just assuming. As expected, most of the comments I got back were along the lines of “Making a complex system really secure after the fact is basically impossible.” But a few people did propose systems that might count as examples, such as SELinux and Apache. (Actually, SELinux is an interesting case. On one hand, its goal was only to add mandatory access control to Linux; it explicitly didn’t even attempt to increase the assurance of Linux itself. So for attacks that target kernel vulnerabilities, SELinux is no better than vanilla Linux. On the other hand, I’m told that, in certain situations — in particular, for securing web applications and other network-facing code against the sorts of attacks that come over HTTP requests — SELinux can provide very strong protection.)


Luke: Thanks, Benjamin!