Robert Constable on correct-by-construction programming

 |   |  Conversations

Robert L. Constable heads the Nuprl research group in automated reasoning and software verification, and joined the Cornell faculty in 1968. He has supervised over forty PhD students in computer science, including the very first graduate of the CS department. He is known for his work connecting programs and mathematical proofs, which has led to new ways of automating the production of reliable software. He has written three books on this topic as well as numerous research articles. Professor Constable is a graduate of Princeton University where he worked with Alonzo Church, one of the pioneers of computer science.

Luke Muehlhauser: In some of your work, e.g. Bickford & Constable (2008), you discuss “correct-by-construction” and “secure-by-construction” methods for designing programs. Could you explain what such methods are, and why they are used?

Robert Constable:

Short Summary

The history of programming languages shows that types are valuable because compilers can check programs for type correctness and inform programmers whether or not their code exactly matches its type specification. Types describe requirements on programming tasks, and very rich type systems of the kind used in mathematics and increasingly in computing describe certain technical tasks completely as we’ll see below. The more expressive the type system, the more specifications help programmers understand their goals and the programs that achieve them. System designers can specify many requirements and computing tasks precisely with types and organize them into modules (or objects) that become a blueprint for analyzing and building a computing system.

When types are rich enough to completely specify a task, the methodology we just outlined becomes an example of correct by construction programming. This approach has been investigated for mathematical problems for decades and for programming problems since the late 60s and early 70s. For example, if we want to implement a correct by construction factorization based on the fundamental theorem of arithmetic (FTA), that every natural number greater than 1 can be factored uniquely into a product of primes, we can first express this task as a programming problem using types. The problem is then solved by coding a computable function factor with the appropriate type. For programmers the type is probably just as clear as the FTA theorem. It is given below. It requires taking a natural number n greater than 1 as input and returning a list of pairs of numbers (p,e) where p is a prime and e is its exponent. We write the list as [(p1,e1), (p2,e2), …, (pn,en)] and require that n =p1e1 x … x pnen where pe means prime p to the power e, and the ei are all natural numbers, and x is multiplication (and is overloaded as the Cartesian product of two types). In addition the full FTA theorem requires that the factorization is unique. We discuss uniqueness later. For the factorization part, consider the example factor(24,750) = [(2,1),(3,2),(5,3),(11,1)]. This is correct since 24,750 = 2 x 9 x 125 x 11, where 53 is 125. In a rich type system the factorization piece of the fundamental theorem of arithmetic can be described completely as a programming task and the resulting program, factor, can be seen as a constructive proof accomplishing that factorization. A type checker verifies that the program has the correct type. The program factor is correct by construction because it is known that it has the type of the problem it was coded to solve.


Dependent Types

This paragraph is a bit technical and can be skipped – or used as a guide for coding factorization in a functional language with dependent types. Here is roughly how the above factorization task is described in the type systems of the programming languages of the Agda, Coq and Nuprl proof assistants – a bit more on them later. In English the type specification requires Nat, the type of the natural numbers, 0,1,2… and says this: given a natural number n larger than 1, compute a list of pairs of numbers, (p,e), where p is a prime number, and e is a natural number such that the input value n can be written as a product of all the prime numbers p on the list, each raised to the power e (exponent e) that is paired with it in the list lst. Symbolically this can be written as follows were N++ is the type (n:Nat where n>1). We also define the type Prime as (n:Nat where prime(n)) for a Boolean valued expression prime(n) which we code as a function from Nat to Booleans. Here is the FTA factorization type.

(n:N++) ->( lst:List[Prime x Nat] where n = prod(lst) ).

Note that the symbol x in (Prime x Nat) denotes the type constructor for ordered pairs of numbers. Its elements are ordered pairs (p,e) of the right types, p a Prime and e a Nat. Of course we need to define the product function, prod(lst), over a list lst of pairs in the right way. This is simple using functional programming in languages such as Agda, Coq, and Nuprl, in which we have the function type used above. We can assign this kind of programming problem to first year undergraduates, and they will solve it quite easily, especially if we give them a fast prime(n) procedure.

Another very nice way to write a compact solution is to use map and fold_left functions that are usually supplied in the library of list processing functions. That solution uses the well known map reduce paradigm taught in functional programming courses for years and widely used in industry. Solving with map and reduce is a good exercise since the reduce function fold_right is mathematical induction on lists defined as a specific very general recursive program.


Historical Perspective

The development of mainstream programming languages from Algol 60 to Java and OCaml shows that type systems have become progressively more expressive. For example, now it is standard that functions can be treated as objects, e.g. passed as inputs, produced as outputs, and created as explicit values as we did in the FTA example above. Type checking and type inference algorithms have kept up with this trend, although OCaml can’t yet check the type we created for FTA, Agda, Coq, Nuprl and other programming languages supported by proof assistants can (they all use some form of automated reasoning to help). Static type checking of programs defined using very rich types is a demonstrably effective technology for precisely specifying tasks, documenting code as it’s being written, rendering code more readable by others (including yourself when you revisit it several months later), finding errors, and for safely making changes. With higher-order types such as function types, more programming tasks can be precisely defined, and code can be generated that is completely correct by construction.

By 1984 researchers created an extreme point in this type checking technology in which the type systems are so rich that they can describe essentially any mathematical programming task in complete “formal” detail. Types also express assumptions on the data and the computing environment on which the specifications depend. These extremely rich type systems can essentially express any problem in mathematics. Indeed, Coq has been used to formally prove the Four Color Theorem as well as other important theorems in mathematics. The type checking by proof assistants often requires human intervention as well as sophisticated reasoning algorithms. The type systems we have mentioned are strong enough to also describe the uniqueness of the FTA factorization. They can do this by converting any factorization into the standard one produced by the function factor defined earlier. That function might be called normalize, and it does the job of converting any factorization into the standard one in which the primes come in order and are raised to the highest power possible.

Among the best known modern proof assistants are Agda, Coq, HOL, Isabelle, MetaPRL, Minlog, Nuprl, and PVS. Nuprl was the first of these, built in 1984, and the Coq assistant is written in OCaml and supported by INRIA through the French government. Coq is widely taught and used for programming. These proof assistants help programmers check extremely rich specifications using automated reasoning methods. The types are rich enough to guarantee that programs having the type are correct for the task specified. This is a dominant form of correct by construction programming and sometimes includes the technology of extracting a program from a proof that a specification is achievable; this is called using proofs as programs.


Research Goal

An overarching research goal is to design type systems and programming assistants to express any computational problem and provide mechanisms for type checking, perhaps with human aid and using automated reasoning tools, no matter how expressive the type system becomes.

Researchers in programming languages, formal methods, and logic are working on this goal worldwide. There are many references in the research literature citing progress toward this goal. The recent book by Adam Chlipala Certified Programming with Dependent Types, from MIT Press, 2013, provides an up to date view of the subject focused on the widely used Coq proof assistant. The 1991 book by Simon Thompson, Type Theory and Functional Programming, from Addison-Wesley, is one of the early comprehensive text books on the subject as it was just gaining momentum. Nuprl is the oldest proof assistant of this kind that is still active since 1984. It was described in the 1986 book Implementing Mathematics with The Nuprl Proof Development System. The Nuprl literature and an overview of the area circa 2006 can be found in the technical article “Innovations in computational type theory using Nuprl” from the Journal of Applied Logic, 4, 428-469, 2006. Also the author’s 2010 article on the “The Triumph of Types” at the 2010 celebration for Principia Mathematica in Cambridge, England recounts the 100 year old story of type theory and connects that to current research on type systems for correct by construction programming. It is available at

When a type system can specify security properties, the checked programs are secure by construction. In both correct and secure by construction programming, the specifications include assumptions about the computing environment, say about the network topology. If these assumptions fail to hold, say because the network topology changed in an unexpected way, then the type specifications are no longer sufficient to guarantee correctness. So it is very important to document the assumptions on which the type specification depends.

For many problems these assumptions are stable, and in those cases, this correctness methodology is extremely effective. There are very good examples in the literature; at the web site we recently posted a simple example for the purpose of illustrating the method to newcomers. It is a complete formal proof that allowed us to build a correct by construction program to solve the “maximum segment sum” problem that is used to teach formal program development starting from type specifications.


Recent Work

The author and his colleagues at Cornell have recently used Nuprl to create a correct by construction version of the Multi-Paxos distributed consensus protocol that is being used in a database system. This required an implemented theory of distributed computing which is steadily evolving. We have made this protocol attack-tolerant, which is a form of security. Consensus protocols are critical to Cloud computing, and industry works very hard to build them correctly and make them secure as they are revised and improved. Researchers have created hundreds of correct by construction programs since 1984 and secure by construction programs as well. Worldwide many more are being built.

Modern proof assistants are steadily becoming more powerful and effective at correct and secure by construction programming because there is a research imperative and an economic incentive to use them. There is also a strong incentive to teach them, as the work on Software Foundations by Benjamin Pierce and his colleagues at the University of Pennsylvania demonstrates. Proof assistants are an addictive technology partly because they improve themselves the more they are used which steadily increases their appeal and value.

Luke: What do you think are the prospects for applying correct-by-construction and secure-by-construction approaches to methods commonly labeled “artificial intelligence”?

Robert: The proof assistants such as Agda, Coq, and Nuprl that support correct by construction programming are themselves examples of artificial intelligence in the sense that they use automated reasoning tools developed in AI. The earliest proof assistants such as the Boyer-Moore prover came out of the AI Department at Edinburgh University. Also one of the seminal books in the field, The Computer Modeling of Mathematical Reasoning, by Alan Bundy at Edinburgh was a landmark in AI, published in 1983 by Academic Press.

In due course these proof assistants will use other AI tools as well such as those that can translate formal proofs into natural language proofs. There has already been interesting progress on this front. For instance the article “Verbalization of High-Level Formal Proofs” by Holland-Minkley, Barzilay and me in the 16th National Conference on AI in 1999, 277 – 284 translates Nuprl proofs in number theory into natural language.

It is also the case that these systems are extended using each other. For example, right now my colleagues Vincent Rahli and Abhishek Anand are using the Coq prover to check that new rules they want to add to Nuprl are correct with respect to the semantic model of Nuprl’s constructive type theory that they painstakingly formalized in Coq. They have also checked certain rules using Agda. This activity is precisely what you asked about, and the answer is a clear yes.

When it comes to looking at machine learning algorithms on the other hand, the criteria for their success is more empirical. Do they actually give good performance? Machine learning algorithms were used by professors Regina Barzilay (MIT) and Lillian Lee (Cornell) to improve the performance of machine translation from Nuprl mathematics to natural language in their 2002 article on “bootstrapping lexical choice.” For this kind of work, it does not necessarily make sense to use correct by construction programming. On the other hand, one can imagine a scenario where a correct translation of the mathematics might be critical. In that case, we would probably not try to prove properties of the machine learning algorithm, but we would instead try to capture the meaning of the natural language version and compare that to the original mathematics.

The work of “understanding natural language mathematics” would benefit from correct by construction programming. Already there is excellent work being done to use constructive type theory to provide a semantic basis for natural language understanding. The book by Aarne Ranta, Type-theoretical grammar, Oxford, 1994 is an excellent example of this kind of work. It also happens to be one of the best introductions to constructive type theory.

Eventually machine learning algorithms are going to have a major impact on the effectiveness of proof assistants. That is because we can improve the ability of proof assistants to work on various subtasks on their own by teaching them how the expert humans handle these tasks. This kind of work will be an excellent example of both aspects of AI working together. Machine learning guides the machine to a proof, but the automated reasoning tools check that it is a correct proof.

Luke: What developments do you foresee, during the next 10-20 years, in proof assistants and related tools?

Robert: I’ve been working on proof assistants, programming logics, and constructive type theory for about forty years, and part of the job is to make predictions five to seven years out and try to attract funding to attain a feasible goal that will advance the field. To maintain credibility, you need to be right enough of the time. I’ve been more right than wrong on eight or so of these predictions. So I am confident about predicting five to seven years out, and I have ideas and hopes about ten years out, and then I have beliefs about the long term.


The Short Term

Over the next five years we will see more striking examples of correct by construction programming in building significant software and more examples of verifying software systems as in the work on the seL4 kernel, the SAFE software stack, features of Intel hardware, and important protocols for distributed computing, especially for cloud computing. This will be done because it is becoming cost effective to do it. I don’t see a major industrial effort even for the next ten years. This area will be a focus of continued academic and industrial research, and it will have incremental effects on certain key systems. However, it will remain too expensive for wide spread deployment in industry.

We will see formal tools being used to make both offensive and defensive weapons of cyber warfare more effective. One of the key lessons driving this is the fact that the stuxnet weapon, one of the best and most expensive ever created, was apparently defeated because it had a bug that led to its discovery. Another lesson is that the offensive weapons are improving all the time, and the defense must be more agile, not only building secure by construction “fortress like” systems, but thinking about systems as living responsive entities that learn how to adapt, recover, and maintain their core functionality.

In the realm of warfare, cost has rarely been the decisive factor, and formal methods tools are not outrageously costly compared to other required costs of maintaining a superior military. In this realm, unlike in building and maintaining ships, planes, and rockets and defenses against them, the cost lies in training and recruiting people. These people will have valuable skills for the private sector as well, and that sector will remain starved for well educated talent.

So overall, investing in advanced formal methods technology will be seen as cost effective. Proof assistants are the best tools we have in this sector, and we know how to make them significantly better by investing more. These tools include Agda, Coq, Nuprl, Minlog, and MetaPRL on the synthesis side, and ACL2, Isabelle-HOL, HOL-Light, HOL, and PVS on the verification side. All of them are good, each has important unique features. All have made significant contributions. All of them have ambitions for becoming more effective at what they do best. They will attract more funding, and new systems will emerge as well.


Medium Term

Most of the current proof assistants are also being used in education. It seems clear to me that the proof assistants that are well integrated with programming languages, such as Coq with OCaml and ACL2 with Lisp will prosper from being used in programming courses. Other proof assistants will move in this direction as well, surely Nuprl will do that. What happens in education will depend on the forces that bring programming languages in and out of fashion in universities. It seems clear to me that the Coq/OCaml system will have a chance of being widely adopted in university education, at least in Europe and the United States. It has already had a major impact in the Ivy League – Cornell, Harvard, Princeton, UPenn, and Yale are all invested as is MIT. I predict that within five years we will see this kind of programming education among the MOOCs, and we will see it move into select colleges and into top high schools as well.

All of the proof assistants have some special area of expertise. If there is sufficient funding, we will see them capitalize on this and advance their systems in various directions. Nuprl has been very effective in building correct by construction distributed protocols that are also attack tolerant. It is likely that this work will continue and result in building communication primitives into the programming language to match the theory of events already well developed formally. Coq has been used to build a verified C compiler, at UPenn and Harvard Coq is being used to verify the Breeze compiler for new hardware designed to support security features, and at MIT they are using Coq to verify more of the software stack with the Bedrock system. There are other applications of Coq too numerous to mention. We will see additional efforts to formally prove important theorems in mathematics, but it’s hard to predict which ones and by which systems. Current effortsto use proof assistants in homotopy theory will continue and will produce important mathematical insights. Proof assistants will also support the effort to build more certified algorithms along the lines of Kurt Mehlhorn’s impressive work.


Long Term

People who first encounter a proof assistant operated by a world class expert is likely to be stunned and shocked. This human/machine partnership can do things that seem impossible. They can solve certain kinds of mathematics problems completely precisely in real time, leaving a readable formal proof in Latex as the record of an hour’s worth of work. Whole books are being written this way as witnessed by the Software Foundations project at UPenn mentioned previously. I predict that we will see more books of readable formal mathematics and computing theory produced this way. Around them will emerge a technology for facilitating this writing and its inclusion in widely distributed educational materials. It might be possible to use proof assistants in grading homework from MOOCs. Moreover, the use of proof assistants will sooner or later reach the high schools.

I think we will also see new kinds of remote research collaborations mediated by proof assistants that are shared over the internet. We will see joint research teams around the world attacking certain kinds of theoretical problems and building whole software systems that are correct by construction. We already see glimpses of this in the Coq, Agda, and Nuprl community where the proof assistants share very similar and compatible type theories.

There could be a larger force at work for the very long term, a force of nature that has encoded mechanisms into the human gene pool that spread and preserve information, the information that defines our species. We might not be Homo sapiens after all, but Homo informatis. The wise bits have not always been so manifest, but the information bits advance just fine – at least so far. That is the aspect of our humanness for which we built the proof assistant partners. They are part of an ever expanding information eco system. There is a chance that proof assistants in a further evolved form will be seen by nature as part of us.

Luke: Thanks, Bob!