Josef Urban on Machine Learning and Automated Reasoning
Josef Urban is a postdoctoral researcher at the Institute for Computing and Information Sciences of the Radboud University in Nijmegen, the Netherlands. His main interest is the development of combined inductive and deductive AI methods over large formal (fully semantically specified) knowledge bases, such as large corpora of formally stated mathematical definitions, theorems, and proofs. This involves deductive AI fields such as Automated Theorem Proving (ATP) and inductive AI fields such as Machine Learning. In 2003, he made the largest corpus of formal mathematics – the Mizar library – available to ATP and AI methods, and since then has been developing the first approaches, systems and benchmarks for AI and automated reasoning over such corpora. In 2013, his Machine Learner for Automated Reasoning (MaLARea) solved, by a large margin, the most problems in the mathematically oriented Large Theory division of the CADE ATP Systems Competition (CASC). The system is based on several positive-feedback loops between deducing new proofs and learning how to guide the reasoning process based on successful proofs. Such AI methods are also being deployed as systems that assist interactive theorem proving and formal verification in systems like Mizar, Isabelle and HOL Light.
He received his PhD in Computer Science from Charles University in Prague in 2004, and an MSc in Mathematics from the same university in 1998. In 2003, he co-founded the Prague Automated Reasoning Group, and from 2006 to 2008 he was a Marie Curie fellow at the University of Miami.
Luke Muehlhauser: In Urban & Vyskocil (2013) and other articles, you write about the state of the art in both the formalization of mathematics and automated theorem proving. Before I ask about machine learning in this context, could you quickly sum up for us what we can and can’t do at this stage, when it comes to automated theorem proving?
Josef Urban: The best-known result is the proof of the Robbins conjecture found automatically by Bill McCune’s system EQP in 1996. Automated theorem proving (ATP) can be used today for solving some open equational algebraic problems, for example in quasigroup and loop theory. Proofs of such open problems may be thousands of inferences long, and quite different from the proofs that mathematicians produce. Solving such problems with ATP is often not a “push-button” matter. It may require a specific problem reformulation (usually involving only a few axioms) suitable for ATP, exploration of suitable parameters for the ATP search, and sometimes also guiding the final proof attempt by lemmas (“hints”) that were useful for solving simpler related problems.
ATP is today also used for proving small lemmas in large formal libraries developed with interactive theorem provers (ITPs) such as Mizar, Isabelle and HOL Light. This is called “large-theory” ATP. In the past ten years, we have found that even with the large mathematical libraries (thousands of definitions and theorems) available in these ITPs, it is often possible to automatically select the relevant previous knowledge from such libraries and find proofs of small lemmas automatically without any manual pre-processing. Such proofs are usually much shorter than the long proofs of the equational algebraic problems, but even that already helps the people who use ITPs to formally prove more advanced theorems a great deal.
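To give a flavor of the “select the relevant previous knowledge” step, here is a minimal sketch of feature-based premise selection: each lemma is represented by the set of symbols it mentions, and candidate premises are ranked by weighted feature overlap with the conjecture, with rarer symbols weighted higher. The lemma names and feature sets below are hypothetical toy data, and the scoring is a deliberately simplified stand-in for the k-NN and naive-Bayes rankers actually used in large-theory ATP tools, not their real implementation.

```python
from collections import Counter
import math

# Toy "library": lemma name -> set of symbol features (hypothetical data,
# standing in for a formal library of thousands of theorems).
library = {
    "add_comm":  {"+", "=", "nat"},
    "mul_comm":  {"*", "=", "nat"},
    "add_assoc": {"+", "=", "nat"},
    "le_trans":  {"<=", "nat"},
}

def rank_premises(conjecture_features, library, k=2):
    """Rank library lemmas by weighted symbol overlap with the conjecture.

    Rare symbols get higher IDF-style weights, so sharing an unusual
    symbol counts for more than sharing a ubiquitous one. This is a
    simplified sketch of the idea behind premise-selection rankers,
    not any tool's actual algorithm.
    """
    n = len(library)
    df = Counter(f for feats in library.values() for f in feats)
    idf = {f: math.log(n / df[f]) for f in df}
    scores = {
        name: sum(idf[f] for f in feats & conjecture_features)
        for name, feats in library.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Conjecture mentioning "+", "=", and "nat": the addition lemmas rank first.
print(rank_premises({"+", "=", "nat"}, library))
```

The selected top-k lemmas would then be handed to an ATP system as axioms, keeping the search space small enough for the prover to find a proof.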
An illustrative example is the Flyspeck (“Formal Proof of Kepler”) project led by Tom Hales, who proved the Kepler conjecture in 1998. The proof is so involved that the only way to ensure its correctness is formal verification with an ITP. Tom Hales has recently written a 300-page book about the proof, with the remarkable property that most of the concepts, lemmas and proofs have a computer-understandable “Flyspeck” counterpart that is stated and verified in HOL Light. Hales estimates that this has taken about 20 man-years. Recently, we have shown that the proofs of about 47% of the 14,000 small Flyspeck lemmas could be found automatically by large-theory ATP methods. This means that the only required mental effort is to formally state these lemmas in HOL Light. Again, the caveat is that the lemmas have to be quite easy to prove from other lemmas that are already in the library. On the other hand, this means that once we have such a large corpus of formally stated everyday mathematics, it is possible to use the corpus together with the large-theory ATP methods to solve many simple formally stated mathematical problems.
Another interesting recent development is ATP methods that efficiently combine “reasoning” and “computing”. This has been a long-time challenge for ATP systems, particularly when used for software and hardware verification. Recent SMT (Satisfiability Modulo Theories) systems such as Z3 integrate a number of decision procedures, typically for linear and nonlinear arithmetic, bitvectors, etc. I am not an expert in this area, but I get the impression that the growing strength of such systems is making the full formal verification of complex hardware and software systems more and more realistic.
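The “reasoning plus computing” combination can be illustrated with a toy satisfiability check: the atoms are arithmetic constraints over integer variables, and a model is sought by actually computing with candidate values rather than by purely symbolic inference. The brute-force enumeration below is only a didactic sketch under the assumption of small bounded integer domains; real SMT solvers such as Z3 instead interleave a SAT-style search over the boolean structure with dedicated theory decision procedures.

```python
from itertools import product

def smt_satisfiable(constraints, variables, lo=-5, hi=5):
    """Find an integer assignment satisfying all constraints, or None.

    Each constraint is a predicate on an assignment dict. Enumerating
    bounded integer values is a toy stand-in for a real arithmetic
    decision procedure: it shows "computing" settling the truth of
    atoms, which is the job SMT theory solvers do efficiently.
    """
    for values in product(range(lo, hi + 1), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(c(assignment) for c in constraints):
            return assignment
    return None

# Is there an integer solution to: x + y = 4 and x > y (with |x|, |y| <= 5)?
model = smt_satisfiable(
    [lambda a: a["x"] + a["y"] == 4, lambda a: a["x"] > a["y"]],
    ["x", "y"],
)
print(model)
```

A production SMT solver answers such queries over unbounded domains and much richer theories; the point of the sketch is only that the satisfiability question is settled by evaluating constraints, not by rewriting formulas.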