A Lost Proof

Christoph Benzmüller1
Manfred Kerber2


1Fachrichtung Informatik, Universität des Saarlandes
66041 Saarbrücken, Germany

2School of Computer Science
The University of Birmingham, Birmingham B15 2TT, England
http://www.cs.bham.ac.uk/~mmk

ABSTRACT

We re-investigate a proof example presented by George Boolos which perspicuously illustrates Gödel's argument for the potentially drastic increase of proof-lengths in formal systems when carrying through the argument at too low a level. More concretely, restricting the order of the logic in which the proof is carried through to the order of the logic in which the problem is formulated in the first place can result in unmanageable long proofs, although there are short proofs in a logic of higher order. Our motivation in this paper is of practical nature and its aim is to sketch the implications of this example to current technology in automated theorem proving, to point to related questions about the foundational character of type theory (without explicit comprehension axioms) for mathematics, and to work out some challenging aspects with regard to the automation of this proof - which, as we belief, nicely illustrates the discrepancy between the creativity and intuition required in mathematics and the limitations of state of the art theorem provers.

1.  Introduction

The art of general purpose automated theorem proving has been best developed for first-order logic. State of the art systems like Otter or Spass are very powerful and can solve many problems of large collections of problems (see http://www.cs.jcu.edu.au/ tptp/TPTP). A particular highlight of the success of first-order theorem provers was the machine-generated proof of the Robbins problem [McCune97]. While there are problems which are inherently of a higher-order nature and for which mere first-order mechanisms seem not to be appropriate (like diagonalisation proofs) surprisingly even for first-order problems to stay in first-order logic for the proof search may not be a good idea in general. Boolos gives a first-order example (actually it is a parameterised class of examples) that clearly sketches the limitations of first-order logic. Since there are complete calculi for first-order logic there is a first-order proof for Boolos' problem, but the proof is so long that it is intractable in first-order logic. Actually, it is easy to see that any bound on the proof length can be transgressed in first-order logic, while in higher-order logic there is a short proof which can be summarised in one page.

Higher-order logic allows quantification over function and predicate variables and thus provides a far more expressive language than first-order logic. It was Bertrand Russell [Russell02,Russell03] who first pointed out in 1902 that in connection with the unrestricted comprehension principlesThese principles assure the existence of certain functions, we will come back to them in Subsection 2.1. this may allow for paradoxes.Paradoxes occur equally in first-order axiomatisations of set theory with unrestricted comprehension principles. The most prominent example is the set of all non-self-containing sets (also called Russell's paradox). As a possible solution Russell suggested a few years later in [Russell08] a theory of types as a basis for the formalisation of mathematics that differentiates between objects and sets (or functions) consisting of these kinds of objects. This idea was also taken up by Alonzo Church in 1940, who introduced a formulation of type theory [Church40], also called simple theory of types, based on Russell's ideas which is taken as a basis formalism in many modern higher-order theorem provers. Church's lambda calculus and his theory of types also opened the door for the propositions as types idea and constructive type theory.

TPS [andrews00:systemdescr,andrews:tatpsfctt96], Hol [Gordon85], Omega [OwRu92], Omega [BenzmuellerEtAl:otama97], lambda-Clam [CADE98*129], Leo [BeKo:LAHOTP98] are interactive and automated higher-order provers based on the simple theory of types. Some prominent provers working for constructive type theory are uprl [Constable86], COQ [CoqMan95], and LEGO [luo92:legoproofdevelsystem].

Automating proof search in higher-order logic is a very challenging enterprise, such that the above systems all provide facilities to combine interaction with automation. The idea is that the interactive human provides the crucial proof steps while simple subgoals are handled automatically by the prover. Of course, many non-trivial proofs can be already automated in higher-order logic and the most impressive record in this sense is given by the theorems proven automatically with TPS [andrews00:systemdescr,andrews:tatpsfctt96]. A well known and very simple example illustrating the expressiveness and elegance of automated higher order theorem proving is Cantor's theorem, where the diagonalisation argument, in form of a lambda-term, is synthesised by higher-order unification. For further and more impressive examples we refer to [andrews:tatpsfctt96].

However, as we will discuss below Boolos' example perspicuously demonstrates the limitations of current first-order and higher-order theorem proving technology. As we will illustrate, with current technology it is not possible to find his proof automatically, even worse, automation seems very far out of reach. Let's first give a high-level description why this is so. Firstly, Boolos' proofs need comprehension principles to be available and it employs different complex instances of them. It is important to note here that theorem provers based on Church's theory of types typically assume that the comprehension principles can be completely avoided due to the power of higher-order unification and the existence of lambda-terms. First-order theorem provers use no comprehension principle nor typically any other definition principle to introduce new concepts. Secondly, the particular instances of the comprehension axioms cannot be determined by higher-order unification but are so-called Eureka-steps which have to be guessed. However the required instantiations here are so complex that it is unrealistic to assume that they can be guessed, for instance, by blindly applying the primitive substitution principle [andrews:tatpsfctt96], by splitting rules [Huet72,Huet73b], or by concisely representing them so that new ones can be generated during the proof search [Kerber94-ECAI]. Here it is where human intuition and creativity comes into play, and the question arises how this kind of creativity can be realised and mirrored in a theorem prover.

In the following we concentrate on the classical notion of higher-order logic and, in particular, on the simple theory of types, that is, a classical higher-order logic based on Church's simply typed lambda-calculus [Church40]. For an introduction to the pure simply typed lambda-calculus we refer to [Barendregt84].

Even though our viewpoint here is from classical higher-order logic the problem is more general and applies to constructive type theory [Martin-Loef84] and other approaches as well.

2.  Automating Higher-Order Proof Search

The automation of the simple theory of types was pioneered by the resolution based methods discussed in [Andrews71,Huet72,JePi72]. The authors introduce formal derivation systems and discuss soundness and completeness with respect to Henkin semantics [Henkin50], respectively general models [Andrews72,Andrews72b]. Huet combined his constrained resolution approach with a higher-order pre-unification algorithm [Huet75] which avoids guessing aspects of full higher-order unification. More recent approaches are discussed in [Wolfram93,BenKoh:ehor98] and a comparison of these approaches is provided by [benzmueller01:compar].

In the rest of this section, we briefly discuss some specifics of theorem proving in higher-order logic.

2.1.  Comprehension

The lambda-binding construct in combination with the lambda-conversion rules in the simple theory of types have the effect that the type restricted comprehension axioms become derivable (see also [Andrews86, Chapter 5, p. 159]). The comprehension axioms are instances of the following axiom schema EXISTS N alpha n ->  beta . FORALL zn . N(z1,... ,zn) = B beta Here B stands for an arbitrary term such that the variable N does not occur free in it. The intention of the comprehension axioms is to guarantee for each expression B the existence of the functions (or sets) referred by N alpha n ->  beta . For instance, if beta is the type o of truth values the comprehension principle asserts that for any respective formula B there exists a set N (referred to by the variable N) which contains exactly all those n-tuples for which B evaluates to true. The strength of the lambda-notation lies in the fact that the required sets and functions N can easily be directly described by the term lambda zn . B, so that the implicit requirement that each term in the simple theory of types has a denotation, already ensures their existence.

The theorem proving approaches from above as well as the systems mentioned in the introduction therefore avoid the infinitely many comprehension axioms altogether.

2.2.  Primitive Substitution

In higher-order proof search some set variables may have to be instantiated with complex terms which cannot be synthesised by higher-order unification. Simple examples illustrating this problem are EXISTS xo. x or EXISTS p i  ->  o . FORALL y i . NOT  p(y). In a resolution approach, for instance, the second formula leads to an initial unit clause p(Y), where p is a free set variable and Y a Skolem term for y. There is no further partner clause with a complementary literal available and thus no resolution step is possible. However, when instantiating the set variable p with the (complementary) set lambda x . NOT  p(x) leading to a second clause NOT  p(Y), then a proof can be immediately constructed. Note that this proof in some sense mirrors an initial instantiation of set variable p by a term lambda x. p(x) & NOT  p(x) denoting the empty set. Unfortunately this set description cannot be synthesised by unification in the context. An example which requires more complex instantiations is X5310 from the TPS library, which is discussed in detail in [andrews:tatpsfctt96, pp.333-335]. There, for instance, an instantiation lambda p beta  ->  o lambda y beta . EXISTS x beta . p(x)  ->  p(y) is employed.

In Huet's resolution approach the splitting rules address this problem, which are replaced in Andrews' work [Andrews:ocahol89] by the more elegant primitive substitution principle. Splitting or primitive substitution blindly instantiates free set variables at literal head positions with a most general binding (partial binding) that imitates a logical connective, that is, with a most general formula that introduces a logical connective as head. For instance, the set of most general bindings for head variable p of the literal p(Y) from above is p <-- lambda z . NOT  b1(z), p <-- lambda z . b2(z) v b3(z), p <-- lambda z . FORALL w beta . b4(z,w) where b1,2,3 i  ->  o and b4 i  ->  ( beta  ->  o) are new free head variables. Since there are infinitely many universal quantifiers in the simple theory of types (one for each type beta ) splitting and primitive substitution are infinitely branching. And as new free predicate variables are introduced they are recursively applicable to the results of their application. From an abstract point of view they realise a blind enumeration of all formula schemes in the Herbrand universe.

Clearly an unrestricted blind enumeration in this sense is practically infeasible. [andrews:tatpsfctt96] therefore explains how this problem is handled in the TPS prover, which instead of a blind enumeration works with some selectively chosen instances.

2.3.  Extensionality

Another source of blind search in higher-order which is in contrast to comprehension and primitive substitution less relevant for this paper is extensionality reasoning. The (type parameterised) functional extensionality principles FORALL m alpha  ->  beta . FORALL n alpha  ->  beta . m = n  <->  FORALL x . m(x) = m(x) and Boolean extensionality principle FORALL po . FORALL qo . p = q  <->  (p  <->  q) are both valid in Henkin semantics. In order to guarantee Henkin completeness these infinitely many axiomsPolymorphism may help to avoid infinitely many extensionality axioms in the search space. However, polymorphism does solve the search space problem induced by blind forward search with the extensionality principles. are assumed to be available in the search space of nearly all approaches. The problem is that the beta a-reasoning facilities built-in to higher-order unification are not sufficient to cover all extensionality aspects in theorem proving, such that blind search with these axioms is additionally required. In practical applications, however, the extensionality axioms are at the expense of completeness entirely or at least partly avoided. A complete approach which avoids these axioms in the search space and instead employs a goal directed extensionality treatment is provided by [BenKoh:ehor98].

3.  "A Curious Inference"

In [boolos98:logiclogicandlogic, Chapter 25, p. 376-382] Boolos presents an example of a first-order problem which has only a very long derivation in first-order logic, but which has a short derivation in a second-order logic, by making use of comprehension axioms. He builds up on earlier work by Gödel [Goedel36] who showed that in a higher-order logic by going to higher levels it may be possible to obtain short proofs.

In this section we take a closer look at this example and shed light on the question why a comprehension axiom of second-order logic is needed and even first-order with definition principle wouldn't suffice to generate short proofs.

Boolos assumes an inference system in the first-order predicate calculus with identity and function symbols s (unary) and f (binary).

  1. FORALL  n.  f(n,1)=s(1)
  2. FORALL  x.  f(1,s(x))=s(s(f(1,x)))
  3. FORALL  n. FORALL  x. f(s(n),s(x))=f(n,f(s(n),x))
  4. D(1)
  5. FORALL  x. (D(x) ->  D(s(x)))
         hence
  6. D(f(s(s(s(s(1)))),s(s(s(s(1))))))

Notice that there is no induction principle available. If it were available there would be a proof in two steps, namely because of (4) and (5), we would get FORALL  x. D(x) by induction and hence D(f(s(s(s(s(1)))),s(s(s(s(1)))))) by FORALL-elimination.

Actually, there is a relatively easy but enormously long proof. f is an Ackermann function and it is possible to prove the conclusion by f(s(s(s(s(1)))), s(s(s(s(1)))))-1 many application of Modus ponens from (4) with (5) to come to (6). Function f as defined in (1)-(3) grows extremely fast and hence f(s(s(s(s(1)))),s(s(s(s(1))))) is a very big numberNumber in the sense that f(s(s(s(s(1)))),s(s(s(s(1))))) unravels into the successive application of a very big number of s's to 1; in fact, Boolos shows that this number is at least the result of an exponantial stack 2(2***2) containing 64K `2s' in all. such that there is no chance to actually perform so many applications with all computation power ever.

A proof in second-order logic makes use of comprehension axioms, in particular, of the following twoIn the comprehension schema in Subsection 2.1 we employed = to cover the general case. Here N is a predicate and N(z) a term of type o such that = can be replaced by  <->  by boolean extensionality as introduced in Subsection 2.3.: EXISTS  N. FORALL  z. N(z) <-> (FORALL  X. X(1) & FORALL  y. (X(y) ->  X(s(y))) ->  X(z)) EXISTS  E. FORALL  z. E(z) <-> (N(z) &  D(z))

With this comprehension principle Boolos proves strong lemmata which can be viewed as conditional induction principle of the kind: "assumed the induction principle holds for number z - corresponding to N(z) - then we can show for any predicate X a property X(z) by induction."

The proof goes as follows; first you establish the following lemmata:

Lemma 1: N(1), FORALL  y. (N(y) -> N(s(y))), N(s(s(s(s(1))))), E(1), FORALL  y. (E(y) -> E(s(y))), and E(s(1))
Proof sketch: These are easy consequences of the axioms.

Lemma 2: FORALL  n. N(n) -> FORALL  x. (N(x) -> E(f(n,x)))
Proof sketch:

The rest of the proof of the lemma is mainly a further reduction of the problem in a similar way, the theorem itself is an easy application of the two lemmata.

Boolos' proof makes a couple of times use of the comprehension principle:

Subgoal to prove   comprehension axiom applied
FORALL  n.  N(n) -> (FORALL  x. N(x) ->  E(f(n,x)))   EXISTS  M. FORALL  n. M(n) <-> (FORALL  x. N(x) ->  E(f(n,x)))
FORALL  x. N(x) ->  E(f(1,x))   EXISTS  Q. FORALL  x.  Q(x) <->  E(f(1,x))
FORALL  x. N(x) ->  E(f(s(n),x)) from            FORALL  x. N(x) ->  E(f(n,x))   EXISTS  P. FORALL  x.  P(x) <->  E(f(s(n),x))
In plain first-order logic this cannot be modelled. Even in first-order logic with definition principle it can't. Namely if we try to do so we run into problems. For instance, we may try to define N(n) as M(1) & FORALL  y. (M(y) ->  M(s(y))) ->  M(n), but this is no longer a proper definition, since now N is defined in terms of M and M in terms of N, that is, this higher-order construction cannot be modelled in first-order logic. The original definition of N heavily depends on the universal second-order quantifier FORALL X, in which X can be later instantiated by predicates which are defined in terms of N itself.

Boolos' example - as well as Gödel's results - show that even for cases in which it is sufficient to stay completely in first-order logic in principle, it may not be sufficient in practice.

4.  Implications, Questions, and Challenges

In this section we discuss the implications of this example for recent higher-order theorem proving approaches, formulate some related foundational questions, and explain why we consider Boolos' example as a challenge in automated reasoning for the new millennium.

4.1.  Implications for existing Higher-order theorem proving approaches

For higher-order automated theorem proving there is the very difficult question how it may be determined whether certain comprehension axioms are necessary or not. Note that the comprehension axiom required in the above proof can not be replaced by the unification with a lambda-expression, since Boolos' problem is initially a mere first-order problem, that is, there is no second-order term given to unify with. Since there is in particular no higher-order variable given in the problem formulation also primitive substitution is not applicable. Hence, without comprehension and higher-order variables there is no chance at all to introduce any kind of structures that could support Boolos proof.

Of course, theoretically there are the extensionality axioms which contain higher order variables. Furthermore, in our particular case one could think of adding some new axioms like FORALL bo . EXISTS N alpha n ->  o . FORALL zn . N(zn) = b instead of all comprehension axioms. The idea would be to leave the particular structure of the terms to the right open at the beginning and to employ primitive substitution to generate respective instances for b (which may depend on z) during proof search. A third option is to introduce free variables via the following tertium non datur axiom FORALL bo . b v NOT  b Whereas it seems to be impossible to derive the required comprehension axioms from the extensionality axioms by applying the primitive substitution principle, this is far more promising in the latter cases. However, as remarked in Subsection 2.2, without guidance, primitive substitution just subsequently enumerates formula schemes, such that this would differ only slightly from a direct enumeration of all comprehension axioms (for terms B of type o).

As remarked in Subsection 2.1 current approaches typically completely avoid the comprehension axioms (in practical applications the extensionality axioms and the tertium non datur axiom are avoided as well). Thus, Boolos elegant higher-order proof is not supported in these approaches. However, there exist completeness proofs for them ensuring that any formula that is valid in Henkin semantics can indeed be shown as such. Of course, the intractable first-order proof for Boolos' examples can theoretically be synthesised, so the example is not a counterexample to, but an example of these completeness result. It demonstrates well what completeness means, namely that practically a proof may not be found since proofs are prohibitively long. While this is clear for problems which do not have short proofs, Boolos' example has a practically tractable (with respect to its size) higher-order proof. It gets lost, however, in standard higher-order theorem proving because the comprehension schemes (or alternative axioms) are not available. Hence the question arises whether it makes sense at all to avoid the comprehension principles in the simple theory of types if this has the side effect that some tractable proofs get lost thereby. Especially in systems aiming at supporting non-trivial mathematics, where similar elegant proof constructions are potentially required this could be a serious problem.

4.2.  Why is Boolos' example so challenging

Without any doubt the problem above is a challenge problem for automated theorem provers and we try to speculate in the following how it may be possible to solve such hard problems systematically. However, it should be very clear that these problems will remain very hard for the foreseeable future. While the problem to search in the original space is too hard, the problem to bring arbitrary comprehension axioms into play will open a different dimension with comparatively short proofs but extremely bushy search spaces, the only way to restrict that in some way seems at the moment to see how humans can master such search problems. Different observations can be made on that:

5.1.  How to find a solution?

In this Subsection we speculate what techniques are around that can be explored to get a handle on finding Boolos' proof automatically.

6.  Conclusion

We re-investigated a very amenable proof example originally provided by Boolos to illustrate Gödels argument for the potentially drastic decrease of proof lengths when ascending to a logic of higher order. Aside from its intended illustrative character this example poses very practical questions and, as we belief, provides a good basis for a critical discussion of recent automated theorem proving technology. In particular we demonstrated that not only first-order theorem approaches are in principle condemned to fail when applied to this example but also recent higher-order approaches approaches and systems. The problem is that the avoidance of axioms like the comprehension principles (or alternative axioms which introduce free higher-order variables) in higher-order theorem proving obviously causes the loss of tractable proofs, like Boolos' elegant second-order proof for the given example. The expressiveness and power of higher-order logic is not employed to its full extend in recent higher-order approaches and on the given example they would operate as naive as their first-order counterparts.

Therefore we think that Boolos example will remain a very challenging problem for automation for the foreseeable future. Because of its perspicuity it is however well suited to speculate and investigate what kind of mechanisms (such as abstraction and reflection) will be required to fruitfully tackle it.

References


© 2001, Christoph Benzmüller, Manfred Kerber
The URL of this page is http://www.cs.bham.ac.uk/~mmk/papers/01-TPHOL-WS.html.
Also available as pdf ps.gz bib .