Model-Guided Proof Planning

Seungyeob Choi and Manfred Kerber

School of Computer Science, The University of Birmingham
Edgbaston, Birmingham B15 2TT, England
http://www.cs.bham.ac.uk/~mmk

ABSTRACT

Proof planning is a form of theorem proving in which the proving procedure is viewed as a planning process. The plan operators in proof planning are called methods. In this paper we propose a strategy for heuristically restricting the set of methods to be applied in proof search. It is based on the idea that the plausibility of a method can be estimated by comparing the model class of proof lines newly generated by the method with that of the assumptions and of the theorem. For instance, in forward reasoning when a method produces a new assumption whose model class is not a superset of the model class of the given premises, the method will lead to a situation which is semantically not justified and will not lead to a valid proof in later stages. A semantic restriction strategy is to reduce the search space by excluding methods whose application results in a semantic mismatch. A semantic selection strategy heuristically chooses the method that is likely to make most progress towards filling the gap between the assumptions and the theorem. Each candidate method is evaluated with respect to the subset and superset relation with the given premises. All models considered are taken from a finite reference subset of the full model class. In this contribution we present the model-guided approach as well as first experiments with it.

KEYWORDS: proof planning, automated theorem proving, model generation

1.  Introduction

Although modern computing technologies make it possible to solve many complex mathematical problems, mathematical theorem proving still remains a difficult task for computers as it requires sophisticated reasoning capabilities. When the first theorem proving systems were built there was great optimism that the separation of the syntax from the semantics could lead to a purely syntactical procedure, which - once efficiently implemented - would allow for the automation of difficult mathematical proofs. This has to a certain degree been realised and become true in so-called machine-oriented theorem provers, but the hyperexponential nature of the search spaces shows that there are severe limitations of this approach. It works well mainly in particular niches of the space of potential theorem proving problems.

The most prominent machine-oriented approaches are based on the resolution [Robinson65] or the tableau calculus [Smullyan68]. In resolution, the problem that consists of assumptions and a theorem is transformed into a normal form, the so-called clause normal form, and a proof is searched for on this clause level. The approach consists of firstly efficiently searching big search spaces, and secondly making good heuristic choices, since the problem is known to be intractable for hard problems. More and more powerful systems have been built on such paradigms, e.g. MKRP [EO86], SETHEO [LSBB92], OTTER [McCune94], SPASS [WGR96], and VAMPIRE [RV01].

On the one hand, these systems show remarkable performance on many problems in particular application areas. Even open mathematical problems could be solved with the assistance of automated provers for the first time. Most notably, the problem whether Robbins algebras are Boolean algebras was successfully proved by an automated theorem prover [McCune97]. On the other hand, these methods unfortunately often fail even for problems that mathematicians would consider trivial.

Humans use approaches to theorem proving which drastically differ from the approaches machine-oriented theorem provers use. A deep understanding what these human methods exactly are is unfortunately still missing and will probably be very difficult to obtain in full generality. However, partial insights can be obtained from the reflections by mathematicians themselves, by which they try to explain the way how they find mathematical proofs [Hadamard44,Waerden64]. The extensive work by Pólya on teaching mathematics [Polya45,Polya54,Polya62] is a rich source of examples how mathematicians prove theorems. Furthermore, the field of mathematical reasoning has been investigated in cognitive science [Braine78,JB91]. This all led to very valuable insights and significant steps towards a computer model how humans perform mathematical reasoning. The area in which attempts are made to develop such insights to a computational model of human theorem proving is called human-oriented theorem proving.

Let us look at some differences between the approaches of human-oriented and machine-oriented theorem proving. When humans try to prove theorems, they usually do not transform the whole problem into a normalised form nor do they search for the solution on a logical level. Instead, they search and deduce the conclusion on a more abstract mathematical level. They decide on what proof methodology to use, and as they proceed in a proof attempt, they choose what inference rule to apply. The proof methodologies and inference rules are obtained from either textbooks or previous experience in similar examples. Unlike most machine approaches, these inference rules are not just a set of rules which are fixed once and for all, but any successful proof of a problem can be stored in the mathematicians' memory (in total or in parts) and may be reused as a new inference step for a new problem. Human-oriented theorem proving is an attempt to model (parts of) the human problem solving behaviour. Since it makes use of domain specific reasoning and allows for a hierarchical approach, the exponential explosion of the search space which is typical for machine-oriented theorem proving can be avoided for many practical problems. The most prominent approach to human-oriented theorem proving is proof planning [Bundy88]. In proof planning, the search for a proof is considered as a planning process in which existing plan operators, called methods, are combined to form a full plan which can then be (hopefully) carried through.

A major difference between the machine-oriented way of proving theorems and the way how humans do it is the size of the search spaces involved. Machines can traverse search spaces with millions of nodes with reasonable resources. Humans, however, cannot do that, but have to rely on keeping search spaces small by putting enormous effort on adequate representations (and using re-representations) as well as making use of good heuristics that explore the more promising parts of the search space first. A major challenge for the field of human-oriented theorem proving is to understand better how humans arrive at such heuristics. Up to now such knowledge is only patchy. For instance, heuristic information can be encoded in form of specificity which prefers a specific method over a general one, or in form of an explicit ranking which rates methods independently of the concrete situation. None of these approaches provides a satisfying answer to what method to choose.

Semantic information seems, to a large extent, to be used as a form of heuristics in human problem solving. Checking an argument in one concrete model or a couple of models seems to be a very important reasoning pattern. If the argument holds in a couple of key models it has significantly gained in evidence. If it fails, it is rejected and a different argument must be constructed. We believe that a semantic approach will play a very important role in modelling the human reasoning capabilities in mathematics. The main idea of the work presented here is that the plausibility of a method (or a step) in a proof plan can be estimated by means of comparing models taken from a reference class of models.

The remainder of the paper is organised as follows. In the next section we describe proof planning in more depth. We discuss model based reasoning in section 3. The semantic restriction and selection strategies which are the main contributions are presented in section 4. Section 5 discusses the state of our implementation and some initial results. We conclude in section 6.

2.  Proof planning as a way of reasoning

Proof planning [Bundy88] is an approach to human-oriented theorem proving. It is based on the idea that the reasoning process for proving mathematical theorems can be viewed as a planning process which employs abstract plan operators, called methods. Assume we want to prove a mathematical theorem (or conclusion) phi from premises A1,... , An. The picture of proof planning is that we have a gap between the assumptions and the theorem, which we try to fill by proof steps. We try to do that by systematically reducing gaps by introducing intermediate assertions. This is done by methods which represent on the one hand program fragments which generate specific proof parts, and on the other hand high-level specifications of the program fragments which make planning possible. In each planning stage, the proof planner selects a method which fills in the gap between premises and conclusion. A proof plan is found when there are no more remaining gaps on an abstract level. A proof is found when the proof plan can be successfully executed. Note that proof plans may fail.

2.1.  The plan operator: methods

Methods play the most significant role in proof planning. In Bundy's view [Bundy88] a method consists of a tactic, a precondition, and a postcondition. The tactic is a piece of program code that manipulates the actual proof in a controlled way, and the precondition and the postcondition form a specification of the deductive ability of the tactic. In the Omega proof planner, a slightly extended view has been taken in order to allow automated adaptation of methods to new problems. Huang et al. [HKK92,HKRS94,Kerber98] proposed an idea of separating the procedural and declarative knowledge in the tactic part of the methods.

1. Omega method

An Omega method has the following slots: declarations, premises, constraints, conclusions, declarative content, and procedural content as displayed in Figure 1. The declarations specify a signature that declares variables of the method. The premises consist of a list of proof line schemata which are used to prove the conclusions. The constraints contain additional restrictions on the premises and the conclusions. The conclusions slot contains the proof lines which are the goal of the proof. The declarative content provides declarative knowledge that is interpreted by the procedural content. And finally, the procedural content is a piece of program code that takes a piece of declarative knowledge and a pointer to the current proof plan tree, and produces a subtree which can be integrated into the current tree.

2.2.  Forward and backward proof planning

Mainly two different types of planning strategies, forward and backward planning, are used in proof planning. In forward planning, the planner searches for a method that takes assumptions (axioms, definitions) or already proved intermediate results and produces new facts that can be used as new assertions in the following planning steps. Initially, there is a gap between the initial assumptions A1, ... , An and the theorem Th as shown in Figure 2. The planner starts from the assumptions and proceeds toward the theorem by filling in the gaps between assumptions and the theorem. When the line generated by the method is identical to the theorem, a proof plan is found.

2. Forward proof planning

Likewise, in backward planning, a method takes a theorem and produces subgoals sufficient to deduce the theorem. Some of the subgoals may be identical to existing assumptions, but typically some others are new and need to be added as open goals to be proved in later steps. In Figure 3, the planner starts from the open goal (initially, the theorem Th) and works backward towards the assumptions.

3. Backward proof planning

2.3.  An example of a proof plan

The method for mathematical induction is an abstract level method that transforms the problem to a base case and a step case, in which the proof is obtained by recursive applications of induction rules. Note that we describe here only a simplified form. The proof plan for induction was studied in great detail as the first application of proof planning [BvHHS91].

Figure 4 shows the basic structure of the proof plan for induction.

4. Proof plan for mathematical induction [IB96a]
A proof for the step case consists of two steps, firstly transforming the induction conclusion to make it syntactically similar to the induction hypothesis, and secondly applying the induction hypothesis.

Let us look at a concrete example. With the inductive definition of +:

0+y=y
s(x)+y=s(x+y)
Figure 5 illustrates a concrete inductive proof for the theorem that addition on natural numbers is associative. (We use x:t to denote x is of type t.)
5. Proof of the associativity [BvHHS91]
In Figure 5, five methods are used. They are induction, base, wave, fertilise, and symeval (symmetric evaluation). The induction method separates the theorem into two sub-theorems, base case and step case, and replaces x by 0 in the base case and by s(x') in the step case. The base and the wave methods rewrite the base case and the step case, respectively. The symeval method evaluates whether the two expressions are identical. The fertilise method then replaces the expression with the induction hypothesis and completes the proof. The rippling process [BSvHI93,BvHSI90] repeats applications of the wave rule to obtain a form to which the fertilise method can be applied.

2.4.  Backtracking - a problem for proof planning

Although proof planning allows for more abstract proof search, and typically involves smaller search spaces than traditional machine-oriented theorem proving, this does not mean that the standard problems of search do not constitute a problem. Proof planning can be viewed as a process in which one searches iteratively for a method for the simplification of a given problem until no more subproblems are generated in this process. At each stage of the planning process, the procedure takes one method which has the highest priority. When a method leads to a stage in which no further methods are applicable (or promising) but no full plan has been generated, backtracking becomes necessary and another candidate method is tried. In existing proof planners, heuristic selection is typically performed on static information (such as specificity or explicit rank explained in section 4) attached to each method regardless of the concrete problems. More flexible heuristics seem to be necessary.

Typically the planner finds several candidates that seem equally applicable at each stage of the planning process. However, some branches have dead ends from where the planner can not proceed any more. Figure 6 illustrates the backtracking mechanism where the planner chose a method, proceeded two more steps and found that it was blocked.

6. Backtracking
The overhead of backtracking increases as every application of a method that leads to an unsuccessful proof attempt may cause several other applications of methods for further potentially unsuccessful proof attempts.

Although there is no general way to disambiguate good from bad steps for undecidable problems, a key to successful theorem proving - whether it is done by a human being or by a machine - is to minimise wrong choices in search. In the next section we try to show how models can reduce the backtracking in the search for proof plans.

3.  Model based reasoning

Humans use models in various forms of reasoning. Models at least contribute to the heuristic information which direction the proof procedure should take. This has been taken into account in several traditional machine-oriented theorem provers [SLM94,CP94,KC00], which have incorporated semantic guidance into their proof search. However, since these provers are based on a low-level refutation procedure, the semantic guidance is restricted to choosing clauses that are most likely to contribute to finding a semantic contradiction. There is a significant difference between machine-oriented and human-oriented attitudes towards how to use models to guide the proof search. Mathematicians use semantics not only to check a contradiction but also to get the implications of a problem and to measure the plausibility of the logical consequences. In this section, we introduce a human-oriented view of model based guidance and present how it works in proof planning.

3.1.  Model based representation of reasoning

A visual representation of reasoning was proposed in mental models [JB91], in which deductions are modelled by Venn diagram. For instance, we have two premises: all psychologists are experimenters, and all experimenters are skeptics. The first premise gives rise to two diagrams:

The first diagram illustrates that all psychologists are experimenters and there are also experimenters other than psychologists, and the second diagram shows that all psychologists are experimenters and there are no other experimenters. Likewise, the second premise also gives rise to two diagrams:
The combinations of the diagrams are:
From all the combinations the conclusion is derived: all psychologists are skeptics.

We take up this idea of the mental models to mathematical reasoning by approximating the area of a premise with semantics, and use this information as a heuristic guidance which proof planning steps to perform.

3.2.  Models

In order to make use of semantic information we need models. These can be either human generated or machine generated. In our approach we generate a finite set of models from the premises and the theorem with a model generator. Concretely, we use Slaney's FINDER system [Slaney95]. We use the standard definition of models commonly used in classical logic. A domain D is a non-empty set of objects. An interpretation I is a mapping from each constant symbol, variable symbol, and other term to an element of D, and from each formula to a truth value. If a formula F is satisfied in an interpretation I, I is a model of F.

Suppose Gamma is a set of assumptions and phi the theorem, (written as Gamma |- phi). In order to limit the model space, we assume a reference set of interpretations R. We define the model sets MGamma := { M in R | M |= Gamma } and Mphi := { M in R | M |= phi }, then it follows from Gamma |= phi that MGamma subset Mphi. Let Gamma consist of assumptions, for instance, A1, A2, ... , An, we get model sets M1, M2, ... , Mn with respect to the model reference class (note, all Mi subset R). The semantic relations presented by A1  &  A2  &  ...  &  An |- phi may be described as ( M1 intersect M2 intersect ... intersect Mn ) subset Mphi, as illustrated for n=3 in Figure 7.

7. Semantic diagram of assumptions and theorem

3.3.  Model based theorem proving

From the early days of automated theorem proving [Gelernter59] semantics played a significant role in pruning the search for proofs. This work has been taken up in more recent systems built on traditional theorem provers such as the SCOTT system [SLM94,HS01] and the semantic clause graph procedure [KC00]. Semantic information can indicate what the theory means, which direction the proof is to proceed, and which one of the candidate steps seems to have the most plausibility to lead to the desired result. In the case of the proof planning approach, in each stage of the above planning procedure we can efficiently reduce the search space if we have a mechanism to rule out less promising methods. The key idea of this research is to use semantics to obtain heuristic knowledge for determining the feasibility of a certain method.

3.4.  Model based proof planning

Models can provide heuristics about whether a formula generated by a method can be used as a new assumption or open goal, and if so, to what extent it fills the gap between the initial assumptions and the theorem. As mathematicians seem to use semantics in order to obtain heuristics to solve problems, it should also be natural that human-oriented automated theorem provers do so. In each step of the theorem proving process, the proof planner chooses a method that fills in the gap between assumptions and the theorem either by adding a new open goal before the theorem or by adding new assumptions after the initial assumptions. The main idea we introduce here is that the plausibility of a method can be estimated by means of comparing the semantics of new lines generated by the method with those of the assumptions and of the theorem. We extend the idea that any model of the assumptions Gamma should also be a model of the theorem phi to the models of intermediate results as well. All the newly generated lines must satisfy the semantic rules. In forward proof planning, an application of a method adds new lines and may delete old lines from the planning state (the deleted lines are considered to be replaced by the newly generated ones) and the assumption is updated from the initial set Gamma to a new set Gamma'. The models of the intersection of all assumptions should also be models of the theorem. That is, MGamma' subset Mphi must follow from Gamma |- phi ==> Gamma' |- phi, where ==> indicates the transition from one planning state to the next. To put it in different words, any transition for which the semantic restriction MGamma' subset Mphi does not hold cannot lead to a valid proof and can be excluded from the search space. Likewise, in backward proof planning an application of a method adds a new open goal phi'. The models of new open goals should contain all models of the intersection of all assumptions. MGamma subset Mphi' must follow from Gamma |- phi ==> Gamma |- phi'; if not it can be excluded from search.

4.  Semantic restriction and selection of methods

Proof planning incorporates formalised proof fragments in methods. However, often existing control rules are not sufficient to heuristically choose one method when two or more methods are applicable. In areas in which strong methods are available - like in the area of mathematical induction - this is no major drawback, since the specification is strong enough to restrict the applicability to one or very few methods. In other areas, however, such heuristic knowledge is not given and existing proof planners can easily fall into dead end situations in which no more methods are applicable. In those cases, backtracking has to be done in order to return to a previous state in which other candidate methods were available. As in any search problem, the efficiency of proof planning depends on exploring the more promising parts of the search space first. We need good heuristics in order to evaluate each branch of the search tree. Some heuristic information can be encoded in form of specificity which prefers the application of a specific method to a general one. Another approach is to explicitly rank methods in plan construction, for instance, by a rating, and to apply the applicable method that has the highest rank (measuring its usefulness) independently of the concrete situation.

In this work we use models in proof planning to exclude implausible candidate methods (semantic restriction) and, where there is still more than one method applicable, to estimate which one would fill the gap better (semantic selection).

4.1.  Semantic restriction

A semantic restriction strategy is a strategy that rejects any methods whose applications produce an intermediate result with which the semantic relation discussed in section 3.2 is not satisfied. This simple strategy can avoid many useless steps. For example, the search for a proof plan in Omega for {S  ->  FORALL x Q(x)} |- FORALL x (S  ->  Q(x)) can be reduced from 10 to only 3 steps with the semantic restriction strategy as we will describe in section 5.

Semantic restriction in forward planning

In some cases, it is possible for the application of a method to produce a new assumption with which the theorem cannot be deduced. As an example, suppose premises and a conclusion that satisfy A1, A2, A3 |- phi with M1 intersect M2 intersect M3 subset Mphi. The following method may be applied, which replaces line L1 (indicated by - L1) by line L4 (indicated by + L4). The intermediate result psi from line L4 will participate in the proof as a new assumption in the next stages.


 without Method 1
  premises   - L1, L2, L3
  conclusions   + L4
  proof schema   L1. H |- A1
    L2. H |- A2
    L3. H |- A3
    L4. H |- psi
When the range of the model set of each assumption and conclusion is like those illustrated in Figure 8, the set Mphi from the conclusion phi satisfies MGamma=M1 intersect M2 intersect M3 subset Mphi but does not Mdelta=M2 intersect M3 intersect Mpsi subset Mphi.
8. Wrong conclusion in forward planning

This means, from A2  &  A3  &  psi, one cannot derive phi. Therefore, the application of this method should be blocked in the given context.

Semantic restriction in backward planning

Suppose the initial premises are A1, A2, A3 and the theorem is phi. We apply Method 2 that uses premises A2 and A3 and adds a new premise A4 in order to derive phi.


 without Method 2
  premises   L2, L3, + L4
  conclusions   - L5
  proof schema   L2. H |- A2
    L3. H |- A3
    L4. H |- A4
    L5. H |- phi
The range of models of each premise and conclusion is illustrated in Figure 9. The model set from the initial premise set MGamma := M1 intersect M2 intersect M3 satisfies MGamma subset Mphi and the model set from the premises taken by the method Mdelta := M2 intersect M3 intersect M4 also satisfies Mdelta subset Mphi. The proof planner adds the premise A4 as a new open goal. However, A4 cannot be deduced from the given premises since MGamma NOT  subset M4.
9. Wrong assumption in backward planning

Although the method is applicable in the current stage, the newly generated conclusion, A4, cannot be derived from the initial premises. Therefore, we should not apply this method in this context.

4.2.  Semantic selection

A semantic selection strategy does not only restrict the number of methods in a semantic way, but heuristically chooses one that seems to make most progress towards filling the gap. It evaluates the semantics of each method where all methods seem equally applicable if semantics is not taken into consideration. We propose to distinguish more promising methods from the others by estimating how well they semantically match the open goals with respect to the subset and superset relation with the given premises. In a forward reasoning approach, the methods which restrict the model class best in the direction to the theorem are better. Concretely, if the problem is Gamma |- phi and two methods mu1 and mu2 transform the problem to Gamma1 |- phi and Gamma2 |- phi, respectively, with MGamma subset MGammai and MGammai subset Mphi with i=1 or 2, select MGamma1 rather than MGamma2 if and only if MGamma1 is bigger than MGamma2. (If the sets have the same cardinality, make a random choice.) Likewise in backwards reasoning, a method that produces the smallest model class should be selected.

Even when two or more candidate methods are applicable without semantic conflict, the plausibility of a method may still be assessed by diagrams. Assume the initial assumption set Gamma, assumption set Gamma1 obtained by the application of method mu1, Gamma2 obtained by the application of mu2, and the theorem phi.

10. Semantic selection in proof planning

As in Figure 10, if the model set of Gamma2 is larger than that of Gamma1, Gamma2 is likely the one closer to the conclusion and therefore Gamma2 has the higher possibility to deduce the theorem in fewer steps.

The selection strategy in backward proof planning proceeds towards the assumptions. In Figure 10, unlike in forward planning, we choose Gamma1 rather than Gamma2 since it is closer to the initial assumption Gamma and it should be easier to be derived from Gamma.

5.  Implementation and initial results

We have implemented the semantically restricted proof planner based on the Omega system [BCFFx97], which is a multi-initiative theorem proving environment. We use FINDER [Slaney95] as a model generator. The semantic proof planner transforms each proof line to FINDER readable format and generates a finite number of finite model sets.

We suppose the following problem, which is simple enough to illustrate the approach: {S  ->  FORALL x Q(x)} |- FORALL x ( S  ->  Q(x) )

In the initial state, there are two methods applicable. The Omega proof planner chooses a backward method mu1 by its existing method selection strategy. However, the application of mu1 will give rise to backtracking.

11. Proof planning procedure

12. Proof planning procedure (continued)

In Figure 10, the application of method mu1 in the initial stage generates two new open lines: S and FORALL x Q(x). The new open goal S must be the logical consequence of the initial assumption S  ->  FORALL x Q(x). The theorem FORALL x (S  ->  Q(x)) must be the logical consequence of the new lines S and FORALL x Q(x) along with the initial assumption.
{ S  ->  FORALL x Q(x) } |- S
{ (S  ->  FORALL x Q(x)), S, FORALL x Q(x) } |- FORALL x (S  ->  Q(x))
The relation between models of each lines follows:
MS  ->  FORALL x Q(x) subset MS
M{S  ->  FORALL x Q(x)), S, FORALL x Q(x)} subset MFORALL x (S  ->  Q(x))
where MS means the model set of formula S, and so on.

To verify the first condition, we generate models from the premise S  ->  FORALL x Q(x) with the domain fixed to D = {0, 1}.

  Model   S   Q(0)   Q(1)
  m1   F   F   F
  m2   F   T   F
  m3   F   F   T
  m4   F   T   T
  m5   T   T   T
Among these models, m5 is the only one that also satisfies S. The other four models are counterexamples. Thus, we see that it is not possible to derive S from the premise. If we do not rule out the method mu1, the procedure will backtrack after some number of steps.

Similarly, there is another chance to avoid backtracking as shown in Figure 11. As a result of the model-based restriction, the procedure can avoid two backtrackings and find the proof in three steps only.

6.  Conclusion

Although the way humans manage to keep search spaces in theorem proving small has not been fully understood, semantic guidance in form of checking whether certain reasoning steps make sense in well chosen models seems to provide powerful restriction and selection strategies. In this paper we discussed how to apply such strategies to proof planning, which is a human-oriented way of theorem proving.

We have introduced the main ideas of proof planning and how models can help to reduce wrong choices in proof planning. Wrong choices result in backtracking and slow down the process of proof planning. We have demonstrated the approach of semantic restriction with a simple example. One important question left for future investigation is how to select informative models. Currently we use the FINDER model generator for the generation of a reference set of models. While this is already informative (and reduces search) in some examples like the one we have looked at in section 5, we want to improve on it in several ways. In particular, we want to keep the reference set of models small and informative at the same time. In order to do so other approaches for using models are necessary. We think that using typical models [Kerber93a] is promising for this purpose.

For the future we plan to deepen our understanding by more experiments. In particular we want to investigate different approaches to semantic selection strategies.

Acknowledgements

We would like to thank Andreas Meier for his generous help to integrate the semantic restriction in the Omega proof planner. Furthermore we are indebted to Mateja Jamnik, Martin Pollet, and Claus-Peter Wirth for valuable feedback on earlier drafts of this work.

References


© 2002. Kluwer Academic, Dordrecht. Seungyeob Choi and Manfred Kerber
The URL of this page is http://www.cs.bham.ac.uk/~mmk/papers/02-MBR.html.
Also available as pdf ps.gz bib .