Formation of Methods for Proof Planning in Mathematics

EPSRC Grant GR/M22031

Project Investigator: Dr. Manfred Kerber
Research Fellow: Dr. Mateja Jamnik




1 Background

Although machine-oriented and interactive theorem provers are still of great interest, human-oriented theorem-proving systems have attracted growing attention after the limitations of these systems became more apparent and the initial enthusiasm for machine-oriented theorem provers waned. By human-oriented theorem provers, we mean systems that model human proof-search behaviour and are based on plausible rather than on demonstrative reasoning, a distinction introduced by Pólya. While search in the first two approaches (machine-oriented and interactive) is done in the space of demonstrative reasoning, that is, in some logical calculus, humans normally search for proofs in a more abstract space using plausible reasoning. Ideally the object language in human-oriented theorem proving would be the mathematical colloquial language. Since this would cause many problems in natural language understanding, normally the object language is a formal language that sufficiently approximates the mathematical colloquial language.

Proofs generated by plausible reasoning can normally be translated to low-level formal steps that can be fed into a proof checker (e.g., one based on some variant of the natural deduction calculus). While the strength of traditional theorem provers essentially relies on fixed domain-independent search strategies, the deductive power of human-oriented systems mainly resides in (user-written) domain-specific heuristics.

Both aspects of theorem proving are of extra-ordinary importance and a large body of mathematics (also that of Newton, Leibniz, and Euler) started by inspirational work that was logically not justified. (Indeed ``rigorous mathematics as a going concern is a rather rare phenomenon in the 4000-year-history of mathematics'' [KMH97, p.17].) This does not mean that rigorous proofs are not worth striving for, on the contrary an informal proof can be considered as a major step towards a formal proof. Indeed it is useful to view proofs ranging on a very wide spectrum of vague proof ideas over concrete proof plans to fully expanded formal proofs in a logical calculus.

There are different attempts to model human-oriented theorem proving, e.g., some based on expert systems or learning systems. The most prominent one is based on the idea that plausible reasoning is a planning process. Consequently this approach is called proof planning. It was first introduced by Bundy [Bun88] and is described in more detail below.


2 State of the Art - Proving as Planning

The most prominent attempt to separate levels in proving is the proof planning approach [Bun88]. The basic idea is to distinguish two levels: a planning level, which can be seen as the level of plausible reasoning, and a plan execution and verification level that corresponds to demonstrative reasoning. By using abstraction in the planning process a wide spectrum reaching from vague proof ideas over concrete plans to calculus level proofs can be realised.

In proof planning a plan operator is called method and consists of a tactic (in the sense of proof checkers) and a declarative specification. The declarative specification is used to construct plans and the tactic to execute them. In the plan approach the search space for a proof plan is often considerably smaller than the search space on the calculus level. This is partly due to domain specific heuristic knowledge, partly due to abstraction. The price paid is that the execution of proof plans may fail. However, in many cases even failed proof plans contain useful information that can either be adapted automatically to an executable proof plan or that can at least be communicated to the user (or another system) to provide useful hints for their proof search. This process makes it possible to convey ideas rather than ultimate proofs.

To a first approximation, planning for proofs can be done by applying techniques from the planning field, e.g., hierarchical planning. Methods can be very general, for instance corresponding to one application of a rule at the calculus level. More specific methods comprise proof ideas, for instance, to use a homomorphy property, to apply mathematical induction or to use diagonalisation. The most specific methods consist of full proofs for specific problems. In the Omega-MKRP proof planner, methods can typically be formulated in terms of proof schemata, that is partial proofs, in which expressions may be replaced by meta-variables.

A successful realisation of the plan-based approach is the OYSTER-CLAM-system [BvHHS90], where tactics of the OYSTER interactive theorem prover have been extended to methods (CLAM is the proof planner built on top of OYSTER). A method there can be viewed as a unit consisting of a procedural tactic and a declarative specification. CLAM has been applied in particular to problems from the field of mathematical induction and a large body of the heuristic knowledge of the Boyer-Moore prover [BM79] is encoded in CLAM methods. This makes CLAM a strong mathematical induction theorem prover. That approach relies heavily on the carefully designed methods, which on the one hand have to be general enough to cover a large class of instances, but on the other hand special enough to avoid the general combinatorial problems of traditional theorem provers. The process of developing methods is normally done manually and requires a lot of expertise.

Since the working language and the proof formalism of proof planning systems are usually human-oriented, proof planners can easily support user interactions in the spirit of a proof checker. The plan-based framework is particularly well-suited to encode domain-specific problem-solving knowledge. In the current form proof planning and machine-oriented theorem proving should not be considered as exclusive alternatives but rather as supplementary approaches, the first is particularly good in finding global (and more detailed) proof ideas and the latter to fill remaining gaps.

In the rest of this section the proof planning approach taken in Omega-MKRP will be described in more detail. For a detailed description of the other features of Omega-MKRP see [HKK+94,BCF+97].

The concept of a method is central to the reasoning process, since methods are the basic units which make up proof plans and are carried out to obtain the proof by bridging the gap between premises and conclusions. The set of all available methods constitutes the basic reasoning repertory of a reasoner, it is constantly enlarged and enriched with increasing experience. As a first approximation the process of learning mathematical reasoning by a student can be considered as the acquisition of new methods rather than the adoption of a formal calculus. While certain methods are specific to a certain field (e.g., in analysis the methods to prove continuity of real-valued functions by the E-D-criterion), others are domain-independent (e.g., indirect proof). A possible formalisation of the theorem proving process can be considered as a planning process of combining methods to build high-level proofs.

In a more sophisticated creative stage, given methods are not only applied and combined to build more or less complicated proofs, but existing methods are adapted to situations for which no particularly well-suited methods are available. This is fundamentally different from the logicistic view where a sound and complete calculus is assumed. There is no need to augment the calculus rules, but the theoretically complexity results are devastating.

In order to model aspects of this creative process we have adopted a declarative approach to the representation of methods. Since it can be pretty long winded (and hence inadequate) to represent all aspects of methods in a purely declarative form, we formalise a method as a 6-tuple with the following components, where the last one is procedural:

The work proposed here should be understood in the setting of a computational model that encompasses the entire process of theorem proving, from the analysis of a problem up to the completion of a proof, as an interleaving process of proof planning, method execution, and verification. In particular, this model ascribes a problem solver's reasoning competence to the existence of methods together with a planning mechanism that uses these methods for proof planning. The goal of proof planning is to fill gaps in a given partial proof tree by forward and backward reasoning [HKKR94]. Thus from an abstract point of view the planning process is the process of exploring the search space of planning states generated by the plan operators in order to find a complete plan (i.e., a sequence of instantiated plan operators) from a given initial state to a terminal state.

A planning state contains a set of proof arguments formulated in Gentzen's natural deduction (ND for short) calculus. Since we adopted a linearised version of ND proofs as introduced in [And80], the arguments consist of proof lines. The planning state for a partial proof corresponds to just those lines that form the boundaries of gaps in the proof. This subset can be divided into open lines (that must be proved to bridge the gaps) and support lines (that can be used as premises to bridge them). The initial planning state consists of all lines in the initial problem; the assumptions are the support lines and the conclusion is the only open line. The terminal planning state is reached when there are no more open lines in the planning state.
 
 


2.1 Learning in Theorem Proving

There are several approaches to apply methods of machine learning to theorem proving. Roughly they can be classified as machine-oriented or human-oriented according to the underlying automated reasoning. The first attempts to learn which calculus level rule should be applied next in a certain proof situation. This is done for instance by training a neural net with comparatively easy examples and then applying this to a more difficult problem [SE90].

The human-oriented approach models the human process of adapting proofs by analogy [KdlT92,Owe90,Mel95], by abstraction or generalisation [BGVW93,Gre88,NL95,DW88], or by case-based reasoning [Kol93]. Related to that are concept formation techniques [Len83].
 


3 Description of the Proposal

Pólya [Pól45] gave a four step recipe for solving mathematical problems and underpins that by a multitude of examples, some reaching from a detailed description how to proceed (for instance, constructing geometric objects by two loci), some hard to formalise (for instance, if a problem is too hard, start with a simpler similar problem before solving the original one). Newell [New81] discussed the relevance of Pólya's advice for AI and concludes that it is very relevant, but that AI did not manage to implement it (up to 1981, but we would add up to now). The main problem is that Pólya's advice is mainly example-based and it is highly non-trivial to implement it.

We believe that proof planning offers the framework for attacking at least some of Pólya's suggestions. In the context of this research proposal we would like to concentrate on the following three questions:

1.
How can solutions for one problem be made usable for the solution of related problems?
2.
How can a proof planner learn?
3.
How can a semantic analysis based on examples be integrated in the proof planning framework?



3.1 Plan of Work

For all these questions we see a clear starting point, but they also have an open-ended character. Not all the problems can be dealt with in this project in depth and we'll have to concentrate on those points that turn out to be most fruitful. In the context of first question we have worked for quite a while, so we am confident that the research in this direction will be successful. Point 1 offers a firm base of research, the more vague points 2 and 3 would be investigated in parallel. If the approaches to the latter turn out not to be fruitful, investigation to point 1 offer rich enough a domain and we would concentrate on this question. The work proposed aims at investigating these three problems. In the following we discuss each question and the way we want to address it in more detail.
 


First Question:

How can solutions for one problem be made usable for the solution of related problems?

First steps towards the solution of the first question were addressed by previous work we carried out in collaboration with colleagues from the Universität des Saarlandes in Saarbrücken, Germany. The declarative representation of methods allows for a mechanical adaptation of methods by so-called meta-methods.

The work carried out so far will be extended as follows:

1.
We did not build up a large body of methods that would cover a certain area of mathematics, but concentrated on representation issues. The first part of the investigation will consist of pinpointing an application area suitable for formalisation. A strong candidate would be high-school analysis, for which algorithmic knowledge how to solve certain problems is known. Note that although this area seems not to be very hard, in general these problems are not in the scope of general purpose theorem provers.
2.
So far the expressive power of the declarative part in form of proof line schemata is limited (in particular, it is not possible to use recursion and to express something like ``eliminate all quantifiers''). Building up a large body of methods in a certain field (as proposed in point 1.) will show how limiting this restriction is and give hints whether the framework has to be extended. There is a tradeoff between expressivity on the one hand and simplicity of the adaptation process on the other hand.
3.
The question what would form a comprehensive set of meta-methods to adapt existing methods is not addressed. While step 1. should be a matter of industriousness it is prerequisite for finding meta-methods. Our previous work gives some hope that the meta-methods are domain independent. This has, however, to be tested in a broader context.
4.
The two most interesting questions are not addressed in previous work: Firstly, when to interrupt normal proof planning in order to create a new method and secondly, which meta-method and method pair should be selected for adaptation. This problem is hard since the possibility to create new methods may dramatically increase the search space for a proof plan: not only existing operators but potentially many more may play a rôle. Hence new methods must be created very restrictively. In order to address the first question we need an estimate of whether the planning process is making progress or not. In order to address the second question we can, as a heuristic, organise methods in a hierarchy of mathematical theories and prefer methods that belong to the same theory as the current problem or whose theory is close to that of the problem in the hierarchy. Furthermore we can use general conflict solving strategies like those of OPS5 to favour methods and meta-methods with the most specific specification. A more sophisticated approach will be to prune the search space by using procedures described in the Third Question below.
There is a fundamentally different approach to the problem of reusing problem solving knowledge. Rather than adapting the methods so that they are applicable to the problem at hand, you can reformulate the problem so that it is solvable by the existing methods. In collaboration with a former colleague we worked on this approach as well [KP96], however, there are serious problems that remain unsolved. To me the most puzzling problem is the notion of a reformulation in itself. A naive definition is not satisfying. If we have a problem G |= A in which A is mathematical theorem, we could define a reformulation as a problem G' |= A' that logically entails the original problem. If, however, the first problem relation holds, any problem, in particular Ø |= true would be a reformulation. There is certainly a loss of information in this process, in particular it is not possible to compute the solution for G |= A from the solution for Ø |= true hence in order to be useful reformulations have to transform not only the problems but also the solutions. Another main problem is when a reformulation is useful at all. In [KP96] we have investigated this problem and looked at particular examples. For some we could present ad-hoc arguments why the complexity of the reformulated problems was lower than that of the original ones. Recently we learned about Kolmogorov complexity [LV97]. Extensions of this concept may be transferable to the notion of reformulation and help to clarify complexity questions and answer the question of the usefulness of reformulations more systematically for whole classes of problems (surely not for all problems, however).


Second Question:

How can a proof planner learn?
The field of machine learning offers a wide variety of techniques for producing adaptive behaviour ranging from neural nets, genetic algorithms over reinforcement learning to inductive logic programming (There is a vast amount of literature in this area and many techniques are described in standard AI text books, e.g., [RN95], as well as in particular text books). Although it is possible to produce fascinating results with these techniques, applications in the area of mechanised reasoning are limited (see above) and how humans learn to solve mathematical problems is hardly understood at all.

Learning plays a crucial rôle in mathematical work. Learning about a domain, about concepts in a domain, discovering interesting problems in the domain, improving the search behaviour, discovering new methods, new theorems and proofs as well as new applications all happen concurrently. In this research we shall investigate the improvement of the search behaviour and the discovery of new methods.

Many approaches try learning from scratch. We don't think that this approach is adequate for coming up with a strong proof plan based reasoning system. To expect that an algorithm produces well-designed methods (i.e., proof plan operators) for proof planning, would confront it with the task of finding problem solving knowledge that has been generated over many years by implementers of automated reasoning systems as well as by many mathematicians. It is more promising to start with a strong system and to improve it. This part of the research is still partly vague and we shall give only some ideas in the following.

A very low level type of learning can be based on the automatic adaptation of utility measures for methods. In order to improve the search behaviour of the proof planner we have manually added such a measure, called rating, in each of the Omega-MKRP methods. It is represented in form of a number between 0 and 1 and essentially gives a preference order among methods that are applicable at the same time. Reinforcement learning techniques can be used to adapt these values to a problem domain automatically.

Since the values of the rating are just absolute numbers (independent of the problem) only a very coarse representation of the utility for a particular situation is given. It is more adequate to consider the rating as a function whose values are known for special cases only and that can be learned from these cases. This may require learning new meta-level concepts: types of contexts which favour different methods. Learning could be based on methods used in explanation-based learning or inverse logic programming. Of course the results in such a space would be considerably more useful, but much harder to learn. To start it would be necessary to find an adequate representation of the rating functions.

Recent work in classifier systems [Wil95] has shown that it may be useful to distinguish two different measures, one for the expected gain of applying a rule and one of the accuracy of the prediction of this gain. It may well be that a similar distinction is also valuable in the context of proof planning.


Third Question:

How can a semantic analysis based on examples be integrated in the proof planning framework?
If we look again at Pólya's heuristic advice [Pól45] to solve problems, we see that it is almost exclusively based on examples. This makes it hard to integrate it in theorem proving program (while it is relatively easy for humans to understand and use). A direct way of using examples is to find counter-examples to conjectures. This was successfully applied in geometrical theorem proving [Gel59]. It is also used in general purpose theorem proving, for instance, the Finder [Sla94] model generator was integrated into the Otter system [McC90] in order to implement a dynamic variant of the set-of-support strategy [SLM94a].

To our knowledge, the application of example-based methods has not been investigated in proof planning but might turn out to be more fruitful than in machine-oriented theorem proving that heavily relies on speed rather than sophistication. It seems to be very much in line with the general philosophy of proof planning, namely, to model the human proof search behaviour. Humans normally make heavy use of examples to get an intuition of problems [Kol93].

A straightforward way of applying examples is to restrict the application of methods in such a way that it is consistent with the examples. For instance, if M is a model (i.e., an example) of the theorem Th and Meth a method for simplifying the theorem to Th' then M must be a model of Th' as well, otherwise Meth will not be applied. One line of research would be to use this approach for restricting the generation of new methods by meta-methods as well.

A major question is how to get and to represent examples. In general there are two approaches. In the one, examples are created by a model-generator, in the other, there is a carefully selected example base with typical examples and counter-examples to concepts (in collaboration, we have done some research on the concept of typicality in previous work [KMS93]). Human mathematicians seem to follow both approaches, in particular text books normally contain carefully selected examples for clarification.

More vague is the idea to use examples in the way of explanation-based learning, that is, to generate proof plans from concrete considerations in one particular model.


4 Diagrammatic Workplan

The first six months will be mainly devoted to a literature research, research for examples and to get started.

Month First Question Second Question Third Question
1-6 Starting to building up large body of methods. Theoretical development of the notion of reformulation and complexity. Starting automatic adaptation of utility measures for methods Representation of examples
7-12 experiments with the adaptation property. Building up a set of meta-methods investigation of more sophisticated forms of rating Building up an example base and/or integration of a model generator
13-18 Further experiments with the adaptation of methods. Building up a set of meta-methods. If necessary change of the method language. Experiments with problem reformulation. Learn for two separate measures: Utility of the application and accuracy of the prediction Integration of examples into proof planning.
19-24 Completing the basic set of methods and meta-methods. Experiments on the automation of planning with meta-methods. Calculation and/or estimation of problem complexities for reformulations. Calculation of measures for newly generated methods (rather than just taking it over) Investigation in typical examples, multiple examples
25-30 Further experiments on the automation of planning with meta-methods. Integration of learned utility and accuracy measures in the automated planning process. Learning from failures. Generalisation from examples
31-36 Completion of a planner with meta-methods. Integration of reformulations into planning. Learning the utility of meta-methods Generalisation from examples


References

And80
Peter B. Andrews. Transforming matings into natural deduction proofs. In Wolfgang Bibel and Robert Kowalski, editors, Proceedings of the 5th CADE, Les Arcs, , 1980. Springer Verlag, Berlin, LNCS 87.

BCF+97
Christoph Benzmüller, Lassaad Cheikhrouhou, Detlef Fehrer, Armin Fiedler, Xiaorong Huang, Manfred Kerber, Michael Kohlhase, Erica Melis, Andreas Meier, Wolf Schaarschmidt, Jörg Siekmann, and Volker Sorge. : Towards a mathematical assistant. In William McCune, editor, Proceedings of the 14th CADE, pages 252-255, Townsville, , 1997. Springer Verlag, Berlin, LNAI 1249.

BGVW93
Alan Bundy, Fausto Giunchiglia, Adolfo Villafiorita, and Toby Walsh. Gödel's incompleteness theorem via abstraction. IRST-Technical Report 9302-15, , Povo, Trento, 1993.

BM79
Robert S. Boyer and J Strother Moore. A Computational Logic. Academic Press, New York, USA, 1979.

Bun88
Alan Bundy. The use of explicit plans to guide inductive proofs. In Ewing Lusk and Ross Overbeek, editors, Proceedings of the 9th CADE, pages 111-120, Argonne, Illinois, USA, 1988. Springer Verlag, Berlin, LNCS 310.

BvHHS90
Alan Bundy, Frank van Harmelen, Christian Horn, and Alan Smaill. The Osyter-Clam System. In Mark E. Stickel, editor, Proceedings of the 10th CADE, pages 647-648, Kaiserslautern, 1990. Springer Verlag, Berlin, LNAI 449.

DW88
M.R. Donat and L.A. Wallen. Learning and applying generalised solutions using higher order resolution. In Ewing Lusk and Ross Overbeek, editors, Proceedings of the 9th CADE, pages 41-60, Argonne, Illinois, USA, 1988. Springer Verlag, Berlin, LNCS 310.

Gel59
H. Gelernter. Realization of a geometry theorem-proving machine. In Proceedings of the International Conference on Information Processing, UNESCO, 1959.

Gre88
Russell Greiner. Abstraction-based analogical inference. In D.H. Helman, editor, Analogical Reasoning, pages 147-170. Kluwer Academic Press, Dordrecht, 1988. In Series: Perspectives of Artificial Intelligence, Cognitive Science, and Philosophy.

HKK+94
Xiaorong Huang, Manfred Kerber, Michael Kohlhase, Erica Melis, Dan Nesmith, Jörn Richts, and Jörg Siekmann. Omega-MKRP: A proof development environment. In Alan Bundy, editor, Proceedings of the 12th CADE, pages 788--792, Nancy, 1994. Springer Verlag, Berlin, Germany, LNAI 814.

HKKR94
Xiaorong Huang, Manfred Kerber, Michael Kohlhase, and Jörn Richts. Adapting methods to novel tasks in proof planning. In Bernhard Nebel and Leonie Dreschler-Fischer, editors, KI-94: Advances in Artificial Intelligence -- Proceedings of KI-94, 18th German Annual Conference on Artificial Intelligence, pages 379--390, Saarbruecken, Germany, 1994. Springer Verlag, Berlin, Germany, LNAI 861.

HKRS94
Xiaorong Huang, Manfred Kerber, Jörn Richts, and Arthur Sehn. Planning mathematical proofs with methods. Journal of Information Processing and Cybernetics, EIK, 30(5-6):277--291, 1994.

KdlT92
Christoph Kreitz and Thierry de~la Tour. Building proofs by analogy via the curry-howard isomorphism. In A.~Voronkov, editor, Proceedings of LPAR, pages 202--213. Springer Verlag, Berlin, Germany, LNAI 624, 1992.

KMH97
Israel Kleiner and Nitsa Movshovitz-Hadar. Proof: A many-splendored thing. Mathematical Intelligencer, 19(3):16--26, 1997.

KMS93
Manfred Kerber, Erica Melis, and Jörg Siekmann. Analogical reasoning based on typical instances. in Proceedings of the IJCAI-Workshop on Principles of hybrid reasoning and representation, July 1993. Chambery, France.

Kol93
Janet Kolodner. Case-Based Reasoning. Morgan Kaufmann Publishers, San Mateo, CA, USA, 1993.

KP96
Manfred Kerber and Axel Praecklein. Using tactics to reformulate formulae for resolution theorem proving. Annals of Mathematics and Artificial Intelligence, 18(2-4):221--241, 1996.

Len83
Douglas B. Lenat. Eurisko: A program that learns new heuristics and domain concepts. Artificial Intelligence, 21:61--98, 1983.

LV97
Ming Li and Paul Vitanyi. An Introduction to Kolmogorov Complexity and Its Applications. Springer, New York, USA, second edition edition, 1997.

McC90
William McCune. Otter 2.0. In Mark E. Stickel, editor, Proceedings of the 10th CADE, pages 663--664, Kaiserslautern, Germany, 1990. Springer Verlag, Berlin, Germany, LNAI 449.

Mel95
Erica Melis. A model of analogy-driven proof-plan construction. In Chris S. Mellish, editor, Proceedings of the 14th IJCAI, pages 182--188, Montreal, Canada, 1995. Morgan Kaufmann, San Mateo, CA, USA.

New81
Allen Newell. The heuristic of George Pólya and its relation to artificial intelligence. Technical Report CMU-CS-81-133, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, Pennsylvania, USA, 1981. Also in Rudolf Groner, Marina Groner and Walter F. Bishoof, eds., Methods of Heuristics, Lawrence Erlbaum, Hillsdale, New Jersey, USA, pages 195--243.

NL95
P. Pandurang Nayak and Alon Y. Levy. A semantic theory of abstraction. In Proceedings of the 14th IJCAI, pages 196--202, Chris S. Mellish, 1995. Morgan Kaufmann, San Mateo, CA, USA.

Owe90
Stephen Owen. Analogy for Automated Reasoning. Academic Press, 1990.

Pól45
George Pólya. How to Solve It. Princeton University Press, Princeton, New Jersey, USA, also as Penguin Book, 1990, London, UK, 1945.

RN95
Stuart J. Russell and Peter Norvig. Artificial Intelligence -- A Modern Approach. Prentice Hall, Englewood Cliffs, New Jersey, USA, 1995.

SE90
Christian Suttner and Wolfgang Ertel. Automatic acquisition of search guiding heuristics. In Mark E. Stickel, editor, Proceedings of the 10th CADE, pages 470--484, Kaiserslautern, Germany, 1990. Springer Verlag, Berlin, Germany, LNAI 449.

Sla94
John Slaney. Finder: Finite domain enumerator--system description. In Alan Bundy, editor, Proceedings of the 12th CADE, pages 798--801. Springer Verlag, Berlin, Germany, Nancy 1994.

SLM94
John Slaney, Ewing Lusk, and William McCune. SCOTT: Semantically Constraint Otter -- System Description. In Alan Bundy, editor, Proceedings of the 12th CADE, pages 764--768, Nancy, France, 1994. Springer Verlag, Berlin, Germany, LNAI 814.

Wil95
Stewart W. Wilson. Classifier fitness based on accuracy. Evolutionary Computation, 3(2):149--175, 1995.


M.Jamnik@cs.bham.ac.uk