ABSTRACT
Oants is an agent-oriented environment for combining inference systems. A characteristics of the Oants approach is that a common proof object is generated by the cooperating systems. This common proof object can be inspected by verification tools to validate the correctness of the proof. Oants makes use of a two layered blackboard architecture, in which the applicability of inference rules are checked on one (abstract) layer. The lower layer administrates explicit proof objects in a common language. In concrete proofs these proof objects can be quite bit, which can make communication during proof search very inefficient. As a result we had situations in which most of the resources went into the overhead of constructing explicit proof objects and communicating between different components. Therefore we have recently developed an alternative modelling of cooperating systems in Oants which allows direct communication between related systems during proof search. This has the consequence that proof objects can no longer be directly constructed and thus the correctness-validation in this novel approach is in question. In this paper we present a pragmatic approach how this can rectified.
Oants is an agent-oriented environment for combining inference rules and inference systems. Oants was originally conceived to support interactive theorem proving but was later extended to a fully automated proving system [Sorge01phd,BeSo01]. A characteristics of the Oants approach is that a joint proof object is generated by the cooperating inference rules and inference systems. This joint proof object can be inspected by proof verification tools in combination with proof expansion in order to validate the correctness at a purely logic level. The Oants blackboard architecture consists of two layers, an abstract upper layer, and a more detailed lower layer. Applicability criteria for inference rules are modelled at the upper layer. The upper layer is supported by computations at the lower layer which models criteria for the instantiation of the parameters of the inference rules.
External systems have been modelled in Oants as individual inference rules at the upper layer. With this approach, Oants has been successfully employed in past experiments to check the validity of set equations using higher-order and first-order theorem provers, model generation, and computer algebra [BeJaKeSo01b]. However, this approach was very inefficient for hard examples because of the communication overhead imposed by the need to translate all steps into a common proof data structure.
Therefore, we have recently developed an alternative approach: the single inference rule approach of cooperating systems in Oants which exploits the lower layer of the blackboard architecture. This approach has been successfully applied to the combination of automated higher-order and first-order theorem provers. In particular, it has outperformed state-of-the-art first-order specialist reasoners (including Vampire 7.0) on 45 examples on sets, relations and functions; see [lpar05].
Unfortunately, using a single inference rule approach, we had to sacrifice the generation of joint proof objects and correctness validation in this novel approach. In this paper we present a pragmatic approach to how this can be rectified.
The paper is structured as follows: In Section 2 we motivate and illustrate the cooperation between a higher-order theorem prover (we employ Leo [C3]) and a first-order theorem prover (we employ Bliksem [Nivelle99]). In Section 3 we compare the two options for modelling cooperative reasoning systems in Oants: the initial multiple inference approach and the novel single inference rule approach. In Section 4 we show how a joint proof object can also be obtained for the latter modelling by mapping it back to the former. Section 5 concludes the paper.
While it has already been shown in [C3,T2] that many problems of this nature can be easily proved from first principles using a concise higher-order representation and the higher-order resolution ATP Leo [C3], the combinatorial explosion inherent in Leo's calculus prevents the prover from solving a whole range of possible problems with one universal strategy. Often higher-order problems require only relatively few but essential steps of higher-order reasoning, while the overwhelming part of the reasoning is first-order or even propositional level. This suggests that Leo's performance could be improved when combining it with a first-order ATP to search efficiently for a possible refutation in the subset of those clauses that are essentially first-order.
The advantages of such a combination are not only that many problems can still be efficiently shown from first principles in a general purpose approach, but also that problems can be expressed in a very concise way.
For instance, in [lpar05] we present 45 problems from the SET domain of the TPTP-v3.0.1, together with their entire formalisation in less than two pages, which is difficult to achieve within a framework that does not provide lambda -abstraction. We have used this problem set, which is an extension of the problems considered in [GanzingerStuber-03-cade], to show the effectiveness of our approach (cf. [lpar05]). While many of the considered problems can be proved by Leo alone with some strategy, the combination of Leo with the first-order ATP Bliksem [Nivelle99] is not only able to solve more problems, but also needs only a single strategy to prove them. Several of our problems are considered very challenging by the first-order community and five of them (of which Leo can solve four) have a TPTP rating of 1.00, saying that they cannot be solved by any TPTP prover to date.
Technically, the combination has been realised in the concurrent reasoning system Oants [Sorge01phd,BeSo01] which enables the cooperation of hybrid reasoning systems to construct a common proof object. In our past experiments, Oants has been successfully employed to check the validity of set equations using higher-order and first-order ATPs, model generation, and computer algebra [BeJaKeSo01b]. While this enabled a cooperation between Leo and a first-order ATP, the proposed solution could not be classified as a general purpose approach. A major shortcoming was that all communication of partial results had to be conducted via the common proof object, which was very inefficient for hard examples. Thus, the solved examples from set theory were considered too trivial, albeit they were often similar to those still considered challenging in the TPTP in the first-order context.
SET171+3 | FORALL X_{o alpha }, Y_{o alpha }, Z_{o alpha }. X union (Y intersection Z) = (X union Y) intersection (X union Z) |
SET611+3 | FORALL X_{o alpha }, Y_{o alpha }. ( X intersection Y = {} ) <-> ( X \ Y = X ) |
SET624+3 | FORALL X_{o alpha }, Y_{o alpha }, Z_{o alpha }. Meets ( X, Y intersection Z ) <-> Meets ( X, Y ) v Meets ( X, Z ) |
SET646+3 | FORALL x_{ alpha} , y_{ beta} . Subrel ( Pair ( x, y ), ( lambda u_{ alpha} . T ) x ( lambda v_{ beta} . T ) ) |
SET670+3 | FORALL Z_{o alpha }, R_{o beta alpha }, X_{o alpha }, Y_{o beta }. IsRelOn ( R, X, Y ) -> IsRelOn(RestrictRDom(R, Z),Z,Y) |
In [lpar05] we have presented our novel approach to the cooperation between Leo and Bliksem inside Oants by decentralising communication. As has been documented in [lpar05] this leads not only to a higher overall efficiency but also to a general purpose approach based on a single strategy in Leo.
We list some examples of the test problems on sets and relations (and functions) that have been investigated in [lpar05]. These test problems are taken from the TPTP against which we evaluated our system. The problems are given by the identifiers used in the SET domain of the TPTP, and are formalized in a variant of Church's simply typed lambda -calculus with prefix polymorphism. In classical type theory, terms and all their sub-terms are typed. Polymorphism allows the introduction of type variables such that statements can be made for all types. For instance, in problem SET171 in Table 1, the universally quantified variable X_{o alpha } denotes a mapping from objects of type alpha to objects of type o. We use Church's notation o alpha , which stands for the functional type alpha -> o. The reader is referred to [Andrews02] for a more detailed introduction. In the remainder, o will denote the type of truth values, and small Greek letters will denote arbitrary types. Thus, X_{o alpha } (and its a-longform lambda y_{ alpha} . Xy) is actually a characteristic function denoting the set of elements of type alpha , for which the predicate associated with X holds. As further notational convention, we use capital letter variables to denote sets, functions, or relations, while lower case letters denote individuals. Types are usually only given in the first occurrence of a variable and omitted if inferable from the context. Table 1 presents some examples of the test problems investigated in [lpar05].
These test problems employ defined concepts that are specified in a knowledge base of hierarchical theories that Leo has access to. Table 2 gives the concepts necessary for defining the above problems:
_{-} IN _{-} | := | lambda x_{ alpha} , A_{o alpha }. [ A x ] |
{} | := | [ lambda x_{ alpha} . F ] |
_{-} intersection _{-} | := | lambda A_{o alpha }, B_{o alpha }. [ lambda x_{ alpha} .x IN A & x IN B ] |
_{-} union _{-} | := | lambda A_{o alpha }, B_{o alpha }. [ lambda x_{ alpha} .x IN A v x IN B ] |
_{-} \_{-} | := | lambda A_{o alpha }, B_{o alpha }. [ lambda x_{ alpha} .x IN A v x NOT in B ] |
Meets (_{-},_{-} ) | := | lambda A_{o alpha }, B_{o alpha }. [ EXISTS x_{ alpha} .x IN A & x IN B ] |
Pair (_{-},_{-} ) | := | lambda x_{ alpha} , y_{ beta} . [ lambda u_{ alpha} , v_{ beta} .u = x & v = y ] |
_{-} x _{-} | := | lambda A_{o alpha }, B_{o beta }. [ lambda u_{ alpha} , v_{ beta} .u IN A & v IN B ] |
Subrel (_{-},_{-} ) | := | lambda R_{o beta alpha }, Q_{o beta alpha }. [ FORALL x_{ alpha} , y_{ beta} .R x y -> Q x y ] |
IsRelOn (_{-},_{-},_{-} ) | := | lambda R_{o beta alpha }, A_{o alpha }, B_{o beta }. [ FORALL x_{ alpha} , y_{ beta} .R x y -> x IN A & y IN B ] |
RestrictRDom (_{-},_{-} ) | := | lambda R _{o beta alpha }, A_{o alpha }. [ lambda x_{ alpha} , y_{ beta} . x IN A & R x y ] |
These concepts are defined in terms of lambda -expressions and they may contain other, already specified concepts. For presentation purposes, we use customary mathematical symbols union , intersection , etc., for some concepts like union, intersection, etc., and we also use infix notation. For instance, the definition of union on sets in Table 2 can be easily read in its more common mathematical representation A union B := {x | x IN A v x IN B}. Before proving a problem, Leo always expands -- recursively, if necessary -- all occurring concepts. This straightforward expansion to first principles is realised by an automated preprocess in our current approach.
Assumptions: | 1 | FORALL B, C, x. [x IN (B union C) <-> x IN B v x IN C] |
2 | FORALL B, C, x. [x IN (B intersection C) <-> x IN B & x IN C] | |
3 | FORALL B, C . [B = C <-> B SUBSET eq C & C SUBSET eq B] | |
4 | FORALL B, C . [B union C = C union B] | |
5 | FORALL B, C . [B intersection C = C intersection B] | |
6 | FORALL B, C . [B SUBSET eq C <-> FORALL x . x IN B -> x IN C] | |
7 | FORALL B, C . [B = C <-> FORALL x . x IN B <-> x IN C] | |
Proof Goal: | G | FORALL B, C, D . [B union (C intersection D) = (B union C) intersection (B union D)] |
We can observe that the assumptions provide only a partial axiomatisation of naive set theory. On the other hand, the specification introduces lemmata that are useful for solving the problem. In particular, assumption (7) is trivially derivable from (3) with (6). Obviously, clausal normalisation of this first-order problem description yields a much larger and more difficult set of clauses. Furthermore, definitions of concepts are not directly expanded as in Leo. It is therefore not surprising that most first-order ATPs still fail to prove this problem. In fact, very few TPTP provers were successful in proving SET171+3. Amongst them are Muscadet 2.4. [Pastre01], Vampire 7.0, and Saturate. The natural deduction system Muscadet uses special inference rules for sets and needs 0.2 seconds to prove this problem. Vampire needs 108 seconds. The Saturate system [GanzingerStuber-03-cade] (which extends Vampire with Boolean extensionality rules that are a one-to-one correspondence to Leo's rules for Extensional Higher-Order Paramodulation [C5]) can solve the problem in 2.9 seconds while generating 159 clauses. The significance of such comparisons is clearly limited since different systems are optimised to a different degree. One noted difference between the experiments with first-order provers listed above, and the experiments with Leo and Leo-Bliksem is that first-order systems often use a case tailored problem representation (e.g., by avoiding some base axioms of the addressed theory), while Leo and Leo-Bliksem have a harder task of dealing with a general (not specifically tailored) representation. Thus, the comparison of the performance of Leo and Leo-Bliksem with first-order systems as done in [lpar05] is unfair: the higher-order systems attack harder, non-tailored problems. Nevertheless, as we demonstrated by the performance results in [lpar05] the higher-order systems still perform better.
For the experiments with Leo and the cooperation of Leo with the first-order theorem prover Bliksem, lambda -abstraction as well as the extensionality treatment inherent in Leo's calculus [J5] is used. This enables a theoreticallyFor pragmatic reasons, such as efficiency, most of Leo's tactics are incomplete. Leo's philosophy is to rely on a theoretically complete calculus, but to practically provide a set of complimentary strategies so that these cover a broad range of theorems. Henkin-complete proof system for set theory. In the above example SET171+3, Leo generally uses the application of functional extensionality to push extensional unification constraints down to base type level, and then eventually applies Boolean extensionality to generate clauses from them. These are typically much simpler and often even propositional-like or first-order-like (FO-like, for short), that is, they do not contain any `real' higher-order subterms (such as a lambda -abstraction or embedded equations), and are therefore suitable for treatment by a first-order ATP or even a propositional logic decision procedure.
(6) [ ( B x v ( C x & D x ) ) <-> ( ( B x & C x ) v ( C x & D x ) ) ]^{ F} (11) []
(1) FORALL B, C, D.B union ( C intersection D ) = ( B union C ) intersection ( B union D ) | ||
v | clause initialization | |
(2) [ FORALL B, C, D.B union ( C intersection D ) = ( B union C ) intersection ( B union D ) ]^{F} | ||
v | def.-expansion, cnf | |
v | B, C, D Skolem const. | |
(3) [ ( lambda x.B x v ( C x & D x ) ) = ( lambda x. ( B x & C x ) v ( C x & D x ) ) ]^{F} | ||
v | unification constraint | |
(4) [ ( lambda x.B x v ( C x & D x ) ) =^{?} ( lambda x. ( B x & C x ) v ( C x & D x ) ) ]^{ } | ||
v | f-extensionality | |
v | x new Skolem constant | |
(5) [ ( B x v ( C x & D x ) ) = ^{?} ( ( B x & C x ) v ( C x & D x ) ) ]^{ } | ||
v | B-extensionality | |
v | cnf, factor., subsumption | |
(7) [ B x ]^{ F} | ||
(8) [ B x ]^{ T} v [ C x ]^{ T} | propositional problem! | |
(9) [ B x ]^{ T} v [ D x ]^{ T} | ||
(10) [ C x ]^{ F} v [ D x ]^{ F} | ||
v | propositional reasoning | |
Table 4 illustrates how Leo tackles and solves the test problem SET171+3. First the resolution process is initialised, that is, the input problem (1) is turned into a negated unit clause (2). Then the definitions occurring in this input problem are expanded, that is, completely reduced to first principles. The resulting (not displayed intermediate) clause is not in normal form and therefore Leo first normalizes it with explicit clause normalisation rules (cnf) to reach some proper initial clauses. In our concrete case, this leads to the unit clause (3). Note that negated primitive equations are generally automatically converted by Leo into unification constraints. This is why (3) is automatically converted into (4), which is a syntactically not solvable, but is a semantic unification problem. Observe, that we write [.]^{T} and [.]^{F} for positive and negative literals, respectively. Leo then applies its goal directed functional and Boolean extensionality rules which replace the unification constraint (4) by the clauses (5) and (6). Unit clause (6) is again not normal; normalisation, factorisation and subsumption yields the clauses (7)-(10). This set is essentially of propositional logic character and trivially refutable. Leo needs 0.56 seconds for solving the problem and generates a total of 36 clauses.
The result of the case study performed in [lpar05] is: The Leo-Bliksem cooperation impressively outperforms both state-of-the art first-order specialists (including Vampire 7.0) and the non-cooperative Leo system.
In the next section we describe in more detail how the cooperative proof search approach between Leo and the first-order prover Bliksem has been modelled in Oants.
The architecture consists of two layers. On the lower layer, possible instantiations of the arguments of individual inference rules are computed. In particular, each inference rule is associated with its own blackboard and concurrent processes, one for each argument of the inference rule. The role of every process is to compute possible instantiations for its designated argument of the inference rule, and to record these on the blackboard. The computations are carried out with respect to the given proof context and by exploiting information already present on the blackboard, that is, argument instantiations computed by other processes. On the upper layer, the information from the lower layer is used for computing and heuristically ranking the inference rules that are applicable in the current proof state. The most promising rule is then applied to the central proof object and the data on the blackboards is cleared for the next round of computations.
Oants employs resource reasoning to guide search.Oants provides facilities to define and modify the processes at run-time. But notice that we do not use these advanced features in the case study presented in this paper. This enables the controlled integration (e.g., by specifying time-outs) of full-fledged external reasoning systems such as automated theorem provers, computer algebra systems, or model generators into the architecture.
INFER [Leo (LEO-parameters)]HO-goal
A process can then place an open goal on the blackboard, where it is picked up by a
process that applies the Leo prover to it. Any computed proof from the external system
is again written to the blackboard from where it is subsequently inserted into the proof
object when the inference rule is applied. While this setup enables proof construction by
a collaborative effort of diverse reasoning systems, the cooperation can only be achieved
via the central proof object. This means that all partial results have to be translated
back and forth between the syntaxes of the integrated systems and the language of the
proof object. For modelling the cooperation of Leo with a first-order reasoner we have
first experimented with the following multiple inference rule modelling
(see also [BeJaKeSo01b]):
INFER [Leo-with-partial-result (Leo-parameters)]HO-goalNeg-Conj-of-FO-clauses
INFER [Bliksem (Bliksem-parameters)]FO-goal
The first rule models a process that picks up higher-order proof problem from the
blackboard, passes it to Leo which starts its proof search, and then returns the
negated conjunction of generated first-order clauses back (e.g. the negated conjunction of the
clauses (7)-(10) in our previous example). For each modified bag of first-order like
clauses in Leo this rule may suggest a novel reduction of the original higher-order
goal to a first-order criterion.
Since there are many types of integrated systems, the language of the proof object maintained in Oants -- a higher-order language even richer than Leo's, together with a natural deduction calculus -- is expressive but also cumbersome. This leads not only to a large communication overhead, but also means that complex proof objects have to be created, even if the reasoning of all systems involved is clause-based. Large clause sets need to be transformed into large single formulae to represent them in the proof object; the support for this in Oants to date is inefficient. Consequently, the cooperation between external systems is typically rather inefficient [BeJaKeSo01b].
In order to overcome the problem of the communication bottleneck described above, we devised a new method for the cooperation between a higher-order and a first-order theorem prover within Oants. Rather than modelling each theorem prover as a separate inference rule (and hence needing to translate the communication via the language of the central proof object), we model the cooperation between a higher-order (concretely, Leo) and a first-order theorem prover (in our case study Bliksem) in Oants as a single inference rule.
INFER [Leo-Bliksem (Leo-partial-proof, FO-clauses, FO-proof, Leo-parameters, Bliksem-parameters )]HO-goal
The communication between the two theorem provers is carried out directly by the parameters of the inference rule and not via the central proof object. This avoids translating clause sets into single formulae and back.
Concretely, the single inference rule modelling the cooperation between Leo and a first-order theorem prover needs the following arguments to be applicable: (1) an open higher-order proof goal, (2) a partial Leo proof, (3) a set of FO-like clauses in the partial proof, (4) a first-order refutation proof for the set of FO-like clauses, and (5) and (6) the usual flag-parameters for the theorem provers Leo and Bliksem. Each of these arguments is computed, that is, its instantiation is found, by an independent process. The first process finds open goals in the central proof object and posts them on the blackboard associated with the new rule. The second process starts an instance of the Leo theorem prover for each new open goal on the blackboard. Each Leo instance maintains its own set of FO-like clauses. The third process monitors these clauses, and as soon as it detects a change in this set, that is, if new FO-like clauses are added by Leo, it writes the entire set of clauses to the blackboard. Once FO-like clauses are posted, the fourth process first translates each of the clauses directly into a corresponding one in the format of the first-order theorem prover, and then starts the first-order theorem prover on them. Note that writing FO-like clauses on the blackboard is by far not as time consuming as generating higher-order proof objects. As soon as either Leo or the first-order prover finds a refutation, the second process reports Leo's proof or partial proof to the blackboard, that is, it instantiates argument (2). Once all four arguments of our inference rule are instantiated, the rule becomes applicable and its application closes the open proof goal in the central proof object. That is, the open goal is proved by the cooperation between Leo and a first-order theorem prover. When computing applicability of the inference rule, the second and the fourth process concurrently spawn processes running Leo or a first-order prover on a different set of FO-like clauses. Thus, when actually applying the inference rule, all these instances of provers working on the same open subgoal are stopped.
While in the previous approach with multiple inference rules the cooperation between Leo and Bliksem was modelled at the upper layer of the Oants architecture, our new approach models their cooperation by exploiting the lower layer of the Oants blackboard architecture. This is not an ad hoc solution, but rather, it demonstrates Oants's flexibility in modelling the integration of cooperative reasoning systems.
Our approach to the cooperation between a higher-order and a first-order theorem prover has many advantages. The main one is that the communication is restricted to the transmission of clauses, and thus it avoids any intermediate translation into the language of the central proof object. This significantly reduces the communication overhead and makes effective proving of more involved theorems feasible.
A disadvantage of our approach is that we cannot easily translate and integrate the two proof objects produced by Leo and Bliksem into the central proof object maintained by Oants. This has been possible in our previous approach with multiple inference rules. Thus, we developed a simple and pragmatic solution to the problem:
In this paper we have discussed the difference between two forms of modelling cooperating proof systems within Oants: the multiple inference rule approach and the single inference rule approach. In previous experiments the latter has been shown as highly efficient and it has outperformed state-of-the-art first-order specialist reasoners on 45 examples on sets, relations and functions; cf. [lpar05]. The drawback so far, however, was that no joint proof object could be generated. In this paper we have reported how we have solved this problem by simply mapping the single inference rule modelling back to the multiple inference rule modelling. Related to our approach is the Techs system [DeFu99], which realises a cooperation between a set of heterogeneous first-order theorem provers. Similarly to our approach, partial results in Techs are exchanged between the different theorem provers in form of clauses. The main difference to the work of Denzinger et al. (and other related architectures like [FisIre:mapp98]) is that our system bridges between higher-order and first-order automated theorem proving. Also, unlike in Techs, we provide a declarative specification framework for modelling external systems as cooperating, concurrent processes that can be (re-)configured at run-time. Related is also the work of Hurd [hurd2002e] which realises a generic interface between HOL and first-order theorem provers. It is similar to the solution previously achieved by Tramp [MeierCade00] in Omega, which serves as a basis for the sound integration of ATPs into Oants. Both approaches pass essentially first-order clauses to first-order theorem provers and then translate their results back into HOL resp. Omega. Some further related work on the cooperation of Isabelle with Vampire is presented in [DBLP:conf/cade/MengP04]. The main difference of our work to the related systems is that while our system calls first-order provers from within higher-order proof search, this is not the case for [hurd2002e,MeierCade00,DBLP:conf/cade/MengP04].
Future work is to investigate how far our approach scales up to more complex problems and more advanced mathematical theories. In less trivial settings as discussed in this paper, we will face the problem of selecting and adding relevant lemmata to avoid immediate reduction to first principles and to appropriately instantiate set variables. Relevant related work for this setting is Bishop's approach to selectively expand definitions as presented in [BiAn98] and Brown's PhD thesis on set comprehension in Church's type theory [Brown04].