The University of Birmingham

Birmingham B15 2TT, England

http://www.cs.bham.ac.uk/~mmk

**ABSTRACT**

In this contribution we propose a variant of a resolution theorem prover which selects resolution steps based on the proportion of models in which a newly generated clause holds, measured against a reference class of models generated from a subset of the initial clause set. Since the empty clause holds in no model, preference is given to clauses which hold in few models only. In order to avoid premature convergence, model-guided steps are interleaved with random steps. Computing the number of models is computationally expensive, but the numbers remain almost unchanged after the application of a single resolution step; we therefore adapt Kowalski's connection graph method and store the number of models at each link. We report on work in progress and cannot yet present experimental results.

The connection graph proof procedure was first introduced by Kowalski [Kowalski75]. The main idea is to explicitly represent all resolution possibilities as links in a graph between literals, where a link stores not only the applicability of a step but also the most general unifier for its application. Independently of open theoretical questions (like completeness, see [Eisinger91]), the method has not attracted much attention in up-to-date resolution theorem provers, since the computational overhead of explicitly representing the unifiers in a graph does not pay off compared to other techniques like indexing. In other words, the graph is used to store information which can be computed more cheaply from first principles. In particular, the heuristic choice of which link to resolve on makes only marginal use of the links.

Some work has been done to integrate semantic information into
resolution theorem proving. Slaney, Lusk, and McCune [SLM94]
developed the Scott-system which combines the model generator Finder
and the resolution theorem prover Otter. Put simply, Scott
extends the set-of-support strategy by a semantic satisfiability
check based on iterated model generation. This approach was shown
to yield significant improvements on some examples. Chu and
Plaisted [CP94] implemented a theorem prover using
*hyper-linking* techniques [LP92] combined with semantic guidance.

The main question in resolution theorem proving is which pair of
literals to select for resolution (or for paramodulation or for
factorisation). Optimal choices lead to shortest proofs. Since
first-order logic is undecidable it seems to be obvious that there is
no efficiently computable strategy which gets these choices right in
*all* cases. However, recent years have seen significant
progress in the solution of NP-hard problems by the combination of
heuristically guided choices and random choices (see, e.g.,
[Gomes98] and also the recent special issue of the Journal of
Automated Reasoning, Vol. 24, Nos.1-2, 2000). Our approach can be
seen as an attempt to use a similar method in resolution theorem
proving in which the heuristic guidance is provided by the cardinality of
models of clauses. It first generates a number of models from the given theory,
then it constructs a clause graph out of the set of all clauses, and
finally searches for a proof by interleaving random selection and
heuristic selection of a link. The heuristic selects the link that
is most promising in the sense that a minimal number of models
satisfies the resolvent it produces.

The main idea of the connection graph procedure is to represent the set of clauses for a given problem as a graph in which each node represents a literal. These literal nodes are grouped together to form the clauses (since a clause is a disjunction of literals). Any two complementary unifiable literals are connected by a link, which represents the unification possibility.
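This link structure can be sketched for the propositional case (a minimal, hypothetical encoding: clauses as sets of string literals with a leading "~" marking negation; in the first-order case each link would additionally carry the most general unifier):

```python
from itertools import product

def negate(lit):
    """Complement of a literal encoded as a string with "~" for negation."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def build_links(clauses):
    """All links between complementary literal occurrences; an endpoint is
    a (clause index, literal) pair."""
    links = []
    for (i, ci), (j, cj) in product(enumerate(clauses), repeat=2):
        if i < j:
            for lit in ci:
                if negate(lit) in cj:
                    links.append(((i, lit), (j, negate(lit))))
    return links

# {P, Q}, {~Q, R}, {~R}: links connect Q with ~Q and R with ~R.
clauses = [frozenset({"P", "Q"}), frozenset({"~Q", "R"}), frozenset({"~R"})]
links = build_links(clauses)
```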

When we select a link, we apply the resolution rule by first
removing the link, second creating the new clause, which inherits
literals from both parents, and third inserting the new links for
this new clause, where the new links are a subset of the links of the
parent clauses. For instance, resolving on the rightmost link of the
graph in Fig. 1 yields the clause graph (unifiers not displayed) in
Fig. 2.
The clause graph contains two literals,
R(x) and NOT R(y), which do not have any links to other literals;
they are so-called *pure literals*. Their clauses, together with
all links from literals in these clauses, can be removed from the
clause graph, since such clauses cannot have the empty clause among
their successors. The removal of links can affect literal nodes in
other clauses, so some neighbouring clauses may be removed as well.
In fact, one application of the resolution rule can cause a chain
reaction of further reductions, which in extreme cases leads to the
collapse of the whole graph; in that case the original problem was
satisfiable. This is the case for the graph in Fig. 2: removing the
two middle clauses, which contain the pure literals, results in the
graph in Fig. 3.

This graph contains two more pure literals, hence the corresponding clause can be removed, leaving just the pure one-literal clause NOT P(a), which can be removed as well.
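The pure-literal reduction cascade described above can be sketched as follows (a toy propositional illustration with hypothetical names: clauses as sets of string literals, links as pairs of (clause index, literal) endpoints):

```python
def purge_pure(clauses, links):
    """Repeatedly delete clauses containing a pure literal (a literal with
    no link), together with all links into them, until a fixpoint; if the
    whole graph collapses, the clause set was satisfiable."""
    clauses = dict(enumerate(clauses))            # index -> clause
    changed = True
    while changed:
        changed = False
        for i, c in list(clauses.items()):
            # literals of clause i that occur as a link endpoint
            linked = {lit for endpoint_pair in links
                      for (idx, lit) in endpoint_pair if idx == i}
            if any(lit not in linked for lit in c):   # pure literal found
                del clauses[i]
                links = [lk for lk in links
                         if lk[0][0] != i and lk[1][0] != i]
                changed = True
    return list(clauses.values()), links

# {P} -- {~P, Q}: Q is pure, so its clause goes; then P becomes pure too,
# and the whole graph collapses.
remaining, remaining_links = purge_pure(
    [frozenset({"P"}), frozenset({"~P", "Q"})],
    [((0, "P"), (1, "~P"))])
```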

The semantics of a set of first-order formulae is specified by a pair M=(D,I), consisting of a domain and an interpretation. The domain D is a nonempty set of objects; the interpretation I maps the terms of the language to values in the domain and formulae to truth values. We say M is a model of a formula (or a formula set) if the formula (or every formula in the set) evaluates to true.

Assume we have a first-order problem given by a set Gamma and a
theorem phi for which we want to show that phi follows from
Gamma (i.e., Gamma |= phi ). Furthermore we assume that the
formula set Gamma is consistent, that is, that it has
models. In resolution theorem proving phi is negated and it is
proved that Gamma `union` {NOT phi } is unsatisfiable, that is, has
no model.

For our procedure we make the standard assumption that the background
theory Gamma is consistent. Hence it has a model. Furthermore the
method as described below makes use of the assumption that there are
*finite* models for Gamma (since we make use of a model
generator which generates finite models). This is, however, an
assumption that can be relaxed if we proceed to a model generator
which can generate infinite models. Nevertheless, we believe that the
class of problems for which finite models suffice is rich enough to
deserve investigation.

Slaney's Finder system [Slaney94] is a model generator which
fixes the domain to a specific number of objects and then performs an
exhaustive search in the space of interpretations of the function
symbols and constants for those that satisfy the formula set in that
domain. We use Finder as a model generator to generate a set of models
**M** = { M_1, ..., M_n }.

Finder generates models by increasing the cardinality of the domain D. It is possible to give lower and upper bounds for the cardinality. We may either take all models found within a certain bound or limit the result to a certain number of models we are interested in. We will have to experiment to determine which number of models works best; 50-100 may be a typical value for n.
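For the propositional case, the effect of such a bounded enumeration can be illustrated by a brute-force search (a toy stand-in only; Finder itself enumerates interpretations of first-order function symbols over finite domains):

```python
from itertools import product

def enumerate_models(atoms, clauses, limit=None):
    """Enumerate truth assignments to the atoms and keep those that satisfy
    every clause, optionally stopping after `limit` models."""
    def holds(lit, m):
        return not m[lit[1:]] if lit.startswith("~") else m[lit]
    models = []
    for values in product([False, True], repeat=len(atoms)):
        m = dict(zip(atoms, values))
        if all(any(holds(lit, m) for lit in c) for c in clauses):
            models.append(m)
            if limit is not None and len(models) == limit:
                break
    return models

# Two of the four assignments to {P, Q} satisfy the single clause {P}.
ms = enumerate_models(["P", "Q"], [frozenset({"P"})])
```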

For instance, look at the following problem (where P stands for
parallel and Q for orthogonal; since orthogonal is symmetric,
parallel is as well):

FORALL x FORALL y ( P(x,y) <-> EXISTS z ( Q(x,z) & Q(z,y) )),
FORALL x FORALL y ( Q(x,y) -> Q(y,x) )

|= FORALL x FORALL y ( P(x,y) -> P(y,x) )

To prove the consequence relation by resolution we assume the first
set

Gamma = {(FORALL x FORALL y ( P(x,y) <-> EXISTS z ( Q(x,z) & Q(z,y) )))
& (FORALL x FORALL y ( Q(x,y) -> Q(y,x) )) }
and negate the second part phi = FORALL x FORALL y ( P(x,y) -> P(y,x) ).
By normalisation we get:

Gamma ' :

    NOT P(x,y) | Q(x,f(x,y))
    NOT P(x,y) | Q(f(x,y),y)
    P(x,y) | NOT Q(x,z) | NOT Q(z,y)
    NOT Q(x,y) | Q(y,x)

NOT phi ' :

    P(a,b)
    NOT P(b,a)

If we fix the interpretation domain to D = { 0,1 } , the models generated from Gamma ' are

| Models | M_1 | M_2 | M_3 | M_4 | M_5 | M_6 | M_7 | M_8 | ... | M_100 | M_101 | M_102 | ... | M_239 | M_240 |
|--------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-------|-------|-------|-----|-------|-------|
| P(0,0) | F | F | F | F | F | F | F | F | ... | T | T | T | ... | T | T |
| P(0,1) | F | F | F | F | F | F | F | F | ... | F | F | F | ... | T | T |
| P(1,0) | F | F | F | F | F | F | F | F | ... | F | F | F | ... | T | T |
| P(1,1) | F | F | F | F | F | F | F | F | ... | T | T | T | ... | T | T |
| Q(0,0) | F | F | F | F | F | F | F | F | ... | F | F | F | ... | T | T |
| Q(0,1) | F | F | F | F | F | F | F | F | ... | T | T | T | ... | T | T |
| Q(1,0) | F | F | F | F | F | F | F | F | ... | T | T | T | ... | T | T |
| Q(1,1) | F | F | F | F | F | F | F | F | ... | F | F | F | ... | T | T |
| f(0,0) | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | ... | 1 | 1 |
| f(0,1) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 1 | ... | 1 | 1 |
| f(1,0) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | ... | 1 | 1 |
| f(1,1) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | ... | 1 | 1 |
| a      | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | ... | 1 | 0 | 0 | ... | 1 | 1 |
| b      | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | ... | 1 | 0 | 1 | ... | 0 | 1 |

In total we obtained 240 models. The procedure is now set up to
count, for each possible step in the calculus (resolution,
paramodulation, factorisation), how many interpretations in **M**
are models of the resulting clause. The figure below shows the clause
graph labelled with semantic information; the most promising link
is the one between NOT P(b,a) and P(x,y) | NOT Q(x,z) |
NOT Q(z,y). (Be warned: the numbers are not yet machine-generated and
should therefore be treated with caution.)
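The counting step itself can be sketched for the propositional case (a toy illustration with hypothetical names: models as dictionaries from atoms to truth values, literals as strings with "~" for negation):

```python
def count_models(clause, models):
    """How many of the given interpretations satisfy the clause; the empty
    clause is satisfied by none, so a lower count means a step closer to
    the contradiction."""
    def holds(lit, m):
        return not m[lit[1:]] if lit.startswith("~") else m[lit]
    return sum(1 for m in models if any(holds(lit, m) for lit in clause))

models = [{"P": True, "Q": False}, {"P": False, "Q": False}]
n_pq = count_models(frozenset({"P", "Q"}), models)   # holds in one model
n_empty = count_models(frozenset(), models)          # empty clause: none
```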

When the link is resolved, a new clause NOT Q(b,z) | NOT Q(z,a) is created. The two literals are linked in the graph. The clause NOT P(b,a) now has a pure literal which does not have any links and can be removed. We display only the number of models for the four newly generated links.

Following a best-first search strategy, the resulting clause with the fewest models is the most promising, since it makes the steepest step towards the empty clause, which has no models at all. This simple form of best-first search is known to often end in local optima or on plateaus. On a plateau the heuristic measure is uninformative and does not tell us what to do next. As a way out, a mixture of heuristically guided steps and random steps has been applied successfully in other application areas: use heuristic guidance in x% of the steps and make a random choice in the remaining (100-x)%, where 0 <= x <= 100 will be fixed, for instance, to 40.

In summary, our procedure is as follows. If we want to show
Gamma |= phi with phi as theorem from which we generate the
set of support, we first generate a set of models from Gamma
with a model generator, concretely Finder. The next steps are to
construct a clause graph from the set of clauses of
Gamma `union` {NOT phi } , applying resolution rules to all connected
literals in the graph, producing resolvents, evaluating these in the
set of models and counting the number of models in which each
resolvent is true. We use the links in the clause graph to store the
number of models (in order not to have to recompute them in each
cycle). The selection of the link on which the next step is to be
performed follows a heuristic which is based on the number of models
in which the resolvent is true.

In pseudo code:

    INPUT: Gamma ;;; hypotheses
           phi   ;;; conclusion
    M := GenerateModels(Gamma);
    G := ConstructGraph(Gamma union {NOT phi});
    FOR EVERY link L in G
        determine the number N of models in M and insert Label(L,N);
    REPEAT
        Select(L,G);
        Res := Resolve(L);
        Insert(Res,G);
        FOR EVERY link L connected to Res
            determine the number N of models in M and insert Label(L,N);
        Simplify
    UNTIL empty clause derived OR G empty
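The loop can be rendered as a runnable propositional toy (a sketch under strong simplifications: links are recomputed instead of stored in a graph, there is no Simplify step, and all helper names are hypothetical):

```python
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2, lit):
    """Resolvent of c1 and c2 on the complementary pair lit / negate(lit)."""
    return frozenset((c1 - {lit}) | (c2 - {negate(lit)}))

def count_models(clause, models):
    def holds(lit, m):
        return not m[lit[1:]] if lit.startswith("~") else m[lit]
    return sum(1 for m in models if any(holds(lit, m) for lit in clause))

def prove(clauses, models, max_steps=100):
    """Repeatedly resolve on the link whose resolvent holds in the fewest
    reference models, until the empty clause is derived (unsatisfiable) or
    no link is left / the step budget is exhausted."""
    clauses = set(clauses)
    for _ in range(max_steps):
        links = [(c1, c2, l) for c1 in clauses for c2 in clauses
                 for l in c1 if c1 != c2 and negate(l) in c2]
        if not links:
            return False                 # no resolution possibility left
        best = min(links, key=lambda t: count_models(resolve(*t), models))
        res = resolve(*best)
        if not res:
            return True                  # empty clause derived
        clauses.add(res)
    return False
```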

An abstract view of the data flow can be found as a flow chart in Fig. 7. The following is a rough description of the whole system. The most important parts are the model generator, the initialising procedure, and the selection procedure. The model generator takes the input theory and generates models of it. The initialising procedure builds the initial clause graph with literal nodes and links; it calculates semantic plausibilities for all links and stores them in the links. The selection procedure chooses the link to operate on next.

It is a well-known phenomenon that simply following a best-first strategy can lead to local optima or plateaus, so that the global optimum (here, the empty clause with no models) may be missed. Furthermore it must be left to experiments to see whether the models generated by Finder provide good heuristic guidance. Even if that holds in general, there may still be phases in the proof search where the number of models does not differ between links, or where a significant number of links carry the same number (as in Fig. 5). In the first case the procedure produces no significant information; in the latter it cannot be used as a selection strategy, but only as a restriction strategy.

In order to avoid the problems with local optima and plateaus, we introduce a degree of randomness into the selection scheme. We want to test two different procedures. In the first approach we randomly decide between a random choice and a best choice (selecting a link with a minimal number of models), with an arbitrary bias towards the random or the best choice (see Fig. 8).

    INPUT: BIAS ;;; bias towards random choice
           G    ;;; a clause graph
    Randomise(n,100) ;;; select n randomly between 0 and 100
    IF 0 <= n <= BIAS
        THEN link := RandomChoice(G)
        ELSE link := BestChoice(G)
    ENDIF;
    RETURN link
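This first selection scheme can be sketched in Python (hypothetical representations: links labelled with their model counts in a dictionary, as in the clause graph; a draw from 0..100 not exceeding the bias triggers the random choice):

```python
import random

def select_link(links, model_counts, bias):
    """Either pick a random link or the link whose resolvent holds in the
    fewest models (ties broken arbitrarily by min), depending on a random
    draw against the bias."""
    if random.randint(0, 100) <= bias:
        return random.choice(links)
    return min(links, key=lambda l: model_counts[l])
```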

Alternatively we can choose links in a biased random way, where in
each selection step the chance of a link being selected is smaller
the larger its number of associated models is. More concretely,
let f(n) be a function that is strictly monotonically decreasing and
always greater than 0, for instance f(n) = 1/(n+1). For k links
l_{1},...,l_{k} with numbers of models n_{1},...,n_{k}, respectively,
the chance for link l_{m} to be selected can then be computed as

    f(n_{m}) / (f(n_{1}) + ... + f(n_{k}))
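This weighting can be sketched directly (hypothetical helper names; `random.choices` performs the roulette-wheel draw):

```python
import random

def f(n):
    """Strictly decreasing and always positive, as required: f(n) = 1/(n+1)."""
    return 1.0 / (n + 1)

def selection_probability(m, counts):
    """Chance of link l_m being drawn: f(n_m) / (f(n_1) + ... + f(n_k))."""
    return f(counts[m]) / sum(f(n) for n in counts)

def biased_choice(links, counts):
    """Draw one link; links carrying fewer models are more likely."""
    return random.choices(links, weights=[f(n) for n in counts], k=1)[0]

# A link with 0 models (weight 1) is twice as likely as one with 1 model
# (weight 1/2): probability 1 / (1 + 0.5) = 2/3.
p = selection_probability(0, [0, 1])
```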

We are currently implementing the whole procedure and want in particular to test the two selection procedures discussed above. An evaluation is still missing.

The performance of semantically guided resolution will probably depend
to a great degree on the availability of good models. Whether the
models generated by Finder are suitable or not has to be studied for
different example classes. Semantic guidance seems to be a standard
technique humans use when searching for proofs. Rather than
generating models from scratch, they often use *typical
examples*. It remains an open question whether and to which extent
this is necessary in our context. It may also be important to use
infinite models as investigated by Caferra and Peltier [CP98].

We are currently implementing our ideas and want to test the method when the implementation is completed. Our hope is that the two approaches, the traditional clause graph resolution method and model-based semantic guidance, will combine well. Furthermore we hope that the random element in the procedure will enable us to obtain a weak completeness result of the following form: since every possible derivation will eventually be made (because of the random element), for any unsatisfiable clause set the probability that the empty clause is derived within n steps converges to one as n tends to infinity.

- [CP98] Ricardo Caferra and Nicolas Peltier. Disinference rules, model building and abduction. In Ewa Orlowska, editor, *Logic at Work: Essays Dedicated to the Memory of Helena Rasiowa*, chapter 20, pages 331-353. Physica-Verlag, 1999.
- [CP94] Heng Chu and David A. Plaisted. Semantically guided first-order theorem proving using hyper-linking. In Alan Bundy, editor, *Proceedings of the 12th International Conference on Automated Deduction*, number 814 in Lecture Notes in Artificial Intelligence, pages 192-206. Springer-Verlag, 1994.
- [Eisinger91] Norbert Eisinger. *Completeness, Confluence, and Related Properties of Clause Graph Resolution*. Pitman, London, UK, 1991.
- [Gomes98] Carla P. Gomes, Bart Selman, and Henry Kautz. Boosting combinatorial search through randomization. In *Proc. AAAI-98*, Madison, Wisconsin, USA, 1998.
- [Kowalski75] Robert Kowalski. A proof procedure using connection graphs. *JACM*, **22**, 1975.
- [LP92] Shie-Jue Lee and David A. Plaisted. Eliminating duplication with the hyper-linking strategy. *Journal of Automated Reasoning*, 9:25-42, 1992.
- [Slaney94] John Slaney. Finder: Finite domain enumerator - system description. In Alan Bundy, editor, *Proc. of the 12th CADE*, pages 798-801. Springer-Verlag, Berlin, Germany, 1994.
- [SLM94] John Slaney, Ewing Lusk, and William McCune. SCOTT: Semantically Constrained Otter. In Alan Bundy, editor, *Proc. of the 12th CADE*, pages 764-768. Springer-Verlag, Berlin, Germany, 1994.

© 2000. Seungyeob Choi, Manfred Kerber
