TEACH RESOLUTION Jon Cunningham and Steve Hardy
INTRODUCTION TO RESOLUTION THEOREM PROVING
These notes assume some familiarity with PROLOG.
CONTENTS - (Use g to access required sections)
-- Mechanical inference
-- Prolog as a Theorem Prover
-- A complete Theorem Prover
-- Clausal Form
-- Resolution
-- Controlling what Inferences are made.
-- References
-- Mechanical inference -----------------------------------------------
You should have already met the notion that computers can be programmed
to make simple inferences (see TEACH * PSYS, * SCHEMA, * PROLOG and the
references cited below). In some sense, getting the computer to do
anything requires it to make inferences. This file is about getting the
machine to make inferences explicitly. By an inference, I mean the
process of deriving a new fact using some rule (from given or previously
derived facts). As an example of an inference:
Given the facts:
If Socrates is a Greek then Socrates is a mortal.
Socrates is a Greek.
we can infer
Socrates is mortal.
The above is an example of a VALID inference. It is a particularly
straightforward kind of inference, and the rule I used has a name: it is
called "modus ponens". Here is another example of using modus ponens to
make an inference:
Given:
If Rose is a pig then Rose can fly.
Rose is a pig.
we can infer
Rose can fly.
This is also a valid inference, even though the conclusion is probably
not true. Programmers call this GIGO (garbage in - garbage out), i.e. if
the data is wrong you can't expect sensible results. It doesn't matter
how carefully inferences are made if the initial facts (the premises)
are false.
Here is an example of an INVALID inference.
Given:
If Jon is a Greek then Jon is mortal.
Jon is not a Greek.
therefore we can infer
Jon is not mortal.
Both the premises are true, yet the conclusion isn't. I'm afraid I don't
know a name for this inference rule - probably because I don't find it
very useful. I prefer to make valid inferences. However, there are
undoubtedly occasions when I do make invalid inferences: most people do.
It is particularly hard to avoid accidentally using invalid inference
rules when thinking about a complex, abstract problem. (A good example
of this is in writing programs.)
Supposing Shakey robot wants to plan how to stack up a few boxes. This
is a complex, abstract problem as far as robots are concerned. It starts
with some facts about where things are, facts about where things get to
if it performs certain actions, and it wants to infer that if it
performs a sequence of actions, the boxes will be stacked up
appropriately. If we could invent a general purpose inference making
machine, then all we need to do to find a plan for Shakey, is to give it
a general purpose inference machine and ask it to make an appropriate
inference.
Although, in theory, this method should work okay, in general the number
of possible inferences from a few facts is so great that for any but
trivial problems an uncontrolled general purpose inference machine would
sit thinking while the stars grew old before it came up with the right
plan. For this reason, STRIPS, the planning part of Stanford's Shakey
robot, was not a resolution theorem prover - even though the problems it
solved were represented in predicate calculus. This is why you've
probably met Shakey before under a discussion of Searching. (Shakey did
include a resolution theorem prover as a question-answering component -
see refs on QA3)
-- Prolog as a Theorem Prover -----------------------------------------
Prolog can be regarded as an inference machine. It has solved the
problem of uncontrolled inference by specifying very exactly in what
order inferences are made and by writing down facts in a very specific
format. However, as a consequence of this strict control, prolog is not
a COMPLETE theorem prover. (A logical system is SOUND so long as it
doesn't make any invalid inferences, and it is COMPLETE if it is able to
infer anything which can be validly inferred from the facts.) For
example, suppose we have the following prolog program:
male(mars).
boy(X) :- male(X),not(adult(X)).
This is intended to represent the fact that the deity Mars is male, and
that if an entity is male but not adult, then it is a boy. Now suppose
we know that Mars is not a boy. We can validly infer from the above that
Mars is an adult. (Because otherwise we could infer that Mars is a boy.)
However, prolog is unable to make the inference. The simple explanation
is that prolog lacks the appropriate inference rule. There are two ways
to answer that:
1) The prolog program consists of a fact and an inference rule, which
embodies information about a specific inference to do with boyhood. It
can be extended by adding extra inference rules:
adult(X) :- male(X),not(boy(X)).
In this case, you are regarding the system as a special purpose logical
inference machine built on top of the prolog interpreter.
2) The alternative to the above is to regard prolog as a general purpose
logical inference machine. In this case, what we have been calling
prolog rules are not to be regarded as inference rules, but instead are
to be regarded as a special kind of fact. In this interpretation, prolog
only has one inference rule (of a more general nature). This more
general inference rule can be stated as:
Given the facts:
A.
B.
C.
...
Y.
and the special fact
Z :- A,B,C ... Y.
then you can infer the fact
Z.
Actually, it has to be slightly more complex than that to allow for ';'
and bracketing, and variable bindings.
This single general rule is very much like a generalisation of modus
ponens, but on its own, we have seen that this is not sufficient for
prolog to be complete. Since prolog is intended to be used as a
programming language this doesn't matter, but we are considering how to
construct a complete, sound logical inference machine (or theorem
prover). What other inference rules need to be added?
-- A complete Theorem Prover ------------------------------------------
-- Clausal Form -------------------------------------------------------
There are large numbers of logical systems, each with a different
collection of valid inference rules required to make the system complete
(see any introduction to formal logic, e.g. [Lemmon]). The system that
has become most popular for use on computers needs only a single rule,
called "resolution". Resolution theorem provers were invented by J. A.
Robinson (see references).
In order to manage with only one inference rule, facts have to be stored
in the database in a highly standardised, or NORMAL, form. The normal
form required for required for resolution is called CLAUSAL form.
Usually it will be possible to add axioms to the theorem prover's
database in more general forms, using various logical symbols (e.g.
symbols for logical implication, logical conjunction (and), logical
disjunction (or), logical equivalence, non-equivalence) but any facts
will be converted to, and used internally in, the clausal form. So
before explaining resolution, I must explain how to represent things in
the clausal form.
The database will consist of a collection of facts (often called axioms
or propositions), and all the facts in the database are to be considered
as being true. Each fact will be a list of alternatives, such that at
least one of the alternatives is true. For example, "It is raining or
snowing" would be represented (in some suitable notation) as:
FACT: "It is raining" OR "It is snowing"
I would probably represent that in prolog as:
fact([raining,snowing]).
Notice that a fact is one thing: it is a list of an arbitrary number of
alternatives. As always in prolog, the items of the list are separated
by commas, but this is purely to do with the prolog way of representing
lists, and has nothing to do with the use of ',' to mean 'and'. The
items in the list are alternatives. Incidentally, the fact allows that
it is both raining and snowing, as long as it is doing at least one of
those things.
How to represent "It is raining and snowing" in clausal form? Each fact
is a list of ALTERNATIVES, to represent a fact that is a conjunction of
two simpler facts we must represent the two simpler facts, e.g.:
FACT: "It is raining"
FACT: "It is snowing"
I would represent this in prolog as:
fact([raining]).
fact([snowing]).
Notice that in this case we have facts which are lists of one element,
but they are still lists.
How do we represent "All men are mortal"? As an ordinary prolog rule
(rather than a fact for a prolog theorem prover), we would write this as
mortal(X) :- man(X).
Actually, there is a simple way to translate prolog rules into clausal
form, so rather than doing it via English, I'll go straight to a prolog
theorem prover representation:
fact([mortal(X),not(man(X))]).
Basically, as long as there are no ';'s in the body of the rule, all you
do is make a list out of the head and body of the rule, and put 'not' on
each item from the body. Another example:
aunt(X,Y) :- sister(X,Z),parent(Z,Y).
becomes
fact([aunt(X,Y),not(sister(X,Z)),not(parent(Z,Y))]).
One way of translating this back into English is:
"For any three people, X,Y and Z, either X is Y's aunt, or
else X isn't Z's sister, or else Z isn't Y's parent."
It probably requires some thinking about to see that this is equivalent
to the usual English reading of the prolog clause! That is why I
recommend you translate rules into prolog, and then into 'facts'.
Prolog rules are very easy to translate into clausal form because, in a
sense, they are already in clausal form! Prolog rules are often referred
to as 'Horn clauses', and are an adaptation by Horn from the form where
there is no distinction between alternatives in the list. [Incidentally,
how would you represent in prolog the fact that Mr. Platypus is known to
be either a bird or a mammal? It is easy in clausal form:
fact([bird(platypus),mammal(platypus)]).
but it is not obvious to me how to do this in prolog. One possibility:
bird(platypus) :- not(mammal(platypus)).
In this case, there is a default of assuming Platypus to be a bird in
the absence of any information to the contrary: the representation is
not symmetrical between the two possibilities. This could lead in to a
discussion of the whole problem of default reasoning and then to the
frame problem etc.]
I don't like all the brackets that are beginning to accumulate in the
facts, so I'm going to assume there have been prolog operator
declarations that enable facts to be written omitting some of the
brackets. In addition, I am going to use '#' instead of 'not' because it
is more visible. (There is no reason why a theorem prover written in
prolog should use the same symbol as the prolog interpreter to mean
'not': they are conceptually distinct.) Thus I've changed the notation
so that the last fact is now written:
fact[aunt(X,Y),#sister(X,Z),#parent(Z,Y)].
I feel that that is an improvement, but remember that you would need two
operator declarations to be able to type such an expression directly to
prolog.
Incidentally, since a double negative is equivalent to no negative in
most logics (an exception is 'intuitionistic logic'), the prolog:
male(mars).
boy(X) :- male(X),not(adult(X)).
would be translated to:
fact[male(mars)].
fact[boy(X),#male(X),adult(X)].
-- Resolution ---------------------------------------------------------
I will assume you are able to translate all the facts and prolog-like
rules into a clausal form. (Although this assumption is probably
unreasonable, I do expect that anyone familiar with prolog will be able
to translate some rules into clausal form.)
The only inference rule we need with clausal form is the resolution
rule. This rule says that if we have two facts, and one fact includes
an alternative which occurs in the other fact negated, then we can make
a new fact from all the other alternatives from both of the original
facts. A symbolic example:
From the facts:
fact[a,#b,c,d,#e,f].
fact[c,#d,g,h,#i].
it can be inferred that:
fact[a,#b,c,#e,f,g,h,#i].
In the above example, 'd' occurred positively in one of the given facts,
and negated in the other, so the inference rule says to make a new list
of everything else from both given facts. Notice that 'c' occurs the
same in both given facts, but it only needs to be included once in the
conclusion. Strictly speaking, this elimination of repetitions should
be included in the statement of the inference rule. (There is no need
to say "It is raining or else it is snowing or else it is raining".)
Actually, to state the rule precisely, would require mention of what to
do if the facts contain variables: strictly speaking the two 'd's in the
above example have to be UNIFIED (which may cause variables to be
replaced by values). Prolog matching is a kind of unification, so
simply testing to see whether the two 'd's match (using 'member' or '='
etc.) should be okay.
-- Controlling what Inferences are made. ------------------------------
As you can see from the example, it would be quite easy for a theorem
prover using the above inference rule to derive longer and longer facts.
However, a real theorem prover shouldn't be allowed to run away like
that. The vital principle that brings it under control is the way in
which the inference rule is used. This is the subject of another demo
(to be written), alternatively, read the section in Raphael's book (see
refs).
For information about some Prolog programs that translate from Predicate
Calculus into clausal form and embody various simple strategies for
resolution theorem proving, as well as a programming assignment, see
TEACH * RESOLVEEXERCISE.
-- References ---------------------------------------------------------
This list is not comprehensive, nor is it as 'tidy' as it should be.
"The Logic Machine" in
Computers and Thought" eds Feigenbaum and Feldman.
This is a description of a VERY early program to solve simple logical
problems. It indicates an initial approach to theorem proving.
Bertram Raphael
The Thinking Computer
This general introduction to Artificial Intelligence has a section
on theorem proving by machine.
D. Michie et al, (eds)
Machine Intelligence, Vols III and IV
In these two volumes of collected papers there are two papers by
Robinson on the 'resolution' principle and its use in building 'question
answering' programs.
Nils Nillson
Problem Solving Methods in Artificial Intelligence
This has a very thorough coverage of various problem solving
methods. (The title says it all). There is a section on theorem
proving.
E. J. Lemmon.
Beginnning Logic
An introductory logic textbook.
"Some Philosophical Problems from the Standpoint of AI"
Pat Hayes and John McCarthy. This isn't directly about theorem proving,
but concerns some of the problems of representing real-world knowledge
in logic.
W. Clocksin and C.S Mellish.
"Programming in Prolog
Springer-Verlag
Chapter 10 gives a brief discussion of resolution theorem-proving and
the relation of Prolog to logic.
A. Ramsay and R. Barrett
AI In Practice: Examples in POP-11
Ellis-Horwood 1987
Includes a description of a theorem prover written in POP-11.
The STRIPS problem solver (by Fikes and Nilsson) made extensive use of
the QA3.5 theorem prover. See TEACH * STRIPS, TEACH * SOLVER
------------