School of Computer Science
The University of Birmingham
Birmingham, B15 2TT, England
WWW: http://www.cs.bham.ac.uk/~mmk
ABSTRACT
A good knowledge representation system has to strike a balance between expressive power on the one hand and efficient reasoning on the other. Furthermore, it is necessary to understand its limitations and problems. A logic which contains strings is very expressive and allows for very natural representations, which in turn allow for appropriate reasoning patterns. However, such a system has the feature that self-referential paradoxes can be formulated in it. This can be considered a strength and a weakness at the same time. On the one hand, it is a positive aspect that paradoxes which can be formulated in natural language can also be represented. On the other hand, care is necessary not to trivialise the logical system. In this paper different aspects of knowledge representation that allows self-referentiality will be discussed. A system will be presented which is a pragmatic compromise between expressive power on the one hand and simplicity and efficiency of the reasoning process on the other. It is built on a three-valued system that makes it possible to use reasoning techniques from classical first-order logic.
The problems with paradoxical sentences have been known for a very long time, but were initially thought to be of no relevance for the development of set theory and logic around the end of the 19th and the start of the 20th century. But then paradoxes in set theory and logic seemed to endanger the progress of the field, and for a while it was not clear whether "the paradise of sets" built by Cantor (as Hilbert called it) and "the paradise of logic" built by Frege were not inhabited by terrible lions that could attack and kill at any time. Zermelo presented in [Zermelo08] the first axiomatisation of set theory, and Russell [Russell08] in the same year the theory of types. This way it was (apparently) possible to fence the lions out.
While to a certain extent the constructs of axiomatic sets and types may be considered adequate for mathematical reasoning, there are quite a number of examples in the areas of language understanding and knowledge representation in a more general sense where we need a more powerful language, where we cannot and/or do not want to exclude self-referentiality a priori on mere syntactic grounds. If we did, we would end up with a system which is difficult to use. Actually, quite a number of complicated formalisms have been developed and are currently in the mainstream of investigations in AI, although some simpler systems may do the job better. Perlis argued in [Perlis85] that we "can have everything in first-order logic." He investigated a system which is close to the one we propose here: it is first-order logic plus strings.
Our system will be made more formal in the next section. In section 3 we make it clear what we mean by a paradox. Section 4 is devoted to three-valued Kleene logic, the system that will be used to deal with paradoxes. Section 5 describes an efficient form of reasoning for the three-valued approach. Some related work is presented in section 6. Finally a summary is given.
Let us now look at some typical examples of pieces of knowledge that can be represented and reasoned with in syntactic theory:
In this contribution we shall see, in more detail, a three-valued approach to dealing with paradoxes; formulae like the liar sentence are admitted, but they are neither true nor false, but paradoxical. To this end, a third truth value, nn (standing for "neither-nor"), is added. As we shall see, just adding such a truth value does not solve the problem; the system has to fulfil an additional constraint. However, before we go into a semantic analysis, let us first fix the syntax and then clarify what we mean by a paradox.
The language we want to investigate is a standard first-order sorted language plus strings as terms. In addition to the standard setting, we use strings in the language. To this end we take characters as constant symbols of the language. Following the description in [Davis90] we prefix characters by a colon, e.g., :C is the character C, and :& the character standing for the conjunction connective. Furthermore, we assume a binary function symbol pair on strings, which takes a character and a string as input. From the empty string epsilon we build up all strings, e.g., pair(:C, pair(:a, pair(:t, epsilon))). Since these expressions play a major role, we introduce the notation "Cat" as syntactic sugar for expressions like the one above.
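The string construction above can be sketched operationally as follows. This is a minimal illustration in Python, not part of the paper's formalism; the helper names quote and unquote are ours.

```python
# Strings are built from the empty string `epsilon` with the binary
# function `pair`, which prepends a character to a string.

EPSILON = ()  # the empty string epsilon

def pair(char, string):
    """pair takes a character and a string and returns a new string."""
    return (char,) + string

def quote(text):
    """Syntactic sugar: build pair(:C, pair(:a, ...)) from a Python str."""
    result = EPSILON
    for char in reversed(text):
        result = pair(char, result)
    return result

def unquote(term):
    """Read the encoded term back as an ordinary string."""
    return "".join(term)

cat = pair("C", pair("a", pair("t", EPSILON)))
assert cat == quote("Cat")
assert unquote(cat) == "Cat"
```

The characters play the role of the constant symbols :C, :a, :t; quote corresponds exactly to the "Cat" notation of the text.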
It is possible to define and use syntactic predicate symbols. Variable("x") and Objconstant("John") denote, for instance, that x and John are a variable and an object constant, respectively. Details of such a construction can be found in [Genesereth87, Chapter 10]. Sorts stand for unary predicates; that is, a term t is of sort S (written as t IN S) iff S(t). Sorted quantification like FORALL x IN S. A semantically means that the x's are not taken from the whole universe of discourse, but only from the subset of those elements which are in the interpretation of S. The sort system adopted here has been introduced by Weidenbach [Weidenbach95].
The language so defined is reflective. For instance, it is possible to define the syntax of the language within the language. Although it is intended that strings and the objects they stand for are closely related, we must carefully distinguish between them. There is an intuitive difference between long(John) and long("John"), where long is a polymorphic predicate symbol that is true for persons over 195cm and for strings consisting of at least eight characters.
In addition to the syntactic function and predicate symbols there are semantic predicate symbols such as true. We shall discuss this predicate symbol in detail in the sequel. For this special predicate symbol we will also have to give a special semantic account (as one does for first-order logic plus equality, where the equality sign gets a fixed interpretation).
Traditionally, at this point the semantics of the language would be discussed. In our case this is not so straightforward, and the meaning of syntactic theory is one of the major problems with this language. Hence we shall first try to get an intuitive idea of the semantics, and then discuss in more detail how such a system can be built up.
The last example, the Löb sentence, is not a paradox in a strict sense, but an example that violates the conservativity of definitions. It can be stated as "If this proposition is true, then A.", where A is an arbitrary formula. Formally we get Löb :<=> true("Löb") -> A. With the definition of truth this simplifies to Löb <=> (Löb -> A). This formula is problematic in a two-valued setting since it allows one to prove A for an arbitrary A: the only consistent way to assign truth values is one in which A is true. That is, defining Löb potentially changes the semantic status of other sentences.
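The two-valued case analysis behind this observation can be checked mechanically. The following Python sketch (ours; the function names are hypothetical) enumerates all truth-value assignments to Löb <=> (Löb -> A):

```python
# Brute-force check that in two-valued logic the definition
# Loeb <=> (Loeb -> A) forces A to be true.

def implies(p, q):
    return (not p) or q

def consistent_A_values():
    """Truth values of A for which some assignment to Loeb satisfies
    Loeb <=> (Loeb -> A)."""
    values = set()
    for a in (False, True):
        for loeb in (False, True):
            if loeb == implies(loeb, a):
                values.add(a)
    return values

assert consistent_A_values() == {True}  # only A = true is consistent
```

If A is false, neither value of Löb satisfies the equivalence; if A is true, Löb = true does, which is exactly the non-conservativity described above.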
Of course it is possible to doubt whether these examples are relevant and really occur, and whether there is a syntactic way to exclude them. Kripke [Kripke75] stated the so-called Watergate Paradox by summarising different propositions made by the journalist Jones and by the former US president Nixon about the Watergate affair. Nixon says "Everything Jones says about Watergate is true." and Jones says "Most of Nixon's assertions about Watergate are false." If we assume in addition that Nixon made 2n+1 assertions about Watergate altogether, of which n are definitively true and n definitively false, plus the one above, we have reproduced the paradox of the liar.
This example suggests that it is not possible to decide on syntactic grounds alone whether sentences make up a paradox, since this may depend on the real-world context. If Nixon, for instance, always told the truth except for the statement about Jones, there would be no paradox; the two statements would simply be false. There seems to be no clear-cut syntactic borderline between what should be admissible and what not.
By a paradox we will understand a definition which cannot be evaluated to the truth value true. A definition is a sentence which introduces a new concept, that is, a constant symbol, function symbol, or predicate symbol which has not previously been in the signature. A property we usually expect from a definition is that it forms a conservative extension, that is, that the semantic status of old concepts remains unchanged by the introduction of the new concept.
In addition to the treatment of Kleene logic in the papers mentioned above, in the following we have to deal with strings (a simple conservative extension as discussed above) and additionally the axiom schema for defining the semantics of the true predicate.
The semantics is based on three truth values, false, nn, and true. The meaning of the connectives can be defined by the following truth tables:

| x     | ~x    |
|-------|-------|
| false | true  |
| nn    | nn    |
| true  | false |

| x     | D x   |
|-------|-------|
| false | true  |
| nn    | false |
| true  | true  |

| &     | false | nn    | true  |
|-------|-------|-------|-------|
| false | false | false | false |
| nn    | false | nn    | nn    |
| true  | false | nn    | true  |
<=> is true if and only if the two input truth values are the same, and false otherwise. The other connectives |, ->, and <-> can be defined in terms of & and NOT just as in classical two-valued logic. The semantics of the universal quantifier is defined by I_phi(FORALL x IN S. A) :<=> *FORALL({ I_{phi,[a/x]}(A) | a IN A_S }), where *FORALL maps a set of truth values to true if it contains only true, to false if it contains false, and to nn otherwise.
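For concreteness, the truth tables above can be transcribed directly into executable form. The following Python sketch is ours (the function names are hypothetical); it implements NOT, D, &, <=>, and the derived |:

```python
# The three-valued connectives from the truth tables above, with truth
# values represented as the strings "false", "nn", "true".

F, N, T = "false", "nn", "true"

def k_not(x):
    return {F: T, N: N, T: F}[x]

def k_defined(x):          # the D connective: x has a classical value
    return T if x in (F, T) else F

def k_and(x, y):
    if F in (x, y):
        return F
    if N in (x, y):
        return N
    return T

def k_iff(x, y):           # <=>: true iff the inputs are equal, else false
    return T if x == y else F

def k_or(x, y):            # defined in terms of & and NOT, as in the text
    return k_not(k_and(k_not(x), k_not(y)))

assert k_and(T, N) == N and k_and(F, N) == F
assert k_or(N, N) == N and k_or(T, N) == T
```

Note that <=> always yields a classical value, while the Kleene biconditional <-> derived from & and NOT would yield nn on nn inputs; the text relies on this distinction.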
REMARK
Note that this is (purposefully) not a proper definition of the true predicate, since it does not unambiguously fix the meaning of true. Let us assume, for instance, a knowledge base gamma := { T :<=> true("T") }. Nothing can be derived when someone protests his own sincerity: the sentence has three fixpoints, false, nn, and true. In the semantics we fix the meaning of true just to the extent that it mirrors Tarski's definition of truth, and build a calculus which exactly corresponds to this. If we assume this semantics it is possible to consistently assign truth values to the liar sentence: with I(L) = nn the whole equivalence is true (and hence does not cause any trouble).
In the following, a refutation tableau calculus for three-valued Kleene logic is presented; the details can be found in [KeKo96-KGS]. Let A be a formula; then A^{alpha} is a labelled formula if emptyset =/= alpha SUBSET {false, nn, true}.
In order to show that a formula B follows from a formula set { A_1, ..., A_n } we can prove that the set of labelled formulae { A_1^{true}, ..., A_n^{true}, B^{nn,false} } is unsatisfiable. The formulae in the formula set are implicitly conjoined. A formula like B^{nn,false} stands for "B is nn or B is false". That is, it is shown that the assumption that the A_i hold and B is nn or false is inconsistent (the "quartum non datur" principle).
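In the propositional case this "quartum non datur" refutation idea can be illustrated by brute force over three-valued valuations. The sketch below is ours (it is an exhaustive semantic check, not the tableau calculus itself): B follows from the A_i iff no valuation makes all A_i true while B is nn or false.

```python
from itertools import product

F, N, T = "false", "nn", "true"

def k_not(x):
    return {F: T, N: N, T: F}[x]

def k_and(x, y):
    if F in (x, y):
        return F
    if N in (x, y):
        return N
    return T

def follows(premises, conclusion, variables):
    """premises, conclusion: functions from a valuation dict to a truth value.
    Returns True iff no valuation makes all premises true while the
    conclusion is nn or false."""
    for values in product((F, N, T), repeat=len(variables)):
        v = dict(zip(variables, values))
        if all(p(v) == T for p in premises) and conclusion(v) in (N, F):
            return False  # counter-valuation found
    return True

# p & q |= p holds, but p |= p & q does not:
assert follows([lambda v: k_and(v["p"], v["q"])], lambda v: v["p"], ["p", "q"])
assert not follows([lambda v: v["p"]], lambda v: k_and(v["p"], v["q"]), ["p", "q"])
```

The tableau calculus below establishes the same relation syntactically, and scales to the first-order case where exhaustive search is impossible.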
Tableau Rules: The tableau rules for & and FORALL are given below; the premise stands above the line, the conclusions below it, and "|" separates alternative branches:

(A & B)^{true}
--------------
A^{true}
B^{true}

(A & B)^{nn}
---------------
A^{nn,true}
B^{nn,true}
A^{nn} | B^{nn}

(A & B)^{false}
---------------------
A^{false} | B^{false}

(FORALL x_S. A)^{true}
----------------------
[y_S/x_S] A^{true}

(FORALL x_S. A)^{nn}
----------------------------
[f(y^1,...,y^n)/x_S] A^{nn}
[y_S/x_S] A^{nn,true}
(f(y^1,...,y^n) IN S)^{true}

(FORALL x_S. A)^{false}
------------------------------
[f(y^1,...,y^n)/x_S] A^{false}
(f(y^1,...,y^n) IN S)^{true}
Now we only need tableau closure rules: these are a cut rule, which allows one to close a tableau branch that contains formulae with complementary labels, and the strict rule, which allows one to close a branch if a defined expression contains an undefined term. Where possible we would like to make use of most general unifiers, of course. In the case of string expressions these may not be so easy to compute, however, and a unification algorithm would have to make use of techniques developed in higher-order unification. In both cases so-called sort constraints ensure the correctness (in terms of the sorts) of the instantiations. We quote the soundness and completeness theorem from [KeKo96-KGS]:
Theorem (Soundness and Completeness)
The tableau calculus is sound and complete for Kleene logic.
REMARK
In order to ensure soundness and completeness for the syntactic theory, add axioms that exactly capture the meaning of the semantic predicates; in the case of true, add true("A") <=> A, either as an axiom schema or as simplification rules (plus the reverse rules):

(true("A"))^{alpha}
-------------------
A^{alpha}                          if A is a wff

(true("A"))^{alpha}
------------------------------
(true("A"))^{alpha UNION {nn}}     otherwise
When we look at the sentence "This sentence is paradoxical or false", we can define it in this system as P :<=> NOT D true("P") | NOT true("P"). With the definition of truth we can simplify the expression to P <=> NOT D P | NOT P. If we go through the possible truth values for P, that is, false, nn, and true, we see that P cannot consistently be assigned any of the three available truth values. That means that while we have tamed some paradoxes like the liar sentence by assigning a third truth value to them, some are still wild beasts and cause mayhem in the form of contradictions.
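This evaluation can be checked mechanically. The following Python sketch (ours) confirms that the defining function of P, lambda x. NOT D x | NOT x, has no fixpoint among the three truth values:

```python
# Checking that "this sentence is paradoxical or false",
# P <=> (NOT D P) | (NOT P), has no consistent three-valued assignment.

F, N, T = "false", "nn", "true"

def k_not(x):
    return {F: T, N: N, T: F}[x]

def k_defined(x):
    return T if x in (F, T) else F

def k_or(x, y):
    if T in (x, y):
        return T
    if N in (x, y):
        return N
    return F

def rhs(p):
    """The right-hand side of the definition: NOT D p | NOT p."""
    return k_or(k_not(k_defined(p)), k_not(p))

# No truth value is a fixpoint, so P cannot be assigned consistently:
assert all(rhs(p) != p for p in (F, N, T))
```

For p = false and p = nn the right-hand side evaluates to true, and for p = true it evaluates to false, so the equivalence fails in every case.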
Neither the attempt to remedy the problems with paradoxes by many-valued logics nor the problems of an ad hoc solution by introducing a further truth value are new. In 1939 Bochvar made such an attempt, but, as Church pointed out the same year, the problems can easily be reconstructed in the three-valued setting (see [Urquhart86, p. 75]). Adding more and more truth values is, in fact, a solution to the problem, but these values no longer have any intuitive meaning.
The source of any paradox of self-referentiality lies in the fact that it is possible to define a formula to which no truth value can be assigned. In the propositional case this can happen if and only if it is possible to define, in terms of the connectives, a function that is fixpoint-free on the truth values. In a two-valued system such a function is easily defined: it is just negation (lambda x. NOT x). That is, as soon as negation is part of a self-referential two-valued system, paradoxes are not far away; or, to put it the other way around, in a two-valued logic without negation (explicit or implicit) it is not possible to construct paradoxes. To give up negation in a knowledge representation system is, of course, a serious restriction of the expressive power.
In our three-valued setting the functions lambda x. NOT D x | NOT x and lambda x. (x <=> NOT x) <=> x are fixpoint-free as well. If we apply one of these functions to any of the three truth values false, nn, or true, the result will never be the same as the input of the function. From this, paradoxes are easily constructed.
However, if we restrict the language to a system in which each function that can be built from the connectives has at least one fixpoint (conveniently, nn often plays that role), it is not possible to construct self-referential paradoxes. This restriction does not seem to be a very serious one, since it is rarely desired to communicate paradoxicality. If one really wanted to do so, one would have to go to a four-valued setting (which, of course, would have paradoxes of a higher kind in itself).
In a three-valued setting, no language that contains the connectives NOT, |, and D, and no language with unrestricted use of NOT and <=>, guarantees the existence of fixpoints. But NOT has nn as a fixpoint, lambda x. x | x has the fixpoint nn as well (and in addition the fixpoints false and true), and the fixpoint property is preserved under composition.
However, this restriction alone would not prevent the construction of more subtle paradoxes, in which we define a binary function f (using only NOT and |) on the truth values such that x :<=> f(x,y) for a fixed truth value y. This would cause a problem if x =/= f(x,y) for all possible truth values x. In that case y cannot be nn, since then x = nn would be a fixpoint. So let us assume that y =/= nn. Is it possible to define a function f in NOT and | such that f(x,y) =/= x for x = true, x = nn, and x = false? This is not the case: when restricted to false and true, f also has range {false, true}, and in order to be fixpoint-free on {false, true}, f(x,y) must behave like NOT x, since negation is the only fixpoint-free function among them. But NOT and | are monotone with respect to the information ordering (in which nn lies below both false and true), so f(nn,y) must lie below both f(false,y) = true and f(true,y) = false; hence f(nn,y) = nn, and f has the fixpoint nn on the full set {false, nn, true}. (Typically there are more fixpoints than just nn.)
Theorem
Any function built from the connectives NOT and | has a fixpoint.
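The theorem can be verified empirically for the propositional case. The sketch below (ours) enumerates all unary truth functions expressible with NOT, |, the variable x, and constant arguments, representing each function by its value table (f(false), f(nn), f(true)), and checks that each has a fixpoint:

```python
F, N, T = "false", "nn", "true"

def k_not(x):
    return {F: T, N: N, T: F}[x]

def k_or(x, y):
    if T in (x, y):
        return T
    if N in (x, y):
        return N
    return F

identity = (F, N, T)
constants = [(v, v, v) for v in (F, N, T)]
functions = {identity, *constants}

# Close the set under pointwise NOT and pointwise |.
changed = True
while changed:
    changed = False
    for f in list(functions):
        candidates = [tuple(k_not(v) for v in f)]
        for g in list(functions):
            candidates.append(tuple(k_or(a, b) for a, b in zip(f, g)))
        for h in candidates:
            if h not in functions:
                functions.add(h)
                changed = True

# Every expressible function has at least one fixpoint.
assert all(any(f[i] == x for i, x in enumerate((F, N, T))) for f in functions)
```

The check succeeds because all these functions are monotone with respect to the information ordering, and every monotone unary function on the three truth values has a fixpoint, as argued in the text.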
The fixpoint property has as a corollary that definitions form conservative extensions. Let us now have a look at why universally quantified formulae like A <=> FORALL x. B have a fixpoint. This is potentially problematic if x ranges over strings, which allows an instantiation with an expression that contains A. If B can be made false for one instantiation of x, A is false. If B can be made true for all instantiations, then A is true. In the remaining cases (it is not possible to instantiate x to get a false value, but B is not always true), we can select nn as fixpoint. This possibility is given since the propositional system of connectives is not complete.
In order to be able to introduce definitions (like the definition of the liar sentence, but of course of more useful sentences as well), the <=> connective cannot be abandoned altogether, but has to be allowed in definitions, that is, in the forms A <=> ... and FORALL x. A(x) <=> ..., where A is a predicate constant. This does not cause problems, since the right-hand side will have a fixpoint in A if A occurs in it; hence the equivalence as such can always be made true.
Since we use a restricted language without a D connective, it is possible to make use of the strategy for reusing two-valued theorem-proving methods for Kleene logic as described in [KI97]. That means that efficient standard two-valued theorem provers can be turned into efficient theorem provers for the approach above just by adding a simple restriction strategy.
This forms a calculus for first-order Kleene logic in general. In order to handle quoted expressions, we have to add axiom schemas that describe the semantics of the meta-predicates defined on them. In the case of the truth predicate true, which we have looked at more closely, this means we add the rules from the remark above, or corresponding axiom schemas.
The main result from [KI97] is that it is possible to exploit the structural similarity between the tableau rules for truth value sets containing nn and those not containing nn: one can simply eliminate nn from these rules. Since the rules for the truth value nn can be avoided altogether by using combined rules (for {false, nn} and {nn, true}), the following strategy result can be established, which allows standard first-order theorem-proving techniques to be employed for Kleene logic as well.
Theorem
(Strategy)
Let S be the control strategy of never closing a tableau on two formulae both stemming from the theorem. Then each two-valued tableau proof found using S can be lifted to a three-valued tableau proof of the theorem.
REMARK
The result from [KI97] leaves us with the question of how to treat the <=> connective efficiently. We could use tableau rules for <=>. However, we make only restricted use of the <=> connective, namely for definitions ("definiendum <=> definiens"). For that reason only the first of the three tableau rules ((A <=> B)^{true}) is relevant at all. But even this rule splits the tableau into three branches, hence it would be rather inefficient. Even more seriously, the strategy result mentioned above could not be preserved. Instead of using this rule, we simply assume that in proofs all definitions can be expanded; that is, we take a definition expansion rule into the calculus (for arbitrary truth value sets alpha):

A^{alpha}   (A <=> B)^{true}
----------------------------
B^{alpha}
The first solution for treating paradoxes goes back to Russell, and it is as simple as it is brilliant: simply exclude them. Russell introduced types such that in any expression of the form P(Q), P has to have a higher type than Q. In particular, P cannot express anything about itself or about anything that can make statements about P. This approach was adapted by Tarski [Tarski36] to hierarchical meta-systems, in which it is possible to speak explicitly about the truth of statements, not in the language itself but in some meta-language. If one wants to make statements about the meta-language, this is possible as well, but then one needs a meta-meta-language. This approach has been formalised and implemented in the multi-level meta-systems FOL and Getfol [Giunchiglia94c]. While this approach is appropriate for mathematics, examples like the Watergate paradox point to the conclusion that excluding self-referentiality altogether is not appropriate for (at least) some applications.
An alternative to totally forbidding self-reference is restricting it, putting it in the right context. This approach is discussed in detail by Barwise and Etchemendy in [Barwise87] as the second approach, which they call the Austinian approach. The idea is to restrict self-referential expressions in an adequate way. Applied to Russell's example of the barber, this means that the definition of the barber as "the man who shaves all and only the men who don't shave themselves" is problematic. But, for instance, to define the barber as "the man who shaves all and only the men who live in Oxford and who don't shave themselves" means only that the barber does not live in Oxford. While I agree that many situations which seem paradoxical at first view can be clarified so that the paradox goes away, certain sentences are deliberately paradoxical.
Kripke [Kripke75] and others attacked the problem by changing the semantic construction of the truth values of formulae. They go beyond two-valued logic in that they assume a third truth value, which stands for the paradoxical expressions. There are different ways to do this: either to assume truth value gaps [Fraassen66] or a third truth value which states undefinedness explicitly [Kleene52, p. 332-340]. In both systems, Tarski's original definition of truth, true("A") <=> A, can be incorporated, and the liar sentence L :<=> NOT true("L") can be dealt with as well. In the first system, with truth value gaps, no truth value is assigned to the liar sentence L. In the second, L is evaluated to the truth value nn; in particular, true("L") is nn as well. The question arises how to determine the truth values of such expressions. To this end Kripke proposes a fixpoint iteration, a process in which in each iteration truth values are assigned to more and more formulae. Kripke's approach is an appropriate general semantic account. It has been made more formal, and in particular has been given a semantic interpretation, by Barwise and Etchemendy [Barwise87]. They use Aczel's theory of hypersets (sets which do not fulfil the foundation axiom) to model self-referentiality. A more detailed description is given in [Barwise96]. A similar approach has been developed in parallel by McGee [McGee90]. In Perlis' approach [Perlis85], Tarski's definition of truth is changed to a system in which L and NOT true("L") hold at the same time.
The approach presented in this paper can be viewed as a restricted version of what Barwise and Etchemendy call the Russellian approach. It builds on earlier work by the author [JELIA98] and is closely related to Ramsay's approach [Ramsay01], which is built on a constructive lambda-calculus.
Summarising, we have presented a system for stating facts about truth. By adding further axiom schemas it is possible to extend the framework to deal with knowledge and belief as well. No unnatural restrictions on the expressive power are made in order to avoid self-referential statements. As shown in section 5, standard calculi can be used. In this work we have not given a formal fixpoint construction for the true predicate, but only an indication of why such fixpoints exist. Building on the work of Kripke, Barwise, Etchemendy, and Moss this should be possible, but is left as future work.