ABSTRACT
Syntactic theory, that is, first-order logic with strings, is a powerful knowledge representation formalism. However, its usage is not without problems since it is possible to formulate self-referential paradoxes in it. In this contribution we present a system which is a pragmatic compromise between expressive power on the one hand and simplicity and efficiency of the reasoning process on the other hand. It is built on a three-valued system that makes it possible to use reasoning techniques from classical first-order logic.
Keywords: Knowledge representation, strings, paradoxes, three-valued logic
I've squandered over it much time,Paradoxes have from the days of the Ancient Greeks been a considerable source of exhilaration. A very entertaining book on paradoxes is [Poundstone88]. They can be very funny, but also considerably confusing and lead to inappropriate conclusions. The probably best known example of a paradox and a curious inference drawn from it is Zeno's paradox about Achilles and the tortoise. If in a race between the two, the tortoise is a bit ahead at the start, at time point t0, Achilles has no chance ever to catch it, even him being the fastest runner of all. Zeno's reasoning goes as follows: When Achilles reaches at time point t1 the position p0 where the tortoise was at t0, it will be already gone there and be at a new position p1, which Achilles can reach at a time t2, but then the tortoise has moved on to p2 and so on. At any point in time ti the tortoise is ahead of Achilles. Zeno concluded that this proves that motion is impossible. With a deep understanding of infinite series we have reached nowadays, we see where Zeno went wrong. All the time intervals under consideration add up to a finite amount of time and when that has passed, Achilles reaches the tortoise, and after that overtakes it. For many centuries this example was not only funny, but deeply confusing and led to strange conclusions like Zeno's that motion is impossible. We see that care is advisable when concluding what follows from a paradox.
For perfect contradictions, in the end,
Remain mysterious alike for fools and sages. ...
Men oft believe, if only they hear wordy pother,
That there must surely be in it some thought or other.
Johann Wolfgang von Goethe. Faust I. 1808.
In this contribution we do not look any more at paradoxes in a very broad sense, but at so-called logical paradoxesWe will see later what is exactly meant by a logical paradox.. These have been known at least back to the Cretan prophet Epimenides who said "All Cretans are liars." Under some additional assumptions this leads to a paradoxical situation. In its most concise form, the paradox can be formulated by the sentence: "This sentence is false." Let's assume it is true, then it must be false, since it says that it is false. Hence we have a contradiction since it is true and false at the same time. So our initial assumption that it is true can't have been correct, hence it must be false. This leads to a contradiction as well, since if it is false, its content can't be true, hence the sentence can't be false.
The problems with paradoxical sentences have been known for very long, but were initially thought to be of no relevance for the development of set theory and logic around the end of the 19th/start of the 20th century. But then paradoxes in set theory and logic seemed to endanger the progress in the field and for a while it wasn't clear whether "the paradise of sets" built by Cantor (as Hilbert called it) and "the paradise of logic" built by Frege weren't inhabited by terrible lions that could attack and kill at any time. Frege [Frege79] had formalised predicate logic and achieved a new level of clarity and rigour. Naive set theory was built by Cantor and first published in 1874 [Cantor74]. The first paradoxes were found in 1897 by Burali-Forti, and have been refined to paradoxes which involve the cardinality of the set of all sets. Russell's notion of the set of all sets which do not contain themselves presented a major problem for set theory and also meant that Frege's original system (see [Heijenoort67] for the original conversation between Frege and Russell) was reflective and paradoxical.
Zermelo [Zermelo08] presented in 1908 the first axiomatisation of set theory, and Russell [Russell08] in the same year the theory of types. This way it was possible to fence the lions out. It was achieved by the foundation axiom which excludes sets which contain (or worse do not contain) themselves in the case of set theory, and the theory of types which excludes that a predicate can speak about itself. This way it was possible to build a safe area, which is free of paradoxes. To be more precise, there might still be some wild beasts out there in the bush. But for a hundred years there was no more attack and so we are pretty confident that there aren't any out there inside the fence. But of course, you never know, that may just be a false kind of security.
While to a certain extend the constructs of axiomatic sets and types may be considered as adequate for mathematical reasoning, there are quite a number of examples in the areas of language understanding and knowledge representation in a more general sense, where we need a more powerful language, where we can't and/or do not want to exclude self-referentiality a priori on mere syntactic grounds. If we did, we would end up with a system which is difficult to use. Actually quite a number of complicated formalisms have been developed and are currently in the main stream of investigations in AI, although some simpler system may do the job better. Perlis pleaded in 1985 [Perlis85] that we "can have everything in first-order logic." Perlis investigated a system which is close to the one we propose here, it's first-order logic plus strings. Unlike our proposals which is built on three-valued logic, Perlis redefined Tarski's definition of truth. We will discuss the relationship in more depth later.
Our system will be made more formal in the next section. In section 3 we make it clear what we mean by a paradox. Some related work and some different approaches how to deal with paradoxes are presented in section 4. Section 5 is devoted to three-valued Kleene logic, the system that will be used to deal with paradoxes. Section \refsec:reasoning describes an efficient form of reasoning for the three-valued approach. In section \refsec:examples some examples are discussed in the light of the approach adopted. Finally a summary and future work are discussed.
The limits of my language mean the limits of my world.
Ludwig Wittgenstein. Tractatus logico-philosophicus. 1921
Before we introduce a formal syntax for syntactic theory we introduce some examples of expressions we want to be able to formulate. Then we fix a syntax, and discuss problems with it.
Let us look now at some typical examples of pieces of knowledge that can be represented and reasoned with in syntactic theory:
The predominant formalism for representing knowledge and belief in agent systems is based on modal logics (there is a vast amount of literature on modal logic. I just point to Fitting's general introduction, where further pointers can be found, see [Fitting93]). However, syntactic theory has some crucial advantages over modal logic, although some problems as well. As Davis points out in [Davis90, p.77] the difference between the two approaches is pretty much the same as the difference between direct quotation ("John knows `The evening star is the morning star.' ", formally represented in syntactic theory by know( John ,"evening-star=morning-star")) and ("John knows that the evening star is the morning star.", formally represented in modal logic by [] John evening-star=morning-star). In both approaches the truth values of the formulae are not extensional, that is, the truth value of composed formulae cannot be calculated as functions of the subformulae: In modal logic the truth value in all possible worlds reachable from an initial world has to be known, in syntactic theory, expressions like "evening-star=morning-star" stand for strings of symbols and not for the objects they denote.
The examples mentioned above can be represented in syntactic theory as follows:
Syntactic theory as any self-referential language in general allows for the representation of sentences such as the liar sentence ("This sentence is false", formally L :<=> ~ TRUE (" L ")).Note that we have not introduced an explicit " this" operator for denoting self-referentiality as done by Barwise and Etchemendy in [Barwise87], but that self-referentiality comes into play through the back door by introducing explicit names like L . This seems to be a crucial disadvantage compared to modal logics. While powerful modal logics are faced with the same problem as pointed out by Perlis [Perlis88], in the case of modal logics there is a good understanding how the paradoxes can be avoided. In syntactic theory there are also ways around these problems, but many solutions seem to be ad hoc and so restrictive that the advantages of syntactic theory are no longer visible at all.
In this article, we shall see in more detail, a three-valued approach to deal with paradoxes; formulae like the liar sentence are admitted, but they are neither true nor false, but paradoxical. In order to do so, a third truth value is added. As we shall see, just adding such a truth value does not solve the problem, but the system has to fulfil an additional constraint. However, before we go into a semantic analysis, let's first fix the syntax and then clarify what we mean by a paradox.
In addition to this standard setting, we use strings in the language. To this end we take characters as constant symbols in the language. Following the description in [Davis90] we prefix characters by a colon, e.g., : C is the character C, : & the character standing for the conjunction connective. Furthermore we assume a binary function symbol, pair, on strings, which takes a character and a string as input. From the empty string epsilon we build up all strings, e.g., pair(: C, pair(: a, pair(: t,epsilon))). Since these expressions play a major role, we introduce as syntactic sugar for expressions like the one above the notation "Cat".
It is possible to define syntactic predicate symbols. For instance, Variable("x") and Objconstant(" John ") denote that x and John are a variable and an object constant, respectively. Terms can be defined by formulae like FORALLx . Term(x) <=> Objconstant(x) | Variable(x) | Funexpress(x) and FORALLf,l . Funexpress( conc(f,"(,l,")"") <=> Funconst(f) & Termlist(l). Concatenation conc is a syntactic function symbol. It is assumed that all terms can have a sort, for instance, in FORALLx . Term(x) <=> Objconstant(x) | Variable(x) | Funexpress(x), x is assumed to be of sort string. Details of such a construction can be found in [Genesereth87, Chapter 10]. Sorts stand for unary predicates. That is, a term t is of sort S (written as t in S) iff S(t). Sorted quantification like FORALL in. xS A semantically means that the x's are not taken from the whole universe of discourse, but only from the subset of those elements which are in the interpretation of S. The sort system adopted in this article has been introduced by Weidenbach (see, e.g. [Weidenbach95]).
The so-defined language is reflective. For instance, it is possible to define the syntax of the language within the language. Although it is intended that strings and objects they stand for are closely related, we must carefully distinguish between them. There is an intuitive difference between long( John ) and long(" John ") (where long is assumed to be a polymorphic predicate symbol that is true for persons over 195cm and strings consisting of at least eight characters.
In addition to the syntactic function and predicate symbols there are semantic predicate symbols such as TRUE . We shall discuss in particular this predicate symbol in detail in the sequel. For this special predicate symbol we will also have to give a special semantic account (as one does for first-order logic plus equality, where the equality sign gets a fixed interpretation).As an alternative, one could add an eval construct to the language, which maps strings like "Löb" to the formula Löb. We could then define TRUE ("Löb") as eval("Löb"), which is just Löb. However, in order to keep things simple we don't follow this approach.
Traditionally at this point the semantics of the language is discussed. In our case this is not so straightforward and the meaning of syntactic theory is one of the major problems with this language. Hence we shall first look again at examples in order to get an intuitive idea of the semantics, then we discuss in more detail how such a system can be build up.
On first encounter, it's hard not to consider assertions of this sort as jokes, hardly matters of serious intellectual enquiry. But when one's subject matter involves the notion of truth in a central way, for example when studying the semantic properties of a language, the jokes take on a new air of seriousness: they become genuine paradoxes. And one of the important lessons of twentieth century science, in fields as diverse as set theory, physics, and semantics, is that paradoxes matter. The significance of a paradox is never the paradox itself, but what it is a symptom of. For a paradox demonstrates that our understanding of some basic concept or cluster of concepts is crucially flawed, that the concepts break down in limiting cases. And although the limiting cases may strike us as odd or unlikely, or even amusing, the flaw itself is a feature of the concepts, not the limiting cases that bring it to the fore. If the concepts are important ones, this is no laughing matter.The existence of problematic expressions in syntactic theory has been known back to the start of the 20th century. Grelling and Nelson [Grelling07] described paradoxes in syntactic theory long before the language was made precise by Tarski. They defined a predicate heterogeneous as FORALLx . heterogeneous("x") <=> ~ x("x"). If we take the predicate long on strings, we get heterogeneous("long"), since ~long("long") holds. The paradox occurs when we check, whether heterogeneous is heterogeneous or not. By definition we get: heterogeneous("heterogeneous") <=> ~heterogeneous("heterogeneous"). While this expression is not a first-order expression, and not expressible in our language, closely related ones are, like the liar sentence. We will have a closer look now at the problematic sentences 6-9 of section 2.
Jon Barwise, John Etchemendy. The Liar. 1987
If we assume a standard first-order semantics there is no problem with the liar sentence 6, since the strings like " L " and " A " are ordinary ground terms in the language and have nothing to do with their counterparts L and A . However, as such they are not particularly meaningful and do not capture the intended meaning. Of course we would like the predicate TRUE to mean truth. Tarski [Tarski36] gave a famous definition of truth:In Tarski's definition it is assumed that A is a formula, since otherwise the equivalence between a formula and a non-formula wouldn't make sense. TRUE (" A ") <=> A as axiom schema for arbitrary formulae A . If we add this schema the liar example is turned into a paradox, since we get: L <=> ~ TRUE (" L ") <=> ~ L , hence L <=> ~ L . Using Tarski's definition of truth on sentence 7 results in the same problem A <=> ~ A . While one may hope to exclude these examples on the grounds that the symbol that occurs on a right hand side may not occur on a left hand side, the next example shows that this restriction does not help.
Socrates' sentence that he knows that he knows nothing, sentence 8, can't be true as soon as one adds extra information that comprises the fact usually assumed (to distinguish knowledge from belief) namely that sentences that are known by an agent are true: FORALLa .FORALLx .know(a,x) -> TRUE (x). With this axiom and the definition of truth we can firstly derive FORALLx .~know(I,x). When we secondly instantiate x with "FORALLx .~know(I,x)" we get ~know(I,"FORALLx .~know(I,x)"), the negation of the sentence we started with ("I don't know that I know nothing."). The sentence can be consistently false, assumed that Socrates knows nothing, inclusively that he doesn't know that either.
The last example, sentence 9 is not a paradox in a strict sense, but an example that violates the conservativity of definitions. It can be stated as "If this proposition is true, then A ." where A is an arbitrary formula. Formally we get Löb :<=> TRUE ("Löb") -> A . With the definition of truth this simplifies to Löb <=> (Löb -> A ). This formula is problematic in a two-valued setting since it allows to prove A for arbitrary A . Let's use truth tables to get the argument:
| Löb | <=> | (Löb | -> | A ) | |
| false | false | false | TRUE | false | |
| false | false | false | TRUE | TRUE | |
| TRUE | false | TRUE | false | false | |
| TRUE | TRUE | TRUE | TRUE | TRUE |
The only consistent way to assign truth values is the last, in which A must be true. That is, defining Löb potentially changes the semantic status of sentences.
Of course it is possible to doubt whether these examples are relevant and do really occur; also whether there is a syntactic way to exclude them. Kripke [Kripke75] stated the so-called Watergate Paradox by summarising different propositions made by a journalist Jones and by the former US president Nixon about the Watergate affair.
This example shows that it is not possible to decide just on syntactic grounds whether sentences make up a paradox or not, since this may depend on the real world context. If Nixon, for instance, always told the truth except for the statement about Jones, there wouldn't be a paradox, the two statements would simply be false.
A similar paradox - which has been reported to me by Aaron Sloman in private communication - was constructed at The University of Rummidge"Rummidge is an imaginary city, with imaginary universities and imaginary factories, inhabited by imaginary people, which occupies, for the purposes of fiction, the space where Birmingham is to be found on maps of the so-called real world." Rummidge has been invented by Lodge in [Lodge75], but the paradox is not described in Lodge's books.: In a dispute between a Dean (higher in the hierarchy) and a Head of School (lower in the hierarchy), the Dean gave a directive to a lecturer (at the bottom of the hierarchy) "not to do anything the Head of School tells him," the Head of School told him in addition "that he should strictly follow the directive of the Dean."
For a sound foundation of mathematics restrictions were developed (in form of formal set theory and type theory) in order to exclude paradoxes syntactically. While this seems to be adequate in a mathematical context, it seems not to be so in other contexts. Natural language allows to express paradoxes and any formal system that is built to represent static aspects of natural language needs to include the possibility for paradoxes as well. Furthermore there seems not to be a clearcut syntactic borderline of what should be admissible and what not.
Before we discuss related work, let's clarify what we mean by a paradox now. According to the Encyclop\aedia Britannica, 15th edition, paradoxes are "apparently self-contradictory statement[s], underlying meaning of which is revealed only by careful scrutiny. The purpose of a paradox is to arrest attention and provoke fresh thought. The statement of the architectural principle `Less is more' is an example." In a more restricted reading we will understand by a paradox, a definition which cannot be evaluated to the truth value TRUE . A definition is a sentence which introduces a new concept, that is, a constant symbol, function symbol, or predicate symbol, which has not been in the signature. A property we usually expect from a definition is that it forms a conservative extension, that is, that the semantic status of old concepts remains unchanged by the introduction of the new concept.
Let us look at two examples. To add to a knowledge base an expression like A & (~ A ) clearly makes it inconsistent. This is obvious, also it does not involve any definition, and hence it is not a paradox. If, however, we define a new constant predicate like the liar by L :<=> ~ TRUE (" L ") this leads together with the instance of the definition of truth, TRUE (" L ") <=> L to L <=> ~ L . Hence this definition cannot be TRUE . Hence we have a paradox.
Summarising I think a powerful knowledge representation system that is built to capture many features of truth, knowledge, and belief, has to deal with the phenomena of paradoxes that can be formulated in natural language. Rather than considering this as a flaw of a representation system, it should be considered as an interesting feature that models some aspects of knowledge representation in natural language. Surely paradoxes are dangerous and one has to tame them, wild ones should be excluded. Another important goal is to keep things simple. In a way, we may or may not like paradoxes, but the raison d'\etre of representation systems is not to encode paradoxes. So we should try to find a simple system that can deal with the phenomena. Before we do so we will look at related work, also in order to clarify what the options are in the presence of paradoxes. Then in section 5 I present a three-valued system with semantics and calculus that I want to advocate.
And always there are traces, and always somebody else was there,There is an abundance of literature on paradoxes and there has been groundbreaking work on paradoxes, in which the reasons for their existence as well as ways how to deal with them has been explored. So it would be easy to fill books just with an overview on this literature and it will be impossible to give a fair account on a couple of pages.It will be even more difficult to motivate that there is still room for a contribution, since everything that possibly could be said seems to be said.
and always somebody climbed even higher than you ever could, much higher.
That must not discourage you. Climb, ascend, ascend.
But there is no peak. And there is no virgin snow.
Kurt Tucholsky
Paradoxes in different forms have been known for 2500 years and have puzzled and fascinated people over all the years. They have occurred in different forms and disguises, from paradoxes of time and space to paradoxes we are interested here, logical paradoxes.
Logic has been built to speak about the truth of sentences. Intrinsically in the possibility to speak the truth is the possibility to lie. Any sufficiently strong representation language must offer this possibility - natural language certainly does. Apparent contradictions are of the type A & ~ A while paradoxes reveal themselves on a second look only. There are different approaches how to deal with paradoxes. Some of those we briefly sketch in the following.
This approach has been formalised and implemented in the multi-level meta-systems FOL [Weyhrauch80] and Getfol [Giunchiglia94c]. These systems consist of different instances of first-order logic, an object system (standard classical first-order logic) and a meta-system in which the terms stand for syntactical expressions (like terms and formulae) of the object systemsActually the systems are extended to allow different meta-levels, in particular a meta-meta-level and so on.. The relation between the meta-level and the object level is established by so-called bridge rules. They state, whenever in the meta-system TRUE (" A ") has been derived, it is possible to derive A on the object level by a rule called reflection-down, and vice versa, when A or ~ A is derived on the object level, it is possible to assume TRUE (" A ") or ~ TRUE (" A ") on the meta-level.
Let us exemplify this in showing how a simple meta-theorem, namely
FORALL A . TRUE ( A ) -> ~
TRUE ("~< A >")
can be proved in such a system.
We use O to indicate the object level and M for the meta-level.
Proof:
| M | 1 | 1 |- | TRUE (" A 0") | (Ass) |
| O | 2 | 1 |- | A 0 | ( reflection-down 1) |
| O | 3 | 1 |- | ~~ A 0 | (~~ I 2) |
| M | 4 | 1 |- | ~ TRUE ("~ A 0") | (~ reflection-up 3) |
| M | 5 | |- | TRUE (" A 0") -> ~ TRUE ("~ A 0") | ( -> I 1,4) |
| M | 6 | |- | FORALL A . TRUE ( A ) -> ~ TRUE ("~< A >") | (FORALL I 5) |
Self-referentiality is not possible, the meta-system speaks about the object system, but never about itself. Tarski's definition of truth is built into the reflection rules ( reflection-down, reflection-up, and ~ reflection-up). The liar sentence as such is not representable in the system. Useful sentences are, however, excluded as well. It is not possible to extend the system in a way that two agents can make utterances about each other in an unrestricted form. For instance, in order to represent that two agent trust each other, one might represent that A says "Everything B says is right." and B says "Everything A says is right.". This is not expressible in such a system. Alike the situations of the Watergate and the University of Rummidge paradoxes are not expressible any more. If you allow extensions of this kind, you introduce the paradoxes again in sentences such as: A says "Everything B says is wrong." and B says "Everything A says is right." or the Watergate paradoxes.
Barwise and Moss [Barwise96, p.55] give two examples of self-referential knowledge that are useful and occur in everyday contexts:
In my opinion these examples point to the conclusion that excluding self-referentiality altogether is not appropriate for (at least) some applications.
While I agree that many situations which seem to be paradoxical on the first view can be clarified to an extend that the paradox goes away, certain sentences are deliberately paradoxical (like The University of Rummidge paradox). Furthermore, in certain cases it may be difficult to see whether a certain sentence (or set of sentences) is problematic or not. For instance, why is the first definition of the barber problematic, while the second is not? I consider the feature that it is possible to derive substantial facts from the second definition ("The barber doesn't live in Oxford.") as problematic as well, since logic as such should not guarantee that there are people outside Oxford (that is, the definition should not be problematic if it were known in addition that there is no one living outside Oxford). The possibility to derive substantial facts from definitions (like in the so-called Löb paradox) clearly contradicts the idea that definitions should form conservative extensions.
In the first system with truth value gaps, no truth value is assigned to the liar sentence L . In the second, L is evaluated to the truth value nn ,It is not easy to give a not misleading name to the third truth value. unknown is misleading, since it could mean that the value is not yet known (to be either TRUE or false ) but might be eventually determined. For paradoxes this is not adequate, since it is not possible to assign a truth value false or TRUE to it. undefined fits much better, but can be confusing as well, since on the meta-level of discussion the value is defined. For these reasons, I use nn in this paper. Read it as neither-nor (or if you prefer as nomen nescio). in particular TRUE (" L ") is nn as well.
The meta-theorem above FORALL A . TRUE ( A ) -> ~ TRUE ("~< A >") does not hold anymore, since with the instance " L ", we get TRUE (" L ") -> ~ TRUE ("~ L "). This expression reduces to L -> ~~ L and this in turn to L -> L . The last expression does not have any truth value in the first system and the truth value nn in the second.
The question arises how to determine the truth values in such expressions. To this end Kripke proposes a fixpoint iteration, a process in which in each iteration truth values are assigned to more and more formulae. Kripke's approach is an appropriate general semantic account. Such a general approach is necessary for a general approach. It has been made more formal and in particular it has been given a semantic interpretation by Barwise and Etchemendy in [Barwise87]. They use Aczel's theory of hypersets (sets which do not fulfil the foundation axiom) to model self-referentiality. A more detailed description is given in [Barwise96]. A similar approach has been developed in parallel by McGee [McGee90].
A more significant disadvantage is that - compared to the first
approach - a general formula like FORALL A .
TRUE ( A ) -> ~ TRUE ("~< A >") cannot
so easily be proved anymore, but requires an inductive argument on the
construction of the formulae A . Assume, for instance,
A :<=> ~ TRUE ("P") with P a nullary predicate constant.
The corresponding instance of the formula above is:
TRUE ("~ TRUE ("P")") -> ~ TRUE ("~~ TRUE ("P")").
With the new definition of truth and elimination of double negation,
this formula can be rewritten to:
TRUE ("~
P") -> ~ TRUE ("P"), a formula which reduces to ~
P -> ~ P in turn.
In the following we will look at a three-valued approach, which can be viewed as a restricted version of what Barwise and Etchemendy call the Russellian approach. The restrictions imposed makes it possible to base it on a standard calculus for three-valued Kleene logic. Kohlhase and the author have developed efficient calculi for this logic (see [KeKo94-CADE,KeKo96-KGS,KI97]). In the next section we will introduce the three-valued logic and discuss its basic properties and problems in the framework of syntactic theory. Appropriately restricted it will provide a simple but powerful system to deal with the paradoxes in syntactic theory.
Having a fundamental flaw like this in a logic is worrisome, like carrying a loaded grenade; you never know when it might go off.In this section three-valued Kleene logic [Kleene52] is extended in order to deal with syntactic expressions. As we shall see this allows us to treat the liar sentence adequately, but is not yet quite the final system, since not all problems with paradoxes can be avoided that way. The treatment in the following corresponds roughly to the development of the mechanisation of Kleene logic developed by Kohlhase and myself in [KeKo94-CADE,KeKo96-KGS,KI97]. In the first of these contributions a resolution calculus for sorted three-valued Kleene logic is described, in the second a tableau calculus, and in the last a way is shown how to extend a classical two-valued theorem prover by a simple restriction strategy in a way that it can be used as a theorem prover for an interesting fragment of Kleene logic. I give here only a short summary of the results, for details see the corresponding papers.
Ernest Davis. Representations of Commonsense Knowledge. 1990
In addition to the treatment of Kleene logic in the papers mentioned above, in the following we have to deal with strings (a simple conservative extension as discussed above) and additionally the axiom for defining the semantics of the TRUE predicate.
| ~ | |
| false | TRUE |
| nn | nn |
| TRUE | false |
| D | |
| false | TRUE |
| nn | false |
| TRUE | TRUE |
the binary connectives & ( and) and <=> ( strongly equivalent) by
| & | false | nn | TRUE |
| false | false | false | false |
| nn | false | nn | nn |
| TRUE | false | nn | TRUE |
| <=> | false | nn | TRUE |
| false | TRUE | false | false |
| nn | false | TRUE | false |
| TRUE | false | false | TRUE |
Other connectives | , -> , and <=> can be defined in & and ~ just as in classical two-valued logic, e.g., A | B :<=> ~(~ A & ~ B ), and the corresponding truth tables can be calculated from these definitions.
The semantics of the universal quantifier is defined by I phi(FORALL in. xS A) :<=> *FORALL({ I phi,[a/x](A)| a in AS}) *FORALL(T):=
| TRUE | for
T={ TRUE } or T=emptyset |
| nn | for
T={ TRUE , nn } or T={ nn } |
| false | for false in T |
The semantics of the TRUE predicate is defined as I ( TRUE (x))=
| I ( A ) | if
x=" A " and
A is a well-formed formula |
| nn | else |
REMARK
Note that this is not a proper definition of the TRUE predicate,
since it doesn't unambiguously fixes the meaning TRUE .
Let us assume, for instance, a knowledge base
Gamma :<=> { T , T <=> TRUE (" T ")}.
This knowledge
base is consistent (just as the knowledge base
Gamma' :<=> {~ T , T <=> TRUE (" T ")}). Nothing can be
derived if someone protests his sincerity. This again corresponds to
classical logic, where a knowledge base that consists of an atom
A is consistent as well as a knowledge base that just consists of
~ A . The sentence has three fixpoints, false , nn ,
and TRUE . One would here have the option to choose one of the
fixpoints, for instance, in an order false < nn < TRUE select
the smallest or largest fixpoint, but such a choice seems to be
arbitrary. The sentence seems in its semantics as open as a predicate
symbol P, which can be interpreted as false , nn , or
TRUE as well. As a consequence, in the semantics we just fix the
meaning of TRUE to the extent that it mirrors Tarski's definition
of truth and build a calculus which exactly corresponds to this. Of
course we have to guarantee that a fixpoint exists, since otherwise we
would run into the problems we try to avoid. We will give some
indication we fixpoints exist later. A formal construction - making
use of Kripke's, Barwise, Etchemendy's and Moss' work - is left for
the future.
REMARK
If we assume this semantics it is possible to consistently assign
truth values to the liar sentence, namely with I ( L )= nn
the whole equivalence is true (and hence not causing any trouble).
| L | <=> | ~ | L | |
| false | false | TRUE | false | |
| nn | TRUE | nn | nn | |
| TRUE | false | false | TRUE |
In the following a refutation tableau calculus for three-valued Kleene logic is presented, the details can be found in [KeKo96-KGS]. Let A be a formula, then A alpha is a labelled formula if emptyset =/= alpha subset { false , nn , TRUE }.
In order to show that a formula B follows from the formula set { A 1,... , A n} it is possible to prove that the set of labelled formulae { A 1 TRUE ,... , A n TRUE , B nn , false } is unsatisfiable.The formulae in the formula set are implicitly conjuncts. Formulae like B nn , false stand for B is nn or B is false . That is, it is shown that the assumption that the A i hold and B is nn or false is inconsistent ("quartum non datur"-principle).
| (~ A ) TRUE |
| A false |
| (~ A ) nn |
| A nn |
| (~ A ) false |
| A TRUE |
| (~ A ) nn,true |
| A nn,true |
| (~ A ) nn,true |
| A nn,true |
The D rule for the nn case closes the branch (an explicit symbol * is used for that), since ( D A ) nn is unsatisfiable.
| ( D A ) TRUE |
| A false,true |
| ( D A ) nn |
| * |
| ( D A ) false |
| A nn |
| ( D A ) nn,true |
| A false,true |
| ( D A ) nn,true |
| A nn |
| ( A & B ) TRUE |
|
A TRUE B TRUE |
| ( A & B ) nn |
|
A nn,true B nn,true A nn | B nn |
| ( A & B ) false |
| A false | B false |
| ( A & B ) nn,true |
|
A nn,true B nn,true |
| ( A & B ) nn,true |
| A nn,true | B nn,true |
| (FORALL xS . A ) TRUE |
| [yS/xS] A TRUE |
| (FORALL xS . A ) nn |
|
[f(y1,... ,yn)/xS] A nn [yS/xS] A nn , TRUE (f(y1,... ,yn) in S) TRUE |
| (FORALL xS . A ) false |
|
[f(y1,... ,yn)/xS] A false (f(y1,... ,yn) in S) TRUE |
| (FORALL xS . A ) nn,true |
| [yS/xS] A nn,true |
| (FORALL xS . A ) nn,true |
|
[f(y1,... ,yn)/xS] A nn,true (f(y1,... ,yn) in S) TRUE |
For details of the quantifier rules see [KeKo96-KGS].
Now we only need tableau closure rules: The cut rule and the strict rule
| A alpha B beta |
| *|SC(sigma) |
| CGamma (t in D ) false |
| *|SC(sigma) |
We quote the soundness and completeness theorem from [KeKo96-KGS]:
THEOREM [Soundness and Completeness]
REMARK
In order to ensure soundness and completeness for the syntactic
theory add axioms that exactly capture the meaning of the semantic
predicates, i.e., in the case of TRUE add
TRUE (" A ") <=> A either as axiom schema or as
simplification rules:
| ( TRUE (" A ")) alpha |
| ( A ) alpha |
| ( TRUE (" A ")) alpha |
| ( TRUE (" A ")) alpha intersect{ nn } |
| ( A ) alpha |
| ( TRUE (" A ")) alpha |
REMARK
In addition to the rules above we could introduce tableau rules for
dealing with the <=> connective
| ( A <=> B ) TRUE |
| A TRUE B TRUE | A nn B nn | A false B false |
| ( A <=> B ) nn |
| * |
| ( A <=> B ) false |
| A TRUE B nn | A TRUE B false | A nn B TRUE | A nn B false | A false B TRUE | A false B nn |
When we look at the sentence "This sentence is undefined or false",
we can define it in this system as
P :<=> ~ D TRUE (" P ") | ~ TRUE (" P ").
With the definition of truth we can simplify the expression to:
P <=> ~ D P | ~ P . If we look at the possible truth
values for P , that is, false , nn , and TRUE , we
get the following evaluation, which shows that P cannot be
consistently assigned to any of the three truth values that are
available:
| P | <=> | ~ | D | P | | | ~ | P | |
| false | false | false | TRUE | false | TRUE | TRUE | false | |
| nn | false | TRUE | false | nn | TRUE | nn | nn | |
| TRUE | false | false | TRUE | TRUE | false | false | TRUE |
In the same line, the expression G :<=> ( TRUE ("G") <=> ~ TRUE ("G")) <=> TRUE ("G") causes the same problem. That means, while we have tamed some paradoxes like the liar sentence by assigning a third truth value to them, some are still wild beasts and cause mayhem in form of contradictions.
Neither the attempt to remedy the problems with paradoxes by many-valued logics nor the problems of an ad hoc solution by introducing a further truth value are new. In 1939 Bochvar made such an attempt, but as Church pointed out the same year, the problems can be easily reconstructed in the three-valued setting again (see [Urquhart86, p.75]). In order to use Lakatos' terminology [Lakatos76]: the problem with such an ad hoc attempt can be ascribed to fact that introducing a third truth value just means to bar the monster "liar paradox" without gaining any insight why the monster occurred in the first place. As Lakatos lets say a student in his virtual classroom [Lakatos76, p.23]: "I think that if we want to learn about anything really deep, we have to study it not in its `normal', regular, usual form, but in its critical state, in fever, in passion. If you want to know the normal healthy body, study it when it is abnormal, when it is ill. If you want to know functions, study their singularities. ... But even if you were basically right, don't you see the futility of your ad hoc method? If you want to draw a borderline between counterexamples and monsters, you cannot do it in fits and starts." The next section is devoted to this question.
Actually, adding more and more truth values is a solution to the problem. For instance, we could add a fourth truth value nn 4 to deal with sentences of the kind above, which would allow a higher type of paradoxes. Analogously to Tarski's hierarchy of meta-languages, we could introduce a hierarchy of truth value systems with & defined as the minimum on the truth values in a particular order. Here starts already the problem, however. Should one use as order false < nn < nn 4< nn 5<... < nn 1000<... < TRUE , or false <... < nn 1000< nn 5< nn 4<... < nn < TRUE ? While it is intuitively clear what true and false mean and that there are some sentences - typically to be avoided - which are neither nor, it's a bit difficult to see what it means that a sentence is evaluated to a truth value nn 10. For this reason we do not follow this approach, but follow the pragmatic one to deal with just three truth values - analogously as it is a pragmatic approach to have just one single meta-level.
The source of any paradoxes of self-referentiality relies on the fact that it is possible to define a formula to which no truth value can be assigned. This can be done in the propositional logic case if and only if it is possible to define a function in the connectives that is fixpoint free on the truth values. In a two-valued system such a function is easily defined, it is just negation (lambda x .~ x), that is, as soon as negation is part of a self-referential two-valued system, paradoxes are not far away, or to put it the other way around: in a two-valued logic without negation (explicit or implicit) it is not possible to construct paradoxes. To give up negation in a knowledge representation system is, of course, a serious restriction of the expressive power.
In our three-valued setting above lambda x .~ D x | ~ x and lambda x .(x <=> ~ x) <=> x are fixpoint-free as well. If we apply one of these functions to any of the three truth values false , nn , or TRUE , the result will never be the same as the input of the function. From this the paradoxes are easily constructed.
However, if we restrict the language to a system, in which each function that can be built from the connectives has at least one fixpoint (conveniently, nn often plays that role), it is not so easy to construct self-referential paradoxes. This restriction seems not to be a very serious one, since it is rarely desired to communicate paradoxicality. If one really liked to do so one would have to go to a four-valued setting (which would have paradoxes of a higher kind in itself, of course).
As the two counterexamples above show, in a three-valued setting, no language that contains the connectives ~, | , and D and no language with unrestricted use of ~ and <=> guarantees the existence of fixpoints. ~ has nn as fixpoint, lambda x .x | x has the same fixpoint nn as well (and in addition the fixpoints false and TRUE ), and since the fixpoint property is conserved under composition.
However, this restriction would not prevent the construction of more subtle paradoxes, in which we construct paradoxes by defining a binary function f (using only ~ and | ) in the truth values such that x :<=> f(x,y) for a fixed truth value y. This would cause a problem if x =/= f(x,y) for all possible truth values x. In that case y can't be nn , since then x= nn would be a fixpoint. So let us assume that y =/= nn . Is it possible to define a function f in ~ and | such that f(x,y) =/= x for x= TRUE , x= nn , and x= false ? This is not the case, since when restricted to false and TRUE , f has also range { false , TRUE }. In order to be fixpoint free on { false , TRUE }, f(x,y) must be ~ x, since this is the only fixpoint free function of all possible 16 functions. Hence it will have the fixpoint nn on the full set { false , nn , TRUE }. (Typically there are more fixpoints than just nn .)
THEOREM
Any function built from the
connectives ~ and | contains a fixpoint.
Since the connectives & , -> , and <=> can be defined in ~ and | the property holds of course, when we add these connectives to the system.
THEOREM
Self-referential paradoxes are not constructible in the
three-valued system described above that contains just the
connectives ~, | , & , -> , and <=> .
The fixpoint property has as corollary that definitions form conservative extensions.
Let's assume two cases in a definition, i.e., an expression of the form A :<=> phi. Firstly let us assume that A does not occur in phi. We are in the traditional setting and the definition forms a conservative extension for the same reasons as usual. If, however, A occurs in phi (and phi is built up from ~, | , & , -> , and <=> ), we can interpret A by the fixpoint for phi. This fixpoint leaves the semantic status of the atomic formulae involved in phi unchanged. In particular, no additional restrictions are imposed on those. That is, Löb's paradox does not show up.
Note that even in the case of complicated expressions in which different agents make assertion with mutual relations, the fixpoint property allows us to consider everything as undefined. Take, for example, the situation in which agent A says "What B says is wrong." and B says "What A says is right." This can be formalised as: A :<=> ~ TRUE (" B ") and B :<=> TRUE (" A "). With the definition of truth, A reduces to ~ B and B to A , the first expression states that A holds if and only if B does not, while the second states that A holds if and only if B does too. Both cannot consistently hold with the truth values TRUE and false . If however, we assert nn to A and B , the problem is gone, since nn is a fixpoint of ~.
Let us now have a look why universally quantified formulae like phi <=> FORALLx .psi have a fixpoint. This is potentially problematic if x is a string which allows an instantiation with an expression that contains phi. If psi can be made false for one instantiation of x, phi is false. If psi can be made true for all instantiations, then phi is true. In other cases (it is not possible to instantiate x to get a false value, but it is not always true), we can select nn as fixpoint. This possibility is given since the propositional system of connectives is not complete.
Note that the system with truth value gaps and the three-valued system
differ. In both systems a true expression like
~long("long") evaluates to the truth value TRUE .
However, if we add a paradoxical disjunct like the liar sentence, in
the first system the truth value gap is contagious and the whole
expression ~long("long") | L does not get a truth
valueOr we could build a different three-valued system, in
which the truth table for & would look like
. That is, any expression that contains one nn expression
receives the truth value nn . As a consequence nn is
always a fixpoint of any expression.. In the second approach the
expression is evaluated to TRUE | nn , that is, to TRUE . &
false
nn
TRUE
false
false
nn
false nn
nn
nn
nn TRUE
false
nn
TRUE
In order to be able to introduce definitions (like the definition of the liar sentence, but of course of more useful sentences as well), the <=> connective cannot be abandoned altogether, but has to be allowed in definitions, that is, in the form A <=> ... or FORALLx . A (x) <=> ... , where A is a predicate constant. This does not cause problems, since the right hand side will have a fixpoint of A if A occurs in it, hence the equivalence as such can always be made TRUE .
Note that not all self-referential sentences are paradoxical. For instance, "This sentence is true" is self-referential, formally, if we call the sentence T , it can be expressed as TRUE ( T ). T :<=> TRUE (" T "). All three truth values, TRUE , false , and nn , can consistently be assigned to this sentence. Other examples include the Barwise and Moss examples mentioned in the first approach of section 4 to dealing with paradoxes ("copyright note" and "airport announcement").
Note furthermore that due to the restriction on the connectives, the system does not contain many tautologies (on the propositional logic level only the formulae TRUE (" A ") <=> A are tautological). This shouldn't be seen as a disadvantage, since under normal circumstances, we do not want to communicate tautologies, but substantial facts and then reason about consequences of them. In order to do so we enlarge the set of axioms by further assumptions and use these for deriving further formulae.
While we have seen a way how to deal with definitions of sentences, the question of how to define terms has to be dealt with differently in order to avoid running into self-referential paradoxes. Pragmatically, it's an extension of partiality that we need to know explicitly whether a term we can write down, has a meaning or not. That is, initially each term created by definition is under suspicion and needs a proof of well-definedness, like the barber defined as the man who shaves everyone in the village who does not shave himself is ill-defined and gets semantic status F, while anything stratifiable one gets the semantic status of an element of the universe of discourse. In general, whether a term has a meaning or not is a semantic property, that is, depends on the meaning and cannot be defined syntactically. The following is an example to which I was pointed by Aaron Sloman (see [Sloman71] for details): "The father of the subject of this sentence is mortal," formally, phi=Mortal(Father(subject("phi"))) is undefined since nobody can be his own father, while an expression with the same structure like phi=integer(square(subject("phi"))) can be given a meaning, since 0 and 1 are fixpoints of the square function. More challenging is the question whether a definition involving terms forms a conservative extension. In this work, we do not follow this up.
"Logic!" said the Professor half to himself. "Why don't they teach logic at these schools? There are only three possibilities. Either your sister is telling lies, or she is mad, or she is telling the truth. You know she doesn't tell lies and it is obvious that she is not mad. For the moment then and unless any further evidence turns up, we must assume that she is telling the truth."The question is how can we restore a non-trivial system when there are no proper tautologies in it. This is achieved by considering sequents, that is, pairs consisting of a set of formulae phi and a formula A (written phi |- A ), in which all formulae in phi are assumed to be true. The semantic equivalent is the model relation phi |= A that stands for: in all models of phi, that is, all interpretations that evaluate every formula in phi to true, A holds as well, that is, A is evaluated to true as well (for details of the semantics see [KeKo96-KGS]). A proof theory that is built on the refutation principle makes use of the fact that all formulae in phi must be true and refutes the assumptions that A might be false or nn . Note the asymmetry between phi and A in this approach, formulae in phi can't be nn , while A might be.
Clive Staples Lewis. The Lion, the Witch and the Wardrobe. 1950
If we use a restricted language without a D connective it is possible to make use of the strategy for reusing two-valued theorem proving methods in dealing with Kleene logic as described in [KI97]. That means, any efficient standard two-valued theorem provers can be made to an efficient theorem prover for the approach above just by adding a simple restriction strategy.
This forms a calculus for first-order Kleene logic in general. In order to handle quoted expressions, we have to add axiom schemas that describe the semantics of the meta-predicates defined on them. In the case of the truth predicate TRUE , we have looked at more closely, this means we add the following axiom schema to the axioms (in an efficient implementation we would use a corresponding simplification rule labelled by the truth value TRUE ).
| TRUE (" A ") <=> A for well-formed formulae A |
The main results from [KI97] are:
THEOREM [Conservativity]
Each tableau proof (using compound rules only) for a normal problem
phi |= T (no D in phi and T) in 3-valued logic can be
isomorphically transformed into a tableau proof in 2-valued
logic.
The idea is to use the structural similarity between the tableau
rules for truth value sets containing nn and those not
containing nn . It is possible to just eliminate nn
in these rules. For instance, one gets
| (A & B) nn , TRUE |
|
A nn , TRUE B nn , TRUE |
| (A & B) TRUE |
|
A TRUE B TRUE |
Since it is possible to avoid rules for the truth value nn altogether by using combined rules (for false , nn and nn , TRUE ), it is possible to establish the following strategy result, which allows to employ standard first-order theorem proving techniques for Kleene logic as well.
THEOREM [Strategy]
Let S be the control strategy never to close a tableau on
two formulae both stemming from the theorem, then each two-valued
tableau proof found using S can be lifted to a three-valued
tableau proof for the theorem.
REMARK
The strategy S is not complete for classical first-order
logic, since it separates those theorems which hold in Kleene logic
(and hence in two-valued first-order logic as well) from those
holding in two-valued first-order logic (but not in Kleene logic).
With the conservativity theorem, it is possible to prove that
S is complete for proving consequents Gamma |- A with
formulae that use as connectives only the connectives ~ and
| (and those that can be defined in them).
REMARK
The result from [KI97] leaves us with the question how to treat
the <=> connective efficiently. As mentioned above, we could use
tableau rules for <=> . However, we make only restricted use of
the <=> connective, namely for definitions ("
definiendum <=> definiens"). For that reason only the
first of the three tableau rules (( A <=> B ) TRUE ) is
relevant at all. But even this rule splits the tableau into three
tableaux, hence it would be rather inefficient. Even more seriously,
the strategy result mentioned above could not be conserved. Instead of
using this rule, we simply assume that in proofs all definitions can
be expanded, that is, we take a definition expansion rule into the
calculus (for arbitrary truth value sets alpha ).
| A alpha ( A <=> B ) TRUE |
| B alpha |
A logical theory may be tested by its capacity for dealing with puzzles, and it is a wholesome plan, in thinking about logic, to stock the mind with as many puzzles as possible, since these serve much the same purpose as is served by experiments in physical science.In this section, we look at some examples again in order to clarify the three-valued restricted approach to syntactic theory.
Bertrand Russell, quoted from [Poundstone88, p.15]

If the knowledge base just contains the definition of the liar sentence, but does not assert that it holds, that is, Gamma :<=> { L <=> ~ TRUE (" L ")} we can just derive L <=> ~ L (with possibly nested TRUE predicates around expressions). If there is nothing else in the knowledge base, nothing else can be derived from this formula. In case there are other formulae containing L , we are allowed to replace L by ~ L . The formula L <=> ~ L is true, since L is nn and nn <=> nn is true. That is, just defining the liar sentence does not cause any problem. By our remarks above, we get that in general it is not possible to introduce contradictions by definitions.
Assume now a simple knowledge base, in which agent A says " B says the truth", B says " C says the truth" and C says "Joe is the murderer", let's furthermore assume that A speaks the truth. Formally we have the knowledge base Gamma :<=> { TRUE (" A "), A <=> TRUE (" B "), B <=> TRUE ("C"),C <=> TRUE ("murderer(Joe)")}. Expanding the definition of truth, we get A , again by the definition of truth and the <=> rule, we get B , in the next step C, and finally murderer(Joe).
Let us now assume a different knowledge base, where A says " B lies" and B says " A says the truth". Formally the knowledge base consists of Gamma :<=> { A <=> ~ TRUE (" B "), B <=> TRUE (" A ")}. By the definition of truth and the <=> rule, we can derive from Gamma the formula A <=> ~ A , a formula that is perfectly acceptable when we assume that A is nn . If we add to the knowledge base A , however, the knowledge base is inconsistent. In that case, we can derive from the equivalence ~ A as well. With A and ~ A we can close the tableau.
| Löb | <=> | (Löb | -> | A ) | |
| nn | TRUE | nn | nn | nn | |
| nn | TRUE | nn | nn | false | |
| TRUE | TRUE | TRUE | TRUE | TRUE |
The truth, as always, will be far strangerWe have seen in this article a three-valued approach to allow self-referential sentences in a formal system almost without any restrictions. The only restriction that we have to impose on such a system, is to guarantee that any definition gives us a fixpoint. For predicative definitions this can be guaranteed by definitions which make use only of the standard connectives, while excluding connectives which would allow to speak about paradoxicality. On the first view this seems to trivialise the system, since by this restriction we have eliminated almost all tautologies from the system. However, when we assume a background theory, with respect to which reasoning takes place, this is not a serious restriction at all. All formulae in the background theory are assumed to be true (this is equivalent to saying, they are not false and not paradoxical). If we assume a paradoxical formula in the background theory as true, the system becomes inconsistent (and not paradoxical), just as a classical system becomes inconsistent by assuming a formula and its negation.
Arthur C. Clarke. 2001 - A Space Odyssey. 1968.
Interestingly the restriction on the connectives allows to transfer standard two-valued theorem proving techniques to the three-valued case.
Summarising, we have presented a powerful system for stating facts about truth. By adding further axiom schemas it is easy to extend the framework to deal with knowledge and belief as well. No strange restrictions on the expressive power are made in order to avoid self-referential statements. As shown in section \refsec:reasoning standard calculi (with slight restrictions) can be used. The work may also pave the path to deal with paradoxes in higher-order logic without syntactically excluding them.
In this work we have not produced a formal fixpoint construction for the TRUE predicate, but only given an indication why such fixpoints exist. Building on the work of Kripke, Barwise, Etchemendy, and Moss a formalisation should be possible, but is left as future work.