^{1}School of Computer Science
The University of Birmingham
Birmingham B15 2TT, England
http://www.cs.bham.ac.uk/~mmk
^{2}Fachbereich Informatik
Universität des Saarlandes
66041 Saarbrücken, Germany
http://www.ags.unisb.de/~pollet
ABSTRACT
In this contribution we address two related questions. Firstly, we want to shed light on the question how to use a representation formalism to represent a given problem. Secondly, we want to find out how different formalizations are related and in particular how it is possible to check that one formalization entails another. The latter question is a tough nut for mathematical knowledge management systems, since it amounts to the question, how a system can recognize that a solution to a problem is already available, although possibly in disguise. As our starting point we take McCarthy's 1964 mutilated checkerboard challenge problem for proof procedures and compare some of its different formalizations.
We will use the socalled "mutilated checkerboard problem" to study the relationship between different problem formalizations. This problem was introduced by McCarthy as a challenge problem for proof procedures [McCarthy64]. While the challenge was mastered by different proof procedures, it is still a challenge for mathematical knowledge management.
In McCarthy's original paper [McCarthy64] we can find two different formalizations. McCarthy [McCarthy95] himself presented at the second QED workshop a different formalization which makes  compared to the original paper  use of settheoretical notions and basic integer arithmetic. We will also look at the formalizations in Isabelle [paulson01] and Coq [huet96].
There are various proofs of the problem available. Some are close to one of McCarthy's formalizations (e.g., [Bancerek95]), others are using formalizations which are significantly different and need creative thought to understand that they are related to the original problem. Even the close formalizations of [McCarthy95] and [Bancerek95] need some adaptation in order to see that they can be mapped into each other. While McCarthy numbers the checkerboard from 0 to 7, Bancerek uses 1 to 8. A shift of an index in an array by 1 is a trivial rerepresentation to anybody with mathematical training, but it has to be recognized as an index shift. Neither do the strings directly match, nor will simple unification do, since 0 is not 1 and 7 not 8.
Why would not everybody just take McCarthy's original formalization? Obviously, different formal systems have different strengths and limitations. The first formalization by McCarthy is a first order formalization which does not make use of equality and function symbols. While this is a very restricted formalism, some systems can deal with just this formalism only. McCarthy's second formalism contains function symbols and equality. A reasoner which can deal with function symbols and equality will have special procedures how to do that and would probably not live up to its strengths if given the first formalization. Likewise Isabelle and Coq, which have very powerful representation languages, have strengths which these systems could not use if they had to stick to one of the original formalizations. For this reason it should be considered as legitimate that each system user chooses a representation which suits their system best. However, when we take a closer look at the formalizations, it is neither trivial to see that they solve the original problem nor to see that they can be identified or subsumed by one another.
While problems such as a mutilated checkerboard problem exhibit a big gap between their informal description in natural language and their various formal representations, the problem is universal in mathematics. Typically even in a single very strict formal system it is possible to say the same thing in a wide variety of ways and not for all of these possibilities it is obvious that they are equivalent on some level. The reason why one formalization entails another may involve a simple syntactic modification, or may rest on a deep semantic connection.
While the original challenge of the problem was the size of the search space in various formalizations  and still is for a naive usage  the much bigger challenge for proof procedures is that a good proof contains a creative invention, namely the colouring of the arrays of the board and the domino so that the squares with different colours can be counted. For the solution we can differentiate between the following phases:
The formulation of step 2 depends on the problem description given in step 1, and that the proof idea consists of both step 2 and step 3.
At each step there are several possibilities, colouring the arrays is only one of several creative inventions. Most publications which have taken up the challenge start with the motivation that their problem formalization is adequate with respect to the informal description and concentrate on steps 2 and 3.
In this contribution, we want to investigate the different possibilities and choices which can be made in step 1. The mathematical problem formalization assumes certain background knowledge that is available to the reader. Only with this background knowledge it is possible to understand why the problem formalization actually covers the original problem. The main objects, notions, and properties used (explicitly or implicitly) in the proofs are the following.
Later, the creative invention of colouring corresponds to mapping squares, represented by pairs of numbers, to the set {black,white}. Sets can be counted, mapped to numbers, so we have cardinality and relations for the comparison of cardinality.
McCarthy gives already two problem formalizations in his initial paper [McCarthy64] and a settheoretic description thirty years later [McCarthy95]. Whereas the early formalizations describe the problem completely, the latter builds on existing concepts of set theory, including cardinality, and integer arithmetic.
S(x,y)  y=x+1  
L(x,y)  x<y  
E(x,y)  x=y  
G^{1}(x,y),... ,G^{4}(x,y)  square (x,y) and the top/right/bottom/left neighbour square are covered by a domino  
G^{5}(x,y)  square (x,y) is uncovered 
& S(5,6) & S(6,7) & S(7,8)

properties of numbers  

placement of dominoes  

uncovered squares  

adjacency of dominoes  

border of the board 

eight distinct numbers  

uncovered squares  

adjacency of dominoes  

border of the board  

names for numbers  

covering 
Definitions  
Board  =  IN t_{8} x IN t_{8} 
mutilatedboard  =  Board \ {(0,0),(7,7)} 
dominoonboard(x)  <>  (x SUBSET Board) & card(x) =2 & (FORALL x_{1} x_{2})(x_{1} NOT = x_{2} & x_{1} IN x & x_{2} IN x > adjacent(x_{1},x_{2})) 
<>  (x SUBSET Board) & card(x) =2 & (FORALL x_{1} x_{2})(x = {x_{1},x_{2}} > adjacent(x_{1},x_{2}))  
adjacent(x_{1},x_{2})  <>  c(x_{1},1)  c(x_{2},1) =1 & c(x_{1},2) = c(x_{2},2)  c(x_{1},2)  c(x_{2},2) =1 & c(x_{1},1) = c(x_{2},1) 
<>  c(x_{1},1)  c(x_{2},1) + c(x_{1},2)  c(x_{2},2) = 1  
c((x,y),1)  =  x 
c((x,y),2)  =  y 
partialcovering(z)  <>  (FORALL x)(x IN z > dominoonboard(x)) & (FORALL x y)(x IN z & y IN z > x = y  x intersection y = {}) 
Theorem  
~ (EXISTS z)(partialcovering(z) & UNION z = mutilatedboard) 
Note that McCarthy defines dominoonboard and adjacent in two equivalent ways. Which definition to prefer depends on the context of its usage.
Paulson presented his formalization and proof of the checkerboard problem in the Isabelle system [paulson01].
Definitions  
LessThan(m)  :=  {i IN N  i < m} 
board  :=  LessThan(2* s(m)) x LessThan(2* s(n)) 
tiling(A):set(set( alpha ))  :=  {} IN tiling(A) & (a IN A & t IN tiling(a) & a intersection t = {}) > a union t IN tiling(a) 
domino:set(set(N x N))  :=  { (i,j),(i,s(j))} IN domino & { (i,j),(s(i),j)} IN domino 
Theorem  
(board\ {(0,0)})\ {(s(2* m),s(2* n))} NOT in tiling(domino) 
Huet [huet96] formulates the problem on a level which is more abstract than all the other formalizations. The formalization is based on properties such as injectivity and surjectivity of functions, and finiteness (characterized on the basis of injectivity and surjectivity). The geometry of the board is not considered at all, numbers or cardinals are not necessary for the argument.
We can generate from the full checkerboard a mutilated one by taking B' as a proper subset of B. The theorem is then that for an injective function Board' with Board':B > W and finite(B') there is no function Domino:W > B' which is injective (i.e., a  possibly partial  covering) and surjective (i.e., total).
More concretely we can say, if B < W (that is, we have strictly fewer black than white elements) then we cannot have W = B (that is, equal numbers which would follow from a covering with 2 x 1dominoes).
Different systems make use of different problem representations. Fig. 1 gives an overview (which is not complete with respect to problem representations as well as solutions). The initial formalization M64a was verified by the model generator Mace, only the symbols L and E together with the corresponding axioms were replaced by builtin concepts [mccune95]. The proof in the TPS system uses exactly the formalization given as M64b [Andrews96]. There are two proofs in the Mizar system for M95 [Bancerek95,Rudnicki96]. Paulson and Huet constructed proofs for their formalizations in Isabelle [paulson01] and Coq [huet96], respectively. KP05 can be proved in almost any formal system by calling the corresponding calculus level rule.
Every given solution is a formal proof for the corresponding problem representation and thus for the informal representation of the checkerboard problem. While it is difficult to be more rigorous with respect to the informal statement  and probably impossible to be fully rigorous  we can ask whether the Coq proof is a solution for the formalization given as M64a (represented as a dotted line in Fig. 1) and ask for a formal justification. In extreme cases (relationship between M64a and KP05), such a formal justification may be as difficult as a proof of the one problem (here M64a) from first principles. We will come back to the implications of this observation for mathematical knowledge management, but first we look at concrete problem transformations.
Each formalization can be looked at from different perspectives. M64a is very concrete and its proof captures exactly the original problem and is difficult to generalize. M64b is slightly more general (also capturing tori). M95 although concrete in the formulation, allows for an easy generalization of the argument.
P96 and H96 are much more general and go beyond the concrete 8 x 8mutilated checkerboard. In Paulson's formulation the problem is shown for any 2m x 2n rectangular structure in which two opposite corners are missing, while Huet's formulation entails that arbitrary finite sets of black and white squares with an injective mapping from black to white sets, which is not surjective, cannot allow for an injective mapping form the white to the black squares. This proof is very abstract and many different aspects of the problem are irrelevant: the relative relationship of black and white squares, how many fewer black squares than white squares we have, and the shape of the dominoes. Important is only that the set is finite and that a domino covers exactly one black and one white square (actually the problem representation does not speak of coverings, so this is yet another interpretation).
KP05 is even more general than H96. We speak only about cardinality of sets and argue that the cardinality cannot be the same and different at the same time. We do not require anymore that the sets we speak about are finite. If we cover, for instance, pairs (a,b) in which the as are rational numbers and the bs real numbers so that each rational number is covered exactly once by an a then the bs cannot cover the real numbers since there are more real numbers than rational numbers.
Even more general than KP05 is the statement true, which is logically equivalent to all true statements.
In the latter case she would need to give him a better argument (based on KP05) such as "Look at a partition of the checkerboard into two disjoint sets so that there are arguments that the two sets must have the same number of elements and at the same time must not have the same number of elements." Again the student may now believe the statement and accept this as a proof, or not.
If the previous argument is not accepted H96 can further refine it, and so on. Each formulation will get closer to the representation of a concrete checkerboard on the one hand. On the other hand, the proof of why each statement holds becomes increasingly more difficult.
M64a  M64b  
S(x,y)  s(x) = y  (1)  
1,2,... ,8  1,s(1),... , s(s(s(s(s(s(s(1)))))))  (2)  
G^{i}(x,y)  g(x,y)=i for i IN {1,... ,5}  (3)  
E(x,y)  x=y  (4) 
Why is this translation justified? (1) means that we can translate the binary predicate symbol S into a unary function symbol s. In order to justify such a translation we need to show that S not only represents a general relation, but a left total and right unique one. Let us first observe that it suffices to look at the numbers 1 through 8. S is right unique for these numbers since from the axioms 1.5. follows that the usual relationships between these numbers hold, for instance, 1<3 from S(1,2) and S(2,3) with the second and third axiom. With the third axiom follows also NOT S(1,3). Likewise NOT S(1,4) and so on, that is, S is right unique.
The left totality is a more subtle issue. While it is intentionally clear that we have to consider only the eight different numbers 1, 2, ... , 8, the formalization M64a does not warrant that only eight different numbers exist. More seriously, we get from the axioms 1.5. as a consequence that for all x IN {1,2,... ,8} holds NOT S(8,x).If we had S(8,x) for some x we would get with the second axiom L(8,x). Together with L(x,8) (which follows from axioms 1.3. for x =/= 8, and from L(8,x) for x=8) we get L(8,8), hence NOT E(8,8), which contradicts E(x,x). The argument why S can be translated to a function requires either to extend the range to all natural numbers or to give up the idea of a strict order and to consider a cyclic structure in which s(8)=1.
Both approaches are formally possible and amount to the same covering problem, assumed we never make use of s(8). Axiom 6' states that dominoes do not stick out, that is, that only the dominoes in a range from 1 through 8 are to be considered. The first approach means we view the mutilated checkerboard as a subset of the infinite plane. When putting dominoes down, we never cross the boundaries. The second approach means that we consider the checkerboard as a torus, axiom 6' puts up boundaries between the 8th row and the 1st row (and the columns likewise) which we may not cross in covering the checkerboard. Strictly speaking, axiom 6' is not necessary to prove the theorem, since the mutilated torus, that is, the full torus without the images of the two deleted corners of the checkerboard, cannot be covered by dominoes either, even if it is possible to cross the invisible boundaries, since the mutilated torus contains two white squares more than black ones.
In M64b McCarthy chooses the second view (of a torus). The transition from M64a to M64b cannot be syntactically established at a low level (but only almost), since the original binary predicate symbol S is not lefttotal. This does not matter, however, since s(8) has never to be used. Formally the second approach has the advantage that we have only finitely many (eight) different elements in the Herbrand universe, and hence has a finite search space, while the other formalization has a potentially infinite search space. Depending on which procedure one uses, it is advantageous to be more restrictive and to include 6' (for instance, when using Mace to show that no model exists), or to allow coverings of the torus which cross the magic boundary and have no correspondence of coverings in a real checkerboard with 2 x 1dominoes and exclude 6' (for instance, when proving the theorem with TPS).
As replacing the predicate symbol S by a function symbol s, it is possible to replace the predicate symbols G^{i} (with i=1,... ,5) by a function symbol g so that G^{i}(x,y) goes to g(x,y)=i. Note that initially the G^{i} are different names for binary predicate symbols (and could have been called P,Q,R,S,T instead). Since the names do not matter we can use a single ternary predicate symbol G instead which takes the index i as an argument, that is, G(x,y,i) insteadNote also that the order of the arguments does not matter, provided any reordering is done consistently throughout the whole problem description. of G^{i}(x,y). G(x,y,i) corresponds for a lefttotal and rightunique G to g(x,y)=i. The lefttotality and rightuniqueness of the G(x,y,i) follows from axioms 6.10.
Axioms 11 and 12 mean that precisely the squares (1,1) and (8,8) are missing. They are translated as g(1,1)=5 & g(8,8)=5 and g(x,y)=5 > (E(1,x) & E(1,y))  (E(8,x) & E(8,y)) which is with the translation of E(x,y) to x=y logically equivalent to axiom 3'.
Axioms 13 and 14 can be translated directly to 4' and 5' (with the simplification to rename x_{1} to x, and get rid of x_{2} by replacing it by s(x)). Axiom 6 goes to 8', and 15 to 6'.
Note that the translation from S to s and G to g leads to proof obligations, namely that the functions s and g, used to represent the relations S and G are welldefined. While this should be a straightforward syntactical proof and actually is for G, matters are more subtle with S, and intuitively need an argument why a mutilated checkerboard cannot be covered when a mutilated toroid checkerboard cannot. The translation of E to equality strictly also leads to proof obligations that the usual properties of reflexivity, symmetry, transitivity, and substitutivity are satisfied. Only the first of the four is actually formally given (in axiom 5), since only this one is needed.
M95  H96  
{(x,y) a IN Board & 0=(x+y) mod 2}  B  (1)  
{(x,y) a IN Board & 1=(x+y) mod 2}  W  (2)  
Board  Board_{H}:B > W  (3)  
set D with  Domino: W > B  (4)  
partialcovering(D) & W SUBSET UNION D 
None of the objects in the formalization in H96 exists in M95. The sets B and W have to be constructed. The mapping representing the board Board_{H} has to be defined using the set Board and the sets corresponding to B and W. The same holds for the mapping Domino, which additionally needs the set of dominoes covering all white squares.
In terms of the three steps describing the process of solving the problem given in Sec. 2, the formalization M95 is the problem formalization of step 1, the proof given for H96 corresponds to the realization of the proof in step 3, and the transformation which identifies objects in M95 with objects in H96 is the creative invention of step 2.
For the mutilated board of M95 the set B'=B\ {(0,0),(7,7)} corresponds to the restriction of Board_{H}:B > W to Board'_{H}:B' > W, and the theorem of M95 expressed with respect to the notions in H96 translates then to ~ (EXISTS Domino:W > B' injective(Domino)).
We see here that even when there is an agreement how to represent the objects, there can be a difference how properties of the objects are represented. For example, the predicate partialcovering in M95 is defined for the given board, and includes the definition of adjacent dominoes. The predicate tiling in P96 defines the covering independently from dominoes. Here we see one dimension in the choice for a formalization: the generality of the introduced concepts.
In M95 the general concept of tiling is expressed by the formula tiling'(z) <> (FORALL x y)(x IN z & y IN z > x = y  x intersection y = {}) which is equivalent to tiling in P96. Even for equivalent concepts there is a choice in the formalization. This choice may depend on the assistance provided by the different proof systems. McCarthy supposes a `heavyduty' set theory prover, whereas the choice of Paulson is motivated by the support for inductive definitions available in Isabelle.
How can we apply this statement to the case of H96? We need to know that the conditions are given, that is, we assume an injective but not surjective mapping Board: B > W, and the finiteness of B. The injectivity entails B <= W . The nonsurjectivity entails together with the finiteness of B, B =/= W . The two facts together give us B < W . Hence KP05 is applicable and we can conclude NOT W <= B . Hence there is no injective mapping Domino: W > B.
In general we can compare the process of formalization of a problem to a projection from the problem per se to different formalizations as in Fig. 2. Since the problem per se is not given in a formal way, it is difficult to reason about the correctness of a formalization. However, it should always be possible the establish the relationship between different formalizations such as F1 and F2 in Fig. 2.
The relationship between different formalizations should follow the commutative diagram in Fig. 3. Note that the relationship may be nontrivial since the different formalizations may use different formal systems. A proof for the original problem gamma mod els phi will be obtained by transforming the assumptions gamma to gamma ' from which a tranformed theorem phi ' is derived in a potentially different calculus. Then phi ' is translated back to the original theorem phi . Assumed that the translations and the calculus pi are sound, then the overall construction is a sound argument why phi follows from gamma .
It is a standard task when applying mathematical or (more stringently) formal methods to problems that we have to formalize them within a system. It is indeed an important aspect in the application of formal systems in safety critical areas, since any proof can start only once the formalization is given. If different people formalize the same problem they will typically come up with different formulations. Assumed the two formulations cannot mapped to each other in a sound way, then at least one does not capture the original problem adequately.
In order to study the process of relating different formulations to each other and to understand how to represent one example we have looked at an old problem, McCarthy's mutilated checkerboard problem. This problem is interesting since it has been represented and proved in a large variety of systems. Of course, any representation has to abstract from unimportant details, such as the size of a square on the checkerboard and its material. Other details may or may not be represented, for instance, whether we speak of a checkerboard of size 8 x 8, or arbitrary size 2n x 2n, or arbitrary finite size, or arbitrary size. Perhaps astonishingly, this easy problem exemplifies a wide range of problems involved.
Different formalizations are used all the time, and certain standard translations, such as different versions of relativization in sorted logic, representations of equality in a logic without equality, reification of higher order expressions in first order logic, or functional versus predicative formalizations, are used quite regularly when representing problems formally. Often certain relationships between different formalizations are proved by a metalogical argument (e.g., the sort theorem for the relativization).
There is also a very general mechanism how different representations can be linked to each other. Farmer et al. [Farmer92,Farmer00] have introduced a theory mapping in their little theories approach. One representation can be mapped faithfully to another if the axioms of the one become theorems in the other in a systematic translation.
Two major challenges follow from the mutilated checkerboard problem for mathematical knowledge management. Obviously the problem is not restricted to this particular problem, but a general one.
Firstly, if a system is told that H96 represents a proof for the mutilated checkerboard problem, how can it automatically check that this is so indeed. Secondly  even more difficult  if a system is given a problem such as the mutilated checkerboard problem, how can it find relevant information which may be given in forms so diverse as M64a, M64b, M95, P96, H96, or KP05.
The second problem may be eased by proper annotations. The first in its full generality can be arbitrarily hard. While retrieving and relating information given in different shapes seems a core activity not only in mathematical knowledge processing, the ways in which mathematical information is transformed is very rich. A good representation is one that is simple and at the appropriate level so that it captures the main ideas of the proof.
What is appropriate is, however, not independent of the actual user. An argument about cardinalities, for instance, requires at least some basic knowledge of the concepts by the user. Huet's elegant proof remains obscure to somebody who does not know anything about the relationships between injective and surjective mappings on finite sets of the same cardinality, and so on.
As Ayer [Ayer36, p.85f] puts it "A being whose intellect was infinitely powerful would take no interest in logic and mathematics. For he would be able to see at a glance everything that his definitions implied, and, accordingly could never learn anything from logical inference which he was not fully conscious of already." Everything is trivial for such a being. Experts may do with some easy arguments on a high level, while beginners need lowlevel lengthy arguments at a low level. The relationships between the different levels are often syntactic, but can be complicated in detail.
A truly helpful system would not only find relevant information but also present it to a user on an appropriate level. A good understanding of typical transformations is an important step toward such a system.