Designing programs

Suppose you wish to get sugar from the food shop near your house. If you had another adult whom you could send, you might say something like: "Go to the shop and get me some sugar". To an adult who knows the way, this is adequate set of instructions. If the person was a stranger to your locality, you would need to write more precise instructions detailing the route, for instance the names of the roads walked along. However, if the sugar was being bought by a robot rather than an highly intelligent person, then the instructions would need to be very precise. For instance, you might have to specify the direction the robot should travel in each time it came to a road junction. This does rather assume that the robot can understand instructions such as "turn right", "move twenty metres", etc. This level can be thought of as a high-level programming language for robots. Below this would be the low-level machine instructions, each of which would perform some minute function, such as bend the right knee, exert pressure on the left foot (to push the robot forward).

Here we see some of the problems of programming. A computer is mercilessly exact in its execution of programs: it cannot put in what you forget to include and it will do exactly what you ask it to do, even if this involves recursing forever.

When writing a program we are writing a specification of a task. Some tasks are inherently general. For instance, we recognise that the job of sorting a list of numbers can be generalised to accommodate any list of numbers of reasonable finite length. Such very general tasks turn up again and again in computing and therefore people have developed algorithms that describe how to perform it efficiently. (An algorithm is a process that is guaranteed to halt in finite time.) An algorithm is machine and language neutral. That is to say that an algorithm will stay the same while machines can be replaced by more powerful systems and newer languages developed in which the algorithm can be coded. This does mean that we must have a language in which to describe the algorithm, preferably a language which is neutral in respect of any computer language. However, what usually happens is that interesting algorithms are coded in versions for particular programming languages, so in a Prolog text book, it is common to see sorting algorithms such as bubble sort and quick sort, tree traversal algorithms as well as game playing search algorithms such as minimax.

As an aside, this throws some light on criteria for choosing a language for a particular task. One factor (amongst very many) in the choice of a language is the ease with which the necessary algorithms can be written. Prolog's strengths lie in its pattern-matching features (through unification) and its support of non-determinism (which uses backtracking search), which make it useful for symbolic computation such as natural language processing. It is generally unsuitable for problems that require fast processing of arithmetic, for instance neural networks.

Given a task to program, we set out to write an algorithm that will accomplish that task. This is a difficult intellectual activity. Most programmers find it easier to write a program in some programming language if given an algorithm (ie a method for accomplishing a task) than they do to invent an algorithm in the first place. There is no guaranteed method that will allow you to start with a specification and work through an algorithm to a working and complete program. "The design of algorithms requires creativity and insight, and no general rules can be formulated." (Goldschlager and Lister, 1988, p. 11). Computer scientists frequently aid the design process by using a technique or framework that assists in the analysis of a problem and dividing it into successively smaller parts, each of which can be conquered individually.

Stepwise refinement

One framework that is fairly widely used is that of "stepwise refinement". It lends itself very well to application to procedural languages but can also be applied to functional and logic programming languages. It has a simple idea: a problem is refined by stages to a complete program. (It is a kind of "divide and conquer" strategy.) Prolog (like other logic and functional languages) has one advantage over procedural languages in that sub-parts of the program are easily testable.

To illustrate the method, we'll study the problem of the magic square.[+] Magic squares are n x n squares of numbers, filled with the integers from 1 to (n*n). The rows, columns and diagonals of the squares have to all add up to the same number. This number can be calculated by adding all integers from 1 to (n*n) and dividing by the number of rows.

As an example, suppose we want to discover if a 3 x 3 magic square exists. This would have to be filled with the integers from 1 to (3*3), ie from 1 to 9. The rows, columns and diagonals all have to add up to 45 (the sum of the integers from 1 to 9) divided by 3 (the number of rows). So each row, column and diagonal must add up to 15.

A natural language specification of the problem might read something like:

A magic square is 3 x 3 square, filled with the integers from 1 to 9. The numbers in each row, column and diagonal must add up to 15. Write a program that will compute and display a magic square.

Method
The first stage is to ensure that you understand the problem. Here you usually have to write notes and draw diagrams. For instance, you might draw a representation of a magic square:

          -------------
          | 1 | 4 | 7 |
          -------------
          | 2 | 5 | 8 |
          -------------
          | 3 | 6 | 9 |
          -------------

and the numbers to be put inserted into the square:

     1,2,3,4,5,6,7,8,9.

Then there are a number headings to consider.

Outputs and inputs
It may seem strange to decide what you want out of the program before you describe the inputs, but knowing what the program has to produce usually helps you to decide what the inputs should be.

The major output should be as drawn above for successful squares. If no magic square exists, the Prolog computation will fail and the word 'no' will be displayed by Prolog.

There are no inputs, either in the argument list or while the program is running.

Data structures
There are two major data structures.

The first will contain the numbers from 1 to 9. As a first approximation, we'll write these as a list:

     [1, 2, 3, 4, 5, 6, 7, 8, 9]

We could have used a structure with nine arguments. However, it would be more difficult to modify the program if we wanted to extend the program to a 4 x 4, 5 x 5, etc. magic square.

The second will hold the square. We can encode this as a list too, with the first member representing the first cell labelled a, the second the second cell labelled b, etc.

          -------------
          | a | b | c |
          -------------
          | d | e | f |
          -------------
          | g | h | i |
          -------------

If our program doesn't work first time, we may need to trace its execution. It would help us if we have some way of knowing which cell of the square the program is processing. It might therefore be more convenient if we encode each cell as a structure with two arguments, the first being the letter that identifies each cell and the second argument holding the integer.

So the square could be written as:

     [cell(a, Cell_a), cell(b, Cell_b), ..., cell(i, Cell_i)]

The algorithm
There are three basic processes in this task: filling the square, testing the square and displaying the square.

We can write this down better as follows:

This doesn't look anything like Prolog but we won't worry about this at this stage. Instead we will number each line:

{1}	generate square
{2}	test square
{3}	output square

and then attempt to expand each line.

{1}	Generate square

Notes:
This will fill an empty square from the list of numbers.

Outputs and inputs:
The output is a square with all cells instantiated to unique integers.

The input is an empty square and the list of numbers.

Data structures:
The square (an output and input) as defined in the top-level.

The list of numbers (an output) as defined in the top-level).

Algorithm:
There are the same number of empty cells as there are numbers and both the cells and the numbers are contained in lists. The obvious technique is to recurse over one list or the other, but which list? The list of numbers is simpler because each element is atomic, whereas each element in the list of cells is a structure of arity two.

As this refinement is using recursion, the boundary condition or conditions should be identified. If we have to place all numbers from the list of integers into the square, the obvious boundary condition will be when this list is empty. The complementary condition is when the list of integers isn't empty. So we can sketch out two heads of clauses for this refinement:

{1.1}	% 1 boundary condition
	fill_square([], Cells) :- ...
{1.2}	% 2 recursive condition
	fill_square([Number|Tail], Cells) :- 

If we look back over outputs and inputs we identified one output (the square) and two inputs (the square and the integers). This does rather suggest that we ought to have three arguments, along the lines of:

     fill_square(Numbers, Cells_In, Cells_Out) :-

This is quite an important point, because it makes a great deal of difference to the ease with which we can write our program. If we use Cells_In and Cells_Out, then presumably we're intending to find an empty cell in Cells_In and copy a new version into Cells_Out.

This method is unduly complicated because it requires us to maintain both Cells_In and Cells_Out correctly. However, if we allow the second argument of each cell/2 structure to be uninstantiated then we merely have to arrange for each second argument to be instantiated in location.

So, looking again at the current refinement.

{1.1}

This is complete as a fact and can be written as:

     % 1 boundary condition

fill_square([], _).

(Note that we've also changed the second argument to an anonymous variable because otherwise its name would only occur once in this clause.)

{1.2}

We wrote this as:

     fill_square([Number|Tail], Cells) :-

We need to insert Number into Cells. If we use the conventional definition of member/2 in the following way:

     | ?- member(cell(Name, 7), [cell(a, Cell_a), cell(b, Cell_b)]).
     Name = a,
     Cell_a = 7 ? ;

     Name = b,
     Cell_b = 7 ? ;

     no

we can see that Name (uninstantiated in the first argument) is first instantiated to a and then to b, and that Cell_a is first instantiated to 7 and then to an anonymous variable and Cell_b vice versa. Thus we can see that member/2 will shuffle the numbers around the cells in the square.

We can write the full clause as:

     % 2 recursive condition
     fill_square([Number|Tail], Cells) :-
          member(cell(_, Number), Cells),
          fill_square(Tail, Cells).

Is this second clause complete? Should we test to insure that Number is an integer (ie by adding integer(Number) as a sub-goal)? We could do so, but we may judge it to be unnecessary (and therefore time wasting) as this goal has to be executed every time fill_square/2 %2 is called, because we can assume that the list of numbers has been verified when it is first entered in the program.

Testing:
We have refined part {1} but our task is not yet complete. We need to devise some queries that will allow us to rigorously test our code.

Clause 1:

This will only accept queries that have [] as the first argument and anything as a second argument. We could therefore have a query such as:

     fill_square([], [cell(a, Cell_a), cell(b, Cell_b)]).

As this is our instruction to run the program and our input, we should also record what our expect output is. Here it will be two variable bindings, both to uninstantiated variables, eg:

          Cell_a=_13537
          Cell_b=_13539

Clause 2:

This will only accept queries that have a list in the form [Head|Tail] as the first argument and the second argument must be a list, with each element in the form cell(Name, Number). We could have a query such as:

     | ?- fill_square([1,2], [cell(a, Cell_a), cell(b, Cell_b)]).

It is not sufficient to test the procedure to see if it will accept what you expect. It must also be tested to ensure that it doesn't produce undesirable results from unexpected inputs. We could use a query that has uninstantiated variables for each of its arguments, eg:

     | ?- fill_square(Var1, Var2).

Unfortunately this can cause problems if we have written a procedure that doesn't terminate under unusual circumstances, ie the search tree has an infinite branch. We could try something less general, for instance queries with non-lists as arguments:

     | ?- fill_square(1, Var).
     | ?- fill_square(Var, 2).
     | ?- fill_square(cell(a,2), Var).

This looks a rather odd assortment of queries selected by no obvious criterion. The reason for this is quite obvious after a little thought. It is relatively easy to work out a representative set of legitimate queries, but rather more difficult to enumerate a representative set of "illegal" queries. If our ability to test the programs is so limited, this does make it rather difficult to believe that a program will not produce undesirable results. The best we can hope for is that by being methodical, we are able to produce programs which have been reasonably tested. In Prolog (as with other logic and functional languages) we are able to test each procedure as we write them, rather than having them embedded within complete programs (as is necessary with imperative languages like Pascal and C++).

We have completed the refinement of part {1} and in so doing, have moved straight to program code. We will now refine the remaining parts in the same way.

{2}	Test square

Notes:

This will check a square to ensure that each row, column and diagonal sums to the special total (which for the 3 x 3 square is 15).

Outputs and inputs:
There is no output.

The input is a square with each cell instantiated to a number. The algorithm has to access the "magic" number of 15.

Data structures:
The square as defined in the top-level.

Algorithm:
Add the numbers of each cell in the first row and check that it sums to the "magic" number; do the same for the following rows, then for the columns, then the two diagonals.

Head of the clause:

We need to be able to access each cell individually and one of doing this is to enumerate each element in the list representing the square.

     test_square([cell(a, Cell_a), cell(b, Cell_b), cell(c, Cell_c), 
                  cell(d, Cell_d), cell(e, Cell_e), cell(f, Cell_f),
                  cell(g, Cell_g), cell(h, Cell_h), cell(i, Cell_i)]) :-

This procedure need not be recursive, neither is it non-deterministic (ie there can only be one successful solution). We could write sub-goals that look like:

     15 is Cell_a + (Cell_b + Cell_c),
     15 is Cell_d + (Cell_e + Cell_f), ...

This would work, but should we wish to adapt our program later we would have to make sure that we changed every instance of the number 15 to be certain that our new version worked. This is the same problem we had when coding the width of columns in the examples above. In this instance, we could again have a fact at the beginning of the program:

     sum_of_row(15).

Now we can complete the body of this rule:

     sum_of_row(Sum),
     % test rows
     Sum is Cell_a + (Cell_b + Cell_c),
     Sum is Cell_d + (Cell_e + Cell_f),
     Sum is Cell_g + (Cell_h + Cell_i),
     % test columns
     Sum is Cell_a + (Cell_d + Cell_g),
     Sum is Cell_b + (Cell_e + Cell_h),
     Sum is Cell_c + (Cell_f + Cell_i),
     % test diagonals
     Sum is Cell_a + (Cell_e + Cell_i),
     Sum is Cell_c + (Cell_e + Cell_g).

We should review the sub goals to ensure that they are complete. There are three sub goals for the three rows; three sub goals for the three columns; and two sub goals for the two diagonals. Note also that the tests are ordered so as to help us ensure completeness. It starts with the top row, then the middle row and the bottom row. It then lists the columns from left to right.

Testing:
To test this rule, we need a query with a legal magic square as an argument. However, the point of writing the program is, in part, to find a legitimate magic square without having to calculate one manually. Therefore we don't know of a legal magic square with which to test the rule. This is not unlike many problems in artificial intelligence programming where we investigate the behaviour of AI systems and cognitive simulations by writing and extending programs whose behaviour we don't fully understand at first. However in this case, we can use an illegal magic square whose rows, columns and diagonals do add up to 15:

     | ?- test_square([cell(a, 5), cell(b, 5), cell(c, 5), 
     | :  cell(d, 5), cell(e, 5), cell(f, 5),
     | :  cell(g, 5), cell(h, 5), cell(i, 5)]).

We also need to test the procedure with a square whose rows, columns and diagonals don't sum to 15, for instance:

     | ?- test_square([cell(a, 1), cell(b, 5), cell(c, 5), 
     | :  cell(d, 5), cell(e, 5), cell(f, 5),
     | :  cell(g, 5), cell(h, 5), cell(i, 5)]).

This should fail. We could go further and attempt to test each sub goal in the rule, but we could find this quite difficult if we don't know how a legitimate magic square is constructed.

{3}	Display square

Notes:

This will output a square on the screen.

Outputs and inputs:
The output is the square in the form:

          -------------
          | 1 | 4 | 7 |
          -------------
          | 2 | 5 | 8 |
          -------------
          | 3 | 6 | 9 |
          -------------

The input is a square with each cell instantiated to a number.

Data structures:
The square as defined in the top-level.

Algorithm:

Head of the clause:

Like {2}, this will include an enumeration of all the cells.

     display_square([cell(a, Cell_a), cell(b, Cell_b), cell(c, Cell_c),
                     cell(d, Cell_d), cell(e, Cell_e), cell(f, Cell_f),
                     cell(g, Cell_g), cell(h, Cell_h), cell(i, Cell_i)]) :-

The body of the rule could be:

          tab(10),

writeln('-------------'), tab(10), write('| '), write(Cell_a), write(' | '), write(Cell_b), write(' | '), write(Cell_c), writeln(' |'), tab(10), writeln('-------------'), tab(10), write('| '), write(Cell_d), write(' | '), write(Cell_e), write(' | '), write(Cell_f), writeln(' |'), tab(10), writeln('-------------'), tab(10), write('| '), write(Cell_g), write(' | '), write(Cell_h), write(' | '), write(Cell_i), writeln(' |'), tab(10), writeln('-------------').

This rule calls writeln/1 defined earlier in this module. We could write a separate procedure that outputs a horizontal line, for instance:

     horizontal_line :-
          tab(10),
          writeln('-------------').

Testing:
We can test this procedure using a square with each cell instantiated to a different number. This should ensure that we can tell that each cell appears in its appropriate place in the display.

     | ?- display_square([cell(a, 1), cell(b, 2), cell(c, 3), 
     | :  cell(d, 4), cell(e, 5), cell(f, 6),
     | :  cell(g, 7), cell(h, 8), cell(i, 9)]).

Putting the program together

We can put all the parts together with various "constants" we've discovered a need for during refinements. At the top level we have to replace each line of text with which we started the process of stepwise refinement with a sub goal in the form of a call to the procedures we've gone on to specify.

     % facts about the domain

     squares([cell(a, _), cell(b, _), cell(c, _), cell(d, _),  
              cell(e, _), cell(f, _), cell(g, _), cell(h, _),
              cell(i, _)]).

     numbers([1,2,3,4,5,6,7,8,9]).

     sum_of_row(15).

     generate_square :-
          % retrieve facts about the domain
          squares(Square),
          numbers(Numbers),
          % generate square
          fill_square(Numbers, Square),
          % test square
          test_square(Square),
          % output square
          display_square(Square).


     % 1 boundary condition
     fill_square([], _).
     % 2 recursive condition
     fill_square([Number|Tail], Cells) :-
          member(cell(_, Number), Cells),
          fill_square(Tail, Cells).


     test_square([cell(a, Cell_a), cell(b, Cell_b), cell(c, Cell_c), 
                  cell(d, Cell_d), cell(e, Cell_e), cell(f, Cell_f),
                  cell(g, Cell_g), cell(h, Cell_h), cell(i, Cell_i)]) :-
          sum_of_row(Sum),
          % test rows
          Sum is Cell_a + (Cell_b + Cell_c),
          Sum is Cell_d + (Cell_e + Cell_f),
          Sum is Cell_g + (Cell_h + Cell_i),
          % test columns
          Sum is Cell_a + (Cell_d + Cell_g),
          Sum is Cell_b + (Cell_e + Cell_h),
          Sum is Cell_c + (Cell_f + Cell_i),
          % test diagonals
          Sum is Cell_a + (Cell_e + Cell_i),
          Sum is Cell_c + (Cell_e + Cell_g).


     display_square([cell(a, Cell_a), cell(b, Cell_b), cell(c, Cell_c), 
                     cell(d, Cell_d), cell(e, Cell_e), cell(f, Cell_f),
                     cell(g, Cell_g), cell(h, Cell_h), cell(i, Cell_i)]) :-
          tab(10),
          writeln('-------------'),
          tab(10),
          write('| '),
          write(Cell_a),
          write(' | '),
          write(Cell_b),
          write(' | '),
          write(Cell_c),
          writeln(' |'),
          tab(10),
          writeln('-------------'),
          tab(10),
          write('| '),
          write(Cell_d),
          write(' | '),
          write(Cell_e),
          write(' | '),
          write(Cell_f),
          writeln(' |'),
          tab(10),
          writeln('-------------'),
          tab(10),
          write('| '),
          write(Cell_g),
          write(' | '),
          write(Cell_h),
          write(' | '),
          write(Cell_i),
          writeln(' |'),
          tab(10),
          writeln('-------------').


     % utilities

     % member/2 
     % 1 boundary condition
     member(Elem, [Elem|_]).
     % 2
     member(Elem, [_|Tail]) :-
          member(Elem, Tail).

     % writeln/1
     writeln(Arg) :-
          write(Arg), 
          nl.

Testing:
We need to be able to test the whole program. This is rather easy because the main goal has no arguments.

The query will be:

          | ?- generate_square1.

We have to check that it produces the magic square as specified above. However, it may be that there is one more than one 3 x 3 magic square. To obtain this without having to add an argument to generate_square/0, we can use the query:

          | ?- generate_square1, fail.

This will give all the solutions, but you should be warned that this query can take rather a long time to complete. For instance it might take about two minutes on a SUN 4/60 workstation and half-an-hour on a personal computer of 1980 vintage.

Final words

This module has illustrated a program development technique called "stepwise refinement". The example of the magic square as programmed here is a relatively simple program and it may be that the stepwise refinement method looked excessively verbose. With more practice, such refinements become concise until (with sufficient experience) the process is done almost without conscious thought.


References and bibliography

Glaser, H.; Hankin, C. and Till, D. (1984) Principles of functional programming. Prentice-Hall. - Section 1.2: Top-down stepwise refinement.

Goldschlager L. and Lister, A. (1988) Computer science: a modern introduction. 2nd ed. Prentice-Hall. - Section 1.4: The importance of algorithms; Section 2.3: Stepwise refinement of algorithms.


[+] I am grateful for Dr D. M. Peterson for bringing this problem to my attention.


--------------------------------------------------------------------------

Navigation Menu

--------------------------------------------------------------------------

These Pages are maintained by Dr Peter Hancox

Last updated October 1998