TEACH RECURSION A.Sloman Jan 1997
Based on
TEACH RECURSE1 Steven Hardy August 1978
Updated A.Sloman Nov 1989
Updated to use lvars Oct 1996
Minor updates Jan 2001
RECURSIVE PROCEDURE DEFINITIONS
(Illustrated with POP11 examples)
CONTENTS - (Use g to access required sections)
-- Introduction
-- Example: Arithmetic expressions
-- Prerequisites for the rest of this file
-- The example of factorial
-- Why are circular definitions permitted?
-- Example: adding the numbers in a list
-- Expressing sum in Pop-11
-- Using the Pop-11 tracer on a recursive procedure
-- Using the functional style
-- Finding the largest number in a list
-- A Pop-11 definition of largest(list)
-- Does largest really need to test whether the length exceeds 1?
-- An inefficient definition of "largest"
-- A more efficient version of largest
-- Searching procedures and accumulating procedures
-- The importance of recursion
-- Recursion vs iteration (loops)
-- How recursion works: the "little man" metaphor
-- How elves run a procedure from a script
-- -- What happens when other elves are active on sub-tasks
-- -- Why Elf1 needs a separate script with space for local variables
-- -- Elf2 runs the recursive call
-- Demonstrating how elves start and stop procedures
-- -- Elves need instruction pointers
-- Designing a recursive procedure
-- Exercise: listsize
-- Exercise: countwords
-- Exercise: nonwords
-- Searching a tree
-- -- Counting the elements of a tree
-- Exercise: addup_items
-- Exercise: find_items
-- Exercise: findbetween
-- Evaluating arithmetic expressions
-- Further work
-- Some things to do:
-- Introduction -------------------------------------------------------
This teach file discusses 'recursive' definitions, that is defining some
concept in terms of itself (or some group of concepts in terms of each
other).
For example you cannot say what a noun phrase is without including the
possibility of a noun phrase that contains a prepositional phrase, e.g.
the man on the moon
However, you cannot specify what a prepositional phrase is without
saying that it can contain a preposition and a noun phrase, e.g.
on the moon
TEACH GRAMMAR introduces recursive definitions of grammatical
constructs.
It is also often useful to use recursion in defining a procedure. An
example is the procedure for evaluating an arithmetic expression, whose
components may also be arithmetic expressions.
-- Example: Arithmetic expressions ------------------------------------
Consider arithmetical expressions, which we can represent a lists, where
each expression starts with an operator, e.g. + or * or / and one or
more arguments, which may themselves be arithmetical expressions. Thus
the following are lists representing arithmetical expressions:
[+ [- 3 2] [* 4 5]] which evaluates to 21
[/ [* [+ 3 2] [- 4 8]] 2] which evaluates to -10
Notice that I defined an arithmetical expression as something that could
include arithmetical expressions. However I implicitly assumed that the
numbers on their own, 2, 3, 4, 5, 8 etc. were all arithmetic
expressions.
Also the definition of the procedure to evaluate an arithmetical
expression says that if it is a number then the value is just that
number otherwise evaluate the arguments and then apply the operator to
the results.
That's recursive, as it defines evaluation in terms of evaluation.
Some people object to circular definitions. But these circular, or
recursive definitions, or types of structures and of procedures are not
viciously circular, because they always "bottom-out" on something that
is handled without further recursion.
-- Prerequisites for the rest of this file ----------------------------
The examples below assume that you are familiar with procedure
definitions in POP-11 (see TEACH * DEFINE), including the definition of
procedures with inputs and outputs.
You should also be familiar with the basic arithmetical operations
+ (plus)
- (minus)
* (multiplied by)
and the elements of list processing, as introduced in TEACH * LISTS.
In particular, you need to know that
(a) [] is an empty list
(b) HD is a built-in operation to find the first element of a list
(c) TL is a built-in operation to find the tail of a list, where
the tail of L is a list containing all the elements of L except
the first.
-- The example of factorial -------------------------------------------
An example which may be familiar to you is the definition of a
factorial, a concept often used in mathematics. The factorial of a
number is got by multiplying together all the numbers from 1 up to that
number.
So (using "*" for multiplication) we have the following examples:
Factorial of 5 is
5 * 4 * 3 * 2 * 1
which is 120. Factorial of 9 is
9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1
which is 362880 !
It is easy to give examples of this concept, but how can we define it
in general for any number? Here is one way:
(a) factorial 1 is 1.
(b) factorial of a number n greater than 1 is n multiplied by the
factorial of n-1,
We represent the factorial of n as "factorial(n)", though mathematicians
often represent it as "n!" and read it as "n factorial".
From (a) and (b) you can work out the factorial of any number. For
example, because 3 is greater than 1, definition (b) implies that
factorial of 3 is :
3 * factorial(2)
Because 2 is also greater than 1, factorial(2) is 2 * factorial(2 - 1),
so we can replace "factorial(2)" above giving:
3 * 2 * factorial(1)
and, because factorial 1 is 1 (using definition (a)), that is
3 * 2 * 1
Algebraically the definition reads:
(a) factorial(1) = 1
(b) if n > 1 then factorial(n) = n * factorial(n-1)
We can express that in Pop-11 thus:
define factorial(num) -> result;
;;; compute the factorial of a number
if num == 1 then 1 -> result
else
num * factorial(num - 1) -> result
endif
enddefine;
Now compile that and try it out on these examples:
factorial(1) =>
** 1
factorial(5) =>
** 120
factorial(50) =>
** 30414093201713378043612608166064768844377641568960512000000000000
-- Why are circular definitions permitted? ----------------------------
That second clause, (b) defines 'factorial' in terms of factorial - how
is that possible without endless circularity?
The key idea, which is often hard to remember, is that once a procedure
P has been defined and the definition has been compiled, any procedure
can use P. I.e. any procedure can "call" P, provided that it gives P the
appropriate inputs. The caller will simply wait till P's instructions
have been obeyed, and then continue.
If ANY procedure can use P, then P can also use P. I.e. it can use
itself to solve some sub-problems. While writing the instructions for P
you may find it hard to believe that P can be used in the middle,
because the definition has not yet been completed. However, once the
procedure has been compiled, all the instructions for doing P are in the
computer.
So when P calls P, the computer will do exactly the same as if anything
else calls P: i.e. it will run the instructions for doing P.
Those instructions may also call P. But if the procedure is well
designed it will not go on forever calling itself. Eventually it will
reach a point where the answer is given in some other way. For example
in the case where the number is 1 the factorial is defined directly as
1, so that's where the recursion stops. For instance in the case of
factorial(50) the procedure had to deal with 49, 48, 47, ... etc. until
it got down to 1, when the recursion stopped (though the procedure
continued using the result).
However if you try to compute the factorial of -30, the recursion will
never stop, for the sequence of numbers is then
-30, -31, -32, -33 ....
which never reaches 1.
-- Example: adding the numbers in a list ------------------------------
We'll now use a slightly different example, as some readers will be
unfamiliar with factorial and may find the concept unmotivated.
You often have to add up a set of numbers. Suppose we represent a set
of numbers by a list, e.g.
[3 5 13 6 22 4]
You probably know how to add up the numbers presented in such a list,
but can you tell a computer how to do it?
Consider the following definition of the 'the sum of a set of numbers':
(a) The sum of a set of numbers is zero if the set is empty
(b) otherwise it is the first element of the set plus the
sum of the rest of the set.
(b) Implies that the sum of the numbers in [3 5 13 6 22 4], which we
can represent as sum([3 5 13 6 22 4]) is
3 + sum([5 13 6 22 4])
since [5 13 6 22 4] is the tail of the list. Once again we can use
the definition (b) to "expand this" giving
3 + 5 + sum([13 6 22 4])
using (b) repeatedly we eventually get to
3 + 5 + 13 + 6 + 22 + sum([4])
and then, (since [] is the tail of [4]):
3 + 5 + 13 + 6 + 22 + 4 + sum([])
To finish this off, we have to use definition (a), giving
3 + 5 + 13 + 6 + 22 + 4 + 0
We can define "sum" more formally thus:
(a) sum([]) = 0
(b) if L is not empty then
sum(L) = hd(L) + sum(tl(L))
where "hd" is the Pop-11 procedure that gives the first element of the
list, and "tl" is the procedure that gives the tail, i.e. a list
containing all elements apart from the first.
Notice that hd and tl are not defined for empty list: these Pop-11
procedures give an error if applied to []. But this is not a problem,
since case (b) excludes the possibility of L being the empty list.
-- Expressing sum in Pop-11 -------------------------------------------
When translated into POP11 the above definition becomes:
define sum(set) -> total;
if set == [] then
0 -> total
else
hd(set) + sum(tl(set)) -> total
endif
enddefine;
You should be able to test that procedure on a number of examples.
sum([]) =>
** 0
sum([15]) =>
** 15
sum([1 3 5 7]) =>
** 16
;;; find the sum of a list containing number 1, 2, ... 1000
vars x;
sum([% for x from 1 to 1000 do x endfor %]) =>
** 500500
-- Using the Pop-11 tracer on a recursive procedure -------------------
One advantage of recursive procedures is that if you trace them then
the trace information shows exactly what they are doing.
Use the Pop-11 tracer to help you see what is going on when sum runs.
Trace "sum" and "tl" and "+".
trace sum tl +;
(Because "tl" and "+" are "system" procedures you will have to
recompile "sum" to make the tracing work for them.
sum([1 3 5]) =>
> sum [1 3 5] ;;; Going into sum with [1 3 5]
!> tl [1 3 5] ;;; Calling tl to get the tail
!< tl [3 5] ;;; tl returns [3 5], for the recursive call
!> sum [3 5] ;;; Calling sum recursively with the tail
!!> tl [3 5] ;;; The new call of sum needs the tail of [3 5]
!!< tl [5] ;;; tl returns [5]
!!> sum [5] ;;; Call sum recursively with the new tail [5]
!!!> tl [5] ;;; The new call of sum needs tail of [5]
!!!< tl [] ;;; which is the empty list
!!!> sum [] ;;; Call sum recursively with the empty list
!!!< sum 0 ;;; At last sum knows what to return - 0
!!!> + 5 0 ;;; Call + with the previous head and 0
!!!< + 5 ;;; Plus returns the result 5
!!< sum 5 ;;; which is then used as the result of sum
!!> + 3 5 ;;; Call plus with previous head and previous
!!< + 8 ;;; result: Plus returns 8
!< sum 8 ;;; which is then returned by sum
!> + 1 8 ;;; At last the initial call of sum can
!< + 9 ;;; call + with 1 and 8, getting back 9
< sum 9 ;;; which is the final result of sum
** 9 ;;; And "=>" prints it out
Notice how NO addition is done until after the recursion has got to the
end of the list, and produced the first result, the number 0.
We can see the same thing without the tracing of "+" and "tl".
untrace + tl;
sum([1 3 5]) =>
> sum [1 3 5]
!> sum [3 5]
!!> sum [5]
!!!> sum []
!!!< sum 0
!!< sum 5
!< sum 8
< sum 9
** 9
-- Using the functional style -----------------------------------------
Some people find that the output local variables clutter up the
procedure definitions given above, and prefer to use the following form
of definition, sometimes referred to as "the functional style":
define sum(set);
if set == [] then
0
else
hd(set) + sum(tl(set))
endif
enddefine;
As far as Pop-11 is concerned this is equivalent to the previous
version. However, in more complex cases it is convenient for the header
to make it clear that the procedure returns one result. So having an
output local is helpful. You can compromise by using the following
format, which is also equivalent to the above, but is slightly less
cluttered.
define sum(set) -> total;
if set == [] then
0
else
hd(set) + sum(tl(set))
endif -> total
enddefine;
In this format you don't have to do an explicit assignment in each
branch of the conditional expression. You simply assign the result of
evaluating the conditional expression to the output local variable, in
this case total. This has the advantage that if you omit the "else"
clause, a common problem in such cases, you'll get an error message when
the explicit conditions are all false, instead of an obscure result
being returned which causes an error to occur later.
If you forget to assign to the output local you'll get the following
obscure result, or worse in some implementations of Pop-11.
define sum(set) -> total;
;;; buggy version, where total is not given a value
if set == [] then
0
else
hd(set) + sum(tl(set))
endif
enddefine;
;;; test it
sum([3 5]) =>
** 3 5 0 0
sum([1 2 3]) =>
** 1 2 3 0 0
All the recursive calls produce spurious extra results, one for the
value of the conditional and one for the value of total, which on a Sun
computer defaults to 0. On some implementations an uninitialised lexical
variable might default to garbage.
Exercise: define an iterative version of sum, using the format
for item in list do .... endfor
Compare your iterative version with the above recursive versions, and
try to decide which you prefer. Can you trace the iterative version to
see how it behaves?
For more on the "functional style" of programming, see
TEACH * FUNCTIONAL.STYLE
-- Finding the largest number in a list -------------------------------
Consider finding the largest number in a list of numbers.
We can specify what largest means along the same lines as before, except
that now we say that if the list is empty there is no largest number. We
can represent that by producing an error in that case.
If there's only one element then that is the largest. Otherwise, find
the largest element in the tail of the list, and compare that with the
head of the list. The larger of the two is the largest in the original
list.
We can use the procedure "max" to specify this more precisely. max(x,y),
where x and y are numbers, is always one of x and y, the larger one if
one is larger, or x if they are the same number E.g.
max(3, 55) =>
** 55
max(55, 3) =>
** 55
max(55, 55) =>
** 55
max(-3, -2) =>
** -2
We therefore have three cases to consider:
(a) If L = [] then largest(L) is undefined, and produces an error
(b) If L has only one element, then largest(L) is that element
(c) If L has more than one element then
largest(L) = max( hd(L) , largest( tl(L) ) )
Notice that because "max" takes two numbers as input, this definition
assumes that hd(L) is always a number and that largest(tl(L)) is always
a number, which it must be if "largest" is defined properly!
-- A Pop-11 definition of largest(list) -------------------------------
One way of saying this in POP11 is:
define largest(list) -> result;
if tl(list) = [] then
hd(list)
elseif length(list) > 1 then
max( hd(list), largest(tl(list)) )
endif -> result
enddefine;
Test that definition:
largest([33]) =>
** 33
largest([2 66 4 8]) =>
** 66
vars ages = [23 42 17 9 16 32 15 12 7];
largest(ages) =>
** 42
Try some of your own examples.
What happens if you give it the empty list?
largest([]) =>
If the list given is empty, then the call of "tl" in the first condition
will produce an mishap, with the error message
;;; MISHAP - NON-EMPTY LIST NEEDED
;;; INVOLVING: []
;;; DOING : tl largest ...
Exercise:
compile that procedure and testing it on some examples of lists of
numbers, including the empty list. If you trace it, and trace max, then
you can see how it works.
To turn off tracing use
untrace largest;
-- Does largest really need to test whether the length exceeds 1? -----
If you think hard about it, you should be able to work out that
the test for the length being more than 1 is redundant. The length
cannot be 0 at that stage because an empty list would not have got
past the call of "tl" in the first test: it would have produced an
error. The length cannot be 1 because that would involve having a tail
that is empty, so the first condition
tl(list) = []
would be true, and the first action would be done, giving the result
hd(list).
So if the length is not 0 and not 1, then it must be greater than 1,
by the time it gets to the "elseif" instruction. Hence the test is
redundant.
We can therefore have a more economical definition without the second
condition, thus:
define largest(list) -> result;
if tl(list) == [] then
hd(list)
else
max( hd(list), largest(tl(list)) )
endif -> result
enddefine;
-- An inefficient definition of "largest" -----------------------------
We can also define largest differently, without using "max". Instead
we can use the arithmetic operator ">", i.e. "greater than", thus:
define largest(list) -> result;
if tl(list) = [] then
hd(list)
elseif hd(list) > largest(tl(list)) then
hd(list)
else
largest(tl(list))
endif -> result
enddefine;
Try compiling and testing that one, to make sure it works (using trace
if necessary).
Notice that if the "elseif" condition is FALSE, then it will call
largest on the tail TWICE. So you may find, if you trace this, that it
has more recursive calls than you expect. This means that it is a
wasteful definition. Look back at the previous definition to see why it
did not need to call largest(tl(list)) twice.
-- A more efficient version of largest --------------------------------
We can avoid calling it twice by calling it once and storing the result
temporarily in a local variable "temp", and use that to
compare with the head of the list, without having to find the largest
again if the head of the list is smaller than temp. So
define largest(list) -> result;
if tl(list) == [] then
hd(list)
else
;;; Get the largest from the tail
lvars temp = largest(tl(list));
;;; Now check whether to use this as the final result or not
if hd(list) > temp then
;;; The head is bigger, so use it
hd(list)
else
;;; use temp, as it is bigger or the same
temp
endif
endif -> result
enddefine;
Compile and test that. Make sure you choose a "good" set of test
examples. What is a good set?
Here is a slightly different version, which avoids the extra local
variable temp, by using the output local variable result as the
temporary variable. Here we have to assign to result on each conditional
branch. (Why?)
define largest(list) -> result;
if tl(list) == [] then
hd(list) -> result;
else
;;; get the largest element in the tail
largest(tl(list)) -> result;
;;; now check whether to use this as the final result or not
if hd(list) > result then
;;; The head is bigger, so use it
hd(list) -> result
;;; no need for an else clause: just use result as it is
endif
endif
enddefine;
Exercise:
Use recursion to define a procedure smallest, that can be given a list
of numbers and returns the smallest one. Test your procedure with
different lists of numbers, including a list containing only negative
numbers.
Try also writing an iterative definition of smallest, using
for num in list do ... endfor
-- Searching procedures and accumulating procedures -------------------
Notice that unlike the procedure sum, defined above, the procedure
largest sometimes DISCARDS the result of the recursive call on the tail
of list, after using the result to compare with the head of the list.
It discards the result when the head of the list is bigger.
That is because largest is a searching procedure, not an accumulating
procedure.
An accumulating procedure builds up its result using everything it finds
in the list.
A searching procedure uses only one element of the list to create its
result.
An intermediate procedure could use a subset of its elements, e.g. a
procedure that takes a list and returns a list of all the numbers in it
that are between 3 and 30, for instance, or adds up all the numbers in
the list between 3 and 30.
The procedure sum is an accumulating procedure, as it uses every
element in the list to build up the total. So we added the result from
the recursive call to the head of the list.
-- The importance of recursion ----------------------------------------
In each of these procedure definitions - SUM and LARGEST - we see a
mention of the procedure being defined. It is this feature that makes
the definition recursive.
Definitions of procedures, like definitions of almost anything else
typically involve mentioning something that isn't part of the
definition. Dictionary definitions always use words that aren't defined
in that definition - how could they avoid it?
Recursive definitions are just a very special case of this more general
phenomenon - a phenomenon bound up with our capacity to NAME things -
objects, procedures, events and so on. Once you can name a procedure,
you can invoke it inside another procedure. If you can do that, you
can invoke it inside itself.
This circularity may appear to produce an infinite regress. Why does the
recursion stop?
If our recursive procedure is well defined, there are always special
cases e.g. factorial(1), sum([]), or largest([3]) that don't involve
recursion.
Moreover, we define the procedure so that it always eventually gets to
such "terminating" case when it recurses. That is how the recursion
avoids going on forever. All recursive procedures must have one or more
"stopping conditions".
What the stopping condition is depends on the problem. Often it is the
empty list, but we saw with "largest" that it was a list of length 1.
Often the stopping condition in numerical recursive procedures is an
input of 0, but with factorial, as we defined it, the stopping
condition was an input value of 1.
The significance of recursion lies in its ability to deal
straightforwardly with variable structures. For example in sum, and
largest, we used the same definition to cope with lists of very
different lengths, from one element up to as many as can fit in the
machine.
-- Recursion vs iteration (loops) -------------------------------------
In both of those cases, it would have been possible to use a loop
instead of recursion, because the structures were all linear. Executing
a loop is generally referred to as "iteration".
Here is how we could define the above two procedures iteratively:
define sum(set) -> total;
0 -> total;
lvars item;
for item in set do total + item -> total endfor
enddefine;
sum([3 4 5]) =>
** 12
define largest(list) -> result;
lvars item;
hd(list) -> result;
for item in tl(list) do
max(item, result) -> result
endfor
enddefine;
largest([3 4 5]) =>
** 5
largest([-3 -4 -5]) =>
** -3
Many people find iterative procedures much easier to design and
certainly easier to create. That's because iterative constructs are very
familiar from everyday life:
Keep walking till you see the station on your left.
Keep hitting the nail until its head is flat.
Examine every pocket in all your clothes until you find
the lost key.
Collect all the children who are over four feet tall.
When dealing with a range of numbers it is very convenient to use a
"for" loop, and similarly when searching down the elements of a list,
using "for" or "foreach" is very easy, whether your procedure is an
accumulating one or a searching one. (See HELP FOR, HELP FOREACH)
However for investigating or manipulating objects with more complex
non-linear structures, recursion may be more appropriate.
For example, one approach to a definition of complex sentences uses
recursion in such a way as to 'embed' simple sentences inside another.
Such an approach to language structures makes it possible to encompass a
knowledge of potentially infinitely many sentences in a brain of only
finite capacity.
(See TEACH *ISASENT, and John Lyons, CHOMSKY, Fontana Modern Masters
paperback).
The example of evaluating arithmetic expressions, introduced above, is a
natural case for the use of recursion, as we'll see below. The same is
true of programs that search tree structures or network structures.
-- How recursion works: the "little man" metaphor ---------------------
A metaphor can be used to explain how the computer is able to understand
these recursive definitions. (The metaphor is originally due to Seymour
Papert, though it has been modified slightly here.)
Imagine that inside the computer there is a library of procedure
definitions each in the form of a set of instructions for running the
procedure. For example it might include the following recursive
definition of sum, introduced above
define sum(set) -> total;
if set == [] then
0
else
hd(set) + sum(tl(set))
endif -> total
enddefine;
The library also contains a data-structure section, were information
is stored about the contents of various data-structures, such as lists,
or strings.
So for example if I do
vars ages = [23 42 17 9 16 32 15 12 7];
Then the library has the information that the variable called "ages" has
as its value the list of numbers
[23 42 17 9 16 32 15 12 7]
Next to the library is a waiting room full of little men and little
women (lets call them elves) sitting around doing nothing.
There is also a table called "the stack" on which things can be placed
by elves when then do their work. So if one elf asks another to compute
something it gives that elf its inputs by putting them on the stack. The
second elf takes what it needs from the stack, and when it is finished,
if it produces some results it puts them on the stack.
A "boss" elf, interacts with users and tells the other elves what to do.
In particular each elf can be given a copy of the instructions for
running a procedure and will know how to follow them.
-- How elves run a procedure from a script ----------------------------
So when you type in a request to run a procedure, for example, asking
for the procedure sum to be used to add up a list of ages of people:
sum(ages) =>
one of the elves in the waiting room is instructed by the boss to work
this out for you. Suppose that is Elf1.
The description that follows is quite complex. You may find it easier to
understand if you create a diagram which keeps changing, depending on
o what items are on the stack,
o which elf is active,
o which elves are sleeping while they wait for another one to
finish.
o which set of procedure instructions each active or sleeping
elf has
o which instruction is the *current* one for each elf (whether
active or sleeping).
This is what happens when the instruction sum(ages) is obeyed.
The boss gives Elf1 a photocopy of the instructions defining the
procedure SUM, and then finds on the stack information about where the
list AGES is stored in the library (i.e. a pointer to the list), i.e.
the list
[23 42 17 9 16 32 15 12 7]
(NOTE:
Elf1 will not work on a copy of the list, but the list itself.
Sometimes programs have instructions to change the contents of a
list, and that may or may not cause problems. If the contents of the
original list are not to be changed, then the elves carrying out a
procedure could use a copy of the contents of the list, so that
changes made to the copy will not affect the original.
However, the instructions in the procedure SUM do not change
the contents of the list, so in this case no copy needs to be
made, and none is made.)
Elf1 looks at the procedure header
define sum(set) -> total;
From this, Elf1 learns
(a) The procedure SUM has two local variables "set" and "total".
The first is an input local, the second an output local.
The first will be given a value at the beginning of the process,
as explained in point (b), and "total" will get a value later,
as described in point (c) below.
So Elf1 draws two boxes on the sheet of paper, one labelled "set"
(for the procedure's input) and one labelled "total" (for the
procedure's result, still to be worked out).
Similar boxes would be created for any other variables declared as
local in the procedure. (In the case of SUM, there are no more.)
(b) Next Elf1 has to take one item from the stack, and temporarily
(i.e. while following the instructions for sum) call it "set".
This can be done by writing in the box labelled "set" whatever
was found on the stack (in this case a pointer to a list of numbers
in the library).
(c) Nothing is written in the box labelled "total". But Elf1 notes that
just before the job is finished, i.e. when all the instructions
in the definition of SUM have been carried out, whatever has been
stored in the "total" box has to be put on the stack.
(That's where it could be found by another procedure that invoked
SUM.)
Since Elf1 is going to go through different steps in the following the
definition of SUM he could attach a label corresponding
to each instruction in the definition (e.g. "I1", "I2", "I3", ...).
Then he could create a box on his scriped labelled "Current
instruction", and write "I1" in it, and then keep changing what is in
the box as he steps through the instructions. (This requires an eraser.)
Alternatively he could keep a finger on the current instruction and move
it to the next instruction after each instruction is completed.
To a first approximation, this produces the following labelled
instructions:
I1: -> set
if
I2: set == []
then
I3: 0
else
I4: hd(set) + sum(tl(set))
endif
I5: -> total
I6: Put value of "total" on stack
I7: Tell boss elf I am finished.
Where I1 is the instruction to get the input value from the stack and
give it to "set", and I6 is the instruction to take whatever the value
of "total" is and put it on the stack.
The words "if" "then" "else" are "control" instructions. They
help the elf decide what to do next. "if" says that the next
instruction must produce a value that will be used to decide what to do
after that.
"next" says that if the test instruction produces the result true
then instruction I3 should be obeyed (put the number 0 on the stack).
"else" says that if the test instruction I2 produces the result false,
then the next instruction is I4.
There is also implicit information that whichever of I3 or I4 is obeyed
the next instruction is that coming after "endif", namely I5.
Working through the definition of SUM Elf1 first does I1, i.e. gets
whatever was at the top of the stack and writes it in the box labelled
"set". (This will be information about the list in the the library).
Then El1 turns to the next instruction (labelled "I2").
I2: set == []
This is an instruction to run the equality procedure "==" with the value
of "set" and the empty list.
But the instructions given to Elf1 do not say how to do equality tests,
So Elf1 puts the value of set on the stack, followed by the empty list
[], and asks the boss to arrange for the procedure called "==" to be
run.
Then Elf1 goes to sleep, waiting to be woken up when that procedure is
finished.
The Boss gets another elf, e.g. Elf2, who then gets a copy of the
instructions for doing a strict equality test (which is what "==")
means.
Elf2 finds that the script for "==" tells him to get the two things on
the top of the stack and compare them.
[]
[23 42 17 9 16 32 15 12 7]
He decides that the result is false. So he puts the object FALSE on the
stack, and tells the boss he has finished then goes back to the waiting
room.
The boss wakes up Elf1, who is waiting for the result of I2, and still
has his finger on the "then".
Elf1 looks at the stack, finds FALSE there, and concludes that the
condition has failed.
So he removes FALSE from the stack, jumps to after the word "else"
where the instruction found is
I4: hd(set) + sum(tl(set))
This has to be broken down into several smaller instructions
invoke hd on set and put the result on the stack
invoke tl on set and put the result on the stack
invoke sum on what tl left on the stack, and then put
the result of sum (a number) on the stack
invoke + on the results of hd and sum, and put the
result, another number, on the stack.
Most of that is done by other elves, obeying instructions for hd, for
tl, for sum and for +.
For example, when the first bit of I4 is done, the value of set is put
on the stack, Elf1 tells the boss that the result of hd is wanted and
then goes to sleep.
Elf2 is called in, handed a copy of the instructions for hd, and
finds on the stack a pointer to our list called "set" in the library.
Elf2 follows the instructions for hd, i.e. looks in the list to get the
first element, in this case the number 23, and then leaves the number 23
on the stack as the result of the procedure hd. Elf2 informs the boss
that running the procedure hd is finished, and the boss then wakes up
Elf1, who does the next bit, i.e. tl(set), by putting the value of "set"
on the stack and asking the boss to get an elf to do tl.
When all the component in structions of I4 are done, Elf1 is woken up by
the boss, and moves to instruction I5
I5: -> total
This means removing the top item of the stack and writing it in the box
labelled "total".
Instructions I6 and I7 cause Elf1 to put the value of "total" on the
stack and inform the boss that the job is complete.
-- -- What happens when other elves are active on sub-tasks
When the "==" test was required, Elf2 was given that task. Later
Elf1 might be given other tasks, e.g. running the instructions for
"hd", the instructions for "tl", the instructions for the recursive call
of "sum", or the instructions for "+".
The same Elf as before can be used because there is no need for the elf
to remember anything about what it did previously: e.g. time it starts
with a clean sheet, i.e. as set of instructions for a procedure.
However if those instructions are complex, then that elf may have to
wait for another elf to finish. So when Elf2 is running SUM on the
shorter list it may need Elf3 to do various taks while Elf2 is sleeping.
Elf1 is also sleeping but can't be used because Elf1 is in the middle
of the original task and has to remember what to do on being woken up.
-- -- Why Elf1 needs a separate script with space for local variables
After another elf has run tl, Elf1 then wakes up, ignores the what has
just been added to the stack and instead tells the boss that sum now
needs to be run, in order to complete this bit of I4:
sum(tl(set))
After the "nested" bit, i.e. tl(set) has been complete the following
list, the tail of set, will have been left on the stack, (more precisely
a pointer to it will be on the stack):
[42 17 9 16 32 15 12 7]
Elf1 wakes up finds that step 5 requires the script for sum to be run.
Now he is already running the script for sum, so you might think he
could do it. But he has to remember where he is on that script (near the
end of the else condition) and also what his value is for the local
variable "set".
If Elf1 tried to remember two (or more) locations and two (or more)
values for the same variable, that could lead to confusion and error. So
Elf1 sticks with the task of computing the sum of the original list, and
asks the boss to get another elf to do the job of finding the sum of the
tail of the list.
So while Elf1 sits with its copy of the instructions for sum, another
elf, possibly Elf2 again, is given a new copy of the instructions and
is told to obey those instructions. Elf1 can then go back to sleep,
while Elf2 gets the subsidiary task done.
Elf2 will have different values for "set", and for "total" and while
Elf1 sleeps waiting for Elf2 to do sum, Elf1 will have one *next*
instruction to do and Elf2 will have a sequence of next instructions.
So each of the elves needs its private workspace recording the values of
the local variables and also which instruction is the current one.
One way to achieve that is to think of each elf with a sheet of paper
containing a script, with spaces in which it can write down its private
information.
You could try to invent a different model, e.g. one where there is only
one elf which uses different bits of paper at different times, and keeps
them in the right order so that as each one is finished with the
preceding script is found with its "private" information safe and sound.
-- -- Elf2 runs the recursive call
Elf2 has to go through the same rigmarole as Elf1 has just been through,
getting help with ==, hd, tl, and sum. So Elf2 will need yet another elf
to do these tasks, while Elf2 waits patiently, i.e. sleeps.
Elf3 could do all of them.
However, when Elf3 is asked to do sum the list at the top of the stack
is this:
[17 9 16 32 15 12 7]
Then Elf4 will have to be used for various subtasks while Elf3 sleeps,
and so on and so on, until eventually an elf, call it Elf11, turns up to
find the empty list [] on the stack when it is asked to do the script
for sum.
Its task is then very easy, as it asks the next elf to do the equality
test (==), and at last gets back the result TRUE on the stack. So at
last, after these instructions
if
I2: set == []
then
it is possible to do instruction I3 (done for the first time):
I3: 0
Then Elf11 can ignore I4 and jump to I5
I5: -> total
which puts 0 in the box for total.
So Elf11 does that, then does the next two instructions I6 and I7.
So the previous instruction is woken up in the middle of instruction I4
I4: hd(set) + sum(tl(set))
except that by now the instruction "sum(tl(set))" has been finished,
leaving the number 0 on the stack.
Then "+" can run to add the 0 to what was previously on the stack
(i.e. the number 7) in our example and eventually this elf finishes and
the previous one continues finishing its version of I4.
-- Demonstrating how elves start and stop procedures ------------------
To see what's going on we can trace sum and get it to compute the total
of a list of numbers.
Each line with a ">" represents a new elf starting his job with a
certain value of SET -- a list of numbers. Each line with a "<"
represents an elf finishing his task of carrying out the SUM
instructions, and it shows the result achieved, which will then
be used by the elf who asked him to do it. The vertical lines help you
relate the start and the finish of the job done by each elf.
trace sum; ;;; make pop11 show all the "activations" of sum
vars list = [3 77 5 9];
sum(list) =>
> sum [3 77 5 9] ;;; Elf1 starts doing sum
!> sum [77 5 9] ;;; Elf2 starts doing sum with the tl of list
!!> sum [5 9] ;;; Elf3 starts doing sum with a shorter list
!!!> sum [9] ;;; Elf4 starts, with a 1 element list
!!!!> sum [] ;;; Elf5 does sum, with empty list on stack
!!!!< sum 0 ;;; Elf5 leaves 0 on the stack
!!!< sum 9 ;;; Elf4 leaves 9 + 0 = 9 on the stack
!!< sum 14 ;;; Elf3 leaves 5 + 9 = 14
!< sum 91 ;;; Elf2 leaves 77 + 14 = 91
< sum 94 ;;; Elf1 leaves 91 + 3 = 94
** 94 ;;; The number is printed out by "=>"
Each elf has to WAIT while the other elves do their task, one at a time.
Some of them may have to call other elves to do further sub-tasks. All
the "nested" waiting is shown by the trace printout if you run a traced
recursive procedure.
Elf1 has to wait longest. Eventually its recursive call of SUM, handled
by Elf2 is finished, and the result 91 is found on the stack by Elf1.
Elf1 does not know or care how it is done, as long as the instructions
have been followed properly.
Notice that Elf1 does not get involved with evaluating the recursive
call. A second elf comes from the waiting room, with his own photocopy
of the SUM instructions and his own idea of the value associated with
the name SET. This second elf will in turn ask a third elf to perform a
SUM operation. At one stage, there will be several elves using the very
same definition of SUM, to work on slightly different versions of a SUM
process, though most of them will be sleeping or waiting patiently for
the next one along the chain to finish its task, as they cannot get on
until that is done.
The last elf to be invoked will associate the name SET with the empty
list [], and as explained above, will be able to 'return' the result 0
without 'invoking' yet another SUM operation.
This is where the recursion "bottoms out".
Once this last elf has finished, the previous one continues (by getting
a friend to do the actual addition) and then returns his result to the
second last elf and so on until the whole process is complete.
The advantage of this way of thinking about recursive procedures is that
each of the elves involved has a straightforward task - merely to
remember what its own 'local variables' are and to remember how far
through the procedure definition (the script) it has gone after each
step.
That elf does not have to understand the process as a whole - just its
own part in it (which is simple, and completely specified by ITS copy of
the instructions).
Remembering local variables includes both coping with input and output
variables and also any variables declared locally within the procedure
using "lvars" or "vars". (In Pop-11 these are handled differently in the
machine, but that need not concern us now. See TEACH VARS_AND_LVARS)
As explained above, the easiest way to think about local variables is to
assume that when each elf gets its script to follow, there are some
named boxes on the sheet, and the local variables have their values
written in those boxes by the elf using the script. Different elves
running the same script may put different values in the boxes.
Note that in the case of something like a list or other datastructure,
the whole list is not copied into the local variable's box. Rather the
box contains information about where that structure is stored in the
library.
For simple objects, e.g. small integers or decimal numbers that can fit
within one box (within a machine word) a symbol for the entity itself is
written in the box, not a pointer to a complex structure containing it.
-- -- Elves need instruction pointers
Each elf who waits for others will have to remember where he had got to.
So he keeps his finger on the instruction he has to obey as soon as the
next elf has finished. At any one time different elves may have their
fingers waiting on different instructions.
If the procedure involves a loop, the finger may have to go back to the
top of the loop several times, until a stopping condition is reached.
Everything said here is applicable not only to recursive procedures but
also the non-recursive case, where procedure A calls B, and B calls C,
and C calls D, etc. Each procedure is run as if by a little elf with a
copy of the instructions for the current procedure, while the preceding
elves sleep holding on to the insructions for their procedures.
-- Designing a recursive procedure ------------------------------------
When you design a recursive procedure, you have to think of yourself as
being like Elf1. Namely assume
1. that the "base level" case that does not require recursion is
handled properly.
2. that the recursive call handles all of the problem except a small
part (e.g. the first element of a list, or the first number in a
sequence of numbers).
Then write your procedure to complete the task as if those assumptions
were both true. If you have the courage to make assumption 2 (and are
very clear in specifying the inputs to and the outputs from the
recursive call), then it is often easy to do the rest, as in the case of
the procedures sum and largest, defined above.
-- Exercise: listsize -------------------------------------------------
Look at the definition of sum and see if you can modify it to produce a
recursive procedure that takes a list and computes its length. You can
not call it "length" as there is already a procedure called length in
Pop-11, and its name is a "protected" identifier.
So call it listsize:
define listsize(list) -> size;
if list == [] then
...
else
... + listsize(tl(list))
endif -> size
enddefine;
-- Exercise: countwords -----------------------------------------------
Define a recursive procedure called countwords that takes a list which
may contain a mixture of items, some of which are words and some are
not, and returns a number as a result, which is the total number of
words in the list.
Use the procedure isword to recognise the words:
isword("cat") =>
**
isword('cat') =>
**
isword(99) =>
**
isword(hd([the big cat])) =>
**
define countwords(list) -> words;
if list == [] then
...
elseif isword(hd(list)) then
... + countwords(tl(list))
else
...
endif -> size
enddefine;
;;; use these tests to check your procedure
countwords([]) =>
** 0
countwords([the 3 cats]) =>
** 2
countwords([1 2 3 4]) =>
** 0
countwords([a b c d]) =>
** 4
-- Exercise: nonwords -------------------------------------------------
Define a procedure called nonwords that takes a list and counts the
things that are in it that are not words. It should behave like this:
nonwords([]) =>
** 0
nonwords([the 3 cats]) =>
** 1
nonwords([1 2 3 4]) =>
** 4
nonwords([the silly old cats]) =>
** 0
-- Searching a tree ---------------------------------------------------
For all the recursive procedures defined so far, there are fairly
obvious ways of translating them into iterative, looping, procedures,
and the iterative version will often be found clearer by someone who is
not used to the "functional style" of programming, which makes heavy use
of recursion.
But there are some tasks that are very hard to do entirely using
iteration.
Consider a row of houses, each containing a family, and suppose that for
each family you have a list of the ages of the individuals. Then if
there are four families, each with between two and five members, their
ages might be in a list of lists, like this.
vars family_ages = [[3 6 27 28] [39 39] [ 13 35 37]];
Suppose that in a village there are several streets and for each
street you make a list of the families and for each family a list of
the ages. Then you could have a list of lists of lists of ages.
Suppose however, that in some streets there are houses that are shared
between different families. So you may want to group the occupants of a
house together in a list, but subdivide the list into two sublists, one
for each family. Then you may have some individuals who are of no fixed
abode, but move around and sleep in different places at different times.
You could add them as extra numbers, not in any list.
The upshot is that you have a list of lists of lists of lists... where
eventually you get down to numbers in the lists, but at varying depths
in the list structure.
vars village_ages =
[
16 19 ;;; homeless
[[3 6 27 28] [39 39] [ 13 35 37]] ;;; street1
24 33 ;;; homeless
[[5 9 28 32] [ 55 58]] ;;; street 2
[[22 25] [[33 33] [1 26 28]] [2 6 33]] ;;; street 3
52 21 23 ;;; homeless
;;; ... etc.
];
I've put the ages for homeless people in different places, to indicate
that they are currently in different parts of the village.
This is a bit like a tree, which starts with a trunk, and then branches
at various points, eventually ending in leaves. But some leaves are
reached after a few branches from the trunk and some after many.
Now suppose you want to compute the average age of the members of the
village. To do that you have to find the total number of occupants of
the village, and the total of all their ages, and divide the latter by
the former.
-- -- Counting the elements of a tree
It is not easy to use an iterative construct to explore a tree-structure
of the sort described, where you don't know how many branches there are
at each level, and how many different levels of branching you go through
if you follow different branches to the leaves at their extremities.
But if we think very clearly we can define a recursive procedure to
count the number of items (leaves) in a structure like the list assigned
to village_ages above.
Consider what happens if you look at the head of the list village_ages.
You may find their either a number, if it's a homeless person, or a
list, giving information about a street. If its a list, it may be an
empty list, or its head may be a number or another list, and so on.
Using that subdivision of cases we can define a recursive procedure to
count the number of items in a tree structure like village_ages. We'll
assume that eventually the counting procedure will get down to the
leaves of the tree, i.e. the numbers and in that case it will produce
the result 1 for the object. If by following tails, it gets to a list
that's empty, the result should be 0, as there are no members in the
empty list. If the list is not empty, it should count the head, count
the tail and add the two. So:
define count_items(tree) -> total;
;;; return a number specifying how many numbers are in
;;; the tree, at any depth
if isnumber(tree) then
1 -> total;
elseif tree == [] then
0 -> total
else
count_items(hd(tree)) + count_items(tl(tree)) -> total
endif
enddefine;
Compile that then try it out on village_ages:
count_items(village_ages) =>
** 32
Check that it has got the number right, by counting the number of
numbers yourself!
Try tracing it and do it again
trace count_items;
count_items(village_ages) =>
That will produce a LOT of trace printing. See if you understand it.
Then
untrace count_items;
Defining a non-recursive procedure to do what count_items does would
be very much harder. In fact it would be so messy that no attempt will
be made here to explain how to do it. The key idea is that you'd have to
build a stack of things waiting to be examined.
-- Exercise: addup_items ----------------------------------------------
Now you know how to find out how many people there are in the village.
But you don't yet have the total age which you need before you can
compute the average age.
So try to define a recursive procedure called addup_items, which can use
a structure similar to count_items, but instead of adding 1 to the total
for each number found adds the number. Try completing this.
define addup_items(tree) -> total;
;;; Add up all the numbers found in tree
...
enddefine;
Then test it on the collection of village ages.
addup_items(village_ages) =>
** 811
-- Exercise: find_items -----------------------------------------------
Here is a tree containing mixture of words and numbers at its
extremities:
vars mixed_items =
[ ant 1 [[bat] cat 2] [dog 3 4 [elephant flea 5] goose] 6 ];
Try to define a procedure find_items that takes such a tree, and a
recogniser procedure such as isword or isinteger and returns a list of
all the items in the tree that satisfy the recogniser.
In Pop-11 the best way to do this is to define a sub-procedure that
searches the tree and leaves all the things it finds on the stack.
Then you can call that procedure inside list brackets [% ... %] to make
a list of all the things found.
So to define this
define find_items(tree, pred) -> list;
....
enddefine;
we first do this:
define sub_find(tree, pred);
if tree == [] then
;;; got to end of list, do nothing
elseif atom(tree) then
if pred(tree) then
;;; leave the item on the stack
tree
endif;
else
;;; It must be a list, so
;;; get all the items in the head of the list
sub_find(hd(tree), pred);
;;; get all the items in the tail of the list
sub_find(tl(tree), pred);
endif
enddefine;
Try that out on the list, mixed_items created above, and on simpler
lists, looking for words, and looking for integers.
sub_find([a 1 [b 2] [c [d 3 4]] 5 e], isword) =>
** a b c d e
sub_find([a 1 [b 2] [c [d 3 4]] 5 e], isinteger) =>
** 1 2 3 4 5
Now complete the definition of find_items, so that it uses sub_find and
makes a list of all the results.
NOTE: in Pop-11 a procedure can be nested inside another. So having
debugged sub_find, you could move the definition inside find_items, and
indent it. Thus:
define find_items(tree, pred) -> list;
define sub_find(tree, pred);
.....
enddefine;
.... -> list
enddefine;
sub_find is then accessible only from find_items.
You might like to try replacing that definition of find_items with
one that creates lists as it goes along, and then joins up the sublists.
For that to work, every call of find_item must return a list, possibly
an empty list.
It might start something like this.
define find_item(tree, pred) -> list;
if tree == [] then
;;; got to end of list, nothing more to find
[] -> list
elseif atom(tree) then
if pred(tree) then
[^tree] -> list
else
[] -> list
endif;
else
... complete the recursive calls and their use...
endif
enddefine;
Try finishing this and try it out on the test cases, given above.
This sort of procedure, which makes lots of temporary lists then joins
them into bigger lists. can sometimes be a lot more inefficient than one
which puts all the objects found on the stack, and then, at the end,
makes a list of all the items on the stack.
-- Exercise: findbetween ----------------------------------------------
Here is a tree containing only numbers at its leaves:
vars num_tree = [ 1 [2 [3] [4 5 6]] [7 [8 9] 10] 11] ;
Define a procedure called find_between which takes two numbers, lower
and upper, and a tree, and searches the tree for numbers that are
between the lower and the upper number and makes a list of all of them.
You can use < and > for testing whether a number is in the required
range.
Complete this:
define find_between(lower, upper, tree) -> found;
;;; return a list of all the numbers between lower and upper
;;; in the tree
....
enddefine;
;;; test it on the above list
find_between(3, 8, num_tree) =>
find_between(4, 9, num_tree) =>
find_between(-5, 0, num_tree) =>
-- Evaluating arithmetic expressions ----------------------------------
Now for something rather different. Earlier we introduced expressions
like these:
[+ [- 3 2] [* 4 5]] which evaluates to 21
[/ [* [+ 3 2] [- 4 8]] 2] which evaluates to -10
We can define a grammar for these expressions by saying that they are
trees satisfying the following rules:
Operators are: + - * /
Trees are: numbers
Trees are: [OP T1 T2], where
OP is an operator, T1 is a tree and T2 is a tree.
How can you evaluate an arbitrary tree?
1. If it is a number it evaluates to itself.
2. If it is of the form [OP T1 T2] then, evaluate T1 and
T2 and apply the operator OP to the two values.
Try to define a recursive procedure eval_tree that can evaluate such a
tree:
define eval_tree(tree) -> result;
;;; Evaluate a possibly tree structured arithmetic expression
....
enddefine;
Complete that definition, then try it out on these tests, and others:
eval_tree([+ [- 3 2] [* 4 5]]) =>
** 21
eval_tree([/ [* [+ 3 2] [- 4 8]] 2]) =>
** -10
HINT:
Each of the sub-trees which is a list will have a word as its first
item, namely one of the words "+", "-", "/", "-". If you apply the
procedure valof to a word, you'll get its value. In these cases the
value is always a procedure, the procedure associated with the operator.
Thus if the word is held in the variable op_name, and the two values of
the sub-expressions are v1 and v2, you can evaluate the whole thing
thus:
valof(op_name)(v1, v2)
Or, if you find it clearer thus:
(valof(op_name))(v1, v2)
The second form makes it clearer that something is being applied to v1
and v2.
It is possible to define eval_tree in about six lines of Pop-11.
-- Further work -------------------------------------------------------
This teach file by no means exhausts what can be accomplished using
recursive definitions. For example 'mutually recursive' definitions -
where two or more operations are defined in terms of each other - can be
handled by the POP11 system as easily as the more straightforward
'simple recursion' shown above.
Many of the examples given given above relied on your understanding of
numbers. POP11 has many structure building and structure modifying
facilities. Recursive definitions are often the ONLY sensible way of
manipulating the complex and varied objects one can create using these
facilities.
-- Some things to do: -------------------------------------------------
(a) Write out English versions of SUM and LARGEST, and
get some friends to help you enact their behaviour, pretending
to be elves.
(b) Write a POP11 definition of your own to find the
SMALLEST number in some set, and test it out.
(c) What does the following procedure do, when applied to
a list:
define sillyprint(set);
if set == [] then
"finished" =>
else
hd(set) =>
sillyprint(tl(set));
hd(set) =>
endif
enddefine;
sillyprint([a b c d]);
Write an explanation of what it does and how it works.
(d) Try TEACH * SETS, TEACH * SETS2
The file TEACH * RECURSE2 discusses some recursive procedures for
drawing pictures. If you try it out, you can change the examples to
use the RC_GRAPHIC facilities described in TEACH RC_GRAPHIC
TEACH * RECURSE3 analyses some of the problems of dealing with the
"terminating" condition, e.g. recursing down a list till you reach
the empty list.
--- $poplocal/local/teach/recursion
--- Copyright University of Birmingham 2001. All rights reserved. ------