TEACH SETS Revised Aaron Sloman Nov 1989 An INTRODUCTION TO SET MANIPULATION USING LISTS =============================================== CONTENTS - (Use g to access required sections) -- Introduction -- Differences between lists and sets -- Prerequisites for this teach file -- A first example -- Checking whether a person has a property -- Defining iselement with a "for" loop -- Re-doing iselement as a recursive program -- Using the matcher to define iselement recursively -- How to do it -- Words need to be quoted to refer to themselves -- Defining the procedure ismale -- A procedure to test for two properties: happyman -- Finding an item with a property -- Using valof -- A schema for FINDONE -- How to specify a procedure -- Specifying FINDONE -- FINDALL -- Digression: findintegers -- Now let's return to findall -- A schema for FINDALL -- If you need help -- It still doesn't work? -- A set of properties: findset -- Help with recursive issubset -- Some advice on testing -- Returning to FINDSET -- If you needed help -- A different findset -- Further reading -- Introduction ------------------------------------------------------- One common use of a list is to represent a set of objects. It is then possible to write programs that perform operations on the list, to represent operations on the set represented. The empty list [] would then represent the empty set. Sets of sets are then represented as lists of lists, and operations on sets of sets can also then be defined as operations on lists. For example, the following are common operations on objects (represented here as O1, O2, O3, ....) and sets represented here as S1, S2, S3, ...): (1) Test whether O1 is a member of the set S1 (2) Test whether the set S1 is a subset of the set S2 (3) Given an object O1 and a set S1, containing O1, create set S2 containing all the elements of S1 except O1 (4) Given two sets S1 and S2, create the "union" set S3, containing all the elements that are EITHER in S1 or S2 (or both). (5) Given two sets S1 and S2 create the "intersection" set S3, containing all the elements that are in BOTH S1 and S2. (6) Generalize (4) and (5) as follows: given S, a set containing sets S1, S2, S3 ... Sn as its elements, create the union of all of S1 to Sn or the intersection of all of S1 to Sn. (7) Given two sets S1 and S2, create the "difference" set S3, such that S3 contains all the objects in S1 that are not in S2 (8) Given an object O1 and a set S containing sets S1, S2,... Sn, create a new set of sets containing all and only the sets that contain O1. (9) Define a predicate (a procedure returning a boolean --- true or false --- result) which tests whether O is a member of S. (10) Define a predicate which tests whether S1 is a subset of S2. (11) Define a predicate which tests whether S1 intersects S2, i.e. whether they have any elements in common (so that their intersection is not the empty set). (12) Define a predicate which tests whether S1 and S2 are the same set, i.e. contain exactly the same elements (not necessarily represented in the same order in the lists representing the sets). (13) Given a set S1 and a predicate P create a set S2 containing all the elements of S1 that satisfy the predicate P (i.e. the set of all the objects such that P returns true when applied to them). (14) Given a set S and a predicate P, define a new predicate that returns true if and only if every object in S satisfies P. (15) Given a set S and a predicate P define a new predicate that returns true if at least ONE object in S satisfies P, and false if none do. (16) Given a set S and a predicate P define a counting procedure that returns an integer N such that N is the number of objects in S that satisfy the predicate P. (17) Given a set S some of whose elements are sets, and some of them have elements as sets, and some of those sets have elements of sets, i.e. so that sets can occur to arbitrary depth, create a new set containing all the "atomic objects" (i.e. the non-sets) that are either in S or in one of the elements of S or one of the elements of the elements of S or .... (In short if we think of S as a tree of sets, create the set of the "leaves" or "tips" of the tree, sometimes called its "fringe". These are just illustrations of operations on sets that are often found to be useful. There are many more. You could try to extend this list. AI programming languages often provide built-in or library procedures that perform these tasks. However, it often turns out that a particular operation that you require is not available, so it is useful to develop the techniques required for building procedures to perform such operations on sets. This will also help to develop general programming techniques. The exercises in this teach file, in TEACH SETS2 and in TEACH LISTQUESTIONS are designed to help you develop these techniques. -- Differences between lists and sets --------------------------------- Notice that lists are not exactly like sets: for example the items in a list have an order, whereas sets are just collections of things without any particular order. Further, lists may have repeated elements, as in [Fred John Mary Fred Sue] whereas sets contain each item only once. For these reasons the use of lists to represent sets requires some care. For example you can't simply form the union of two sets S1 and S2 (i.e. the set containing everything that is in at least one of S1 and S2) simply by concatenating the corresponding two lists, for that could produce repeated elements, as in [Tom Dick Sue] <> [Dick Mary Joe] => ** [Tom Dick Sue Dick Mary Joe] Similarly, because the order of elements is immaterial in sets, the two lists [Tom Dick Sue] [Sue Tom Dick] would represent the same set. So you cannot use the equality tester "=" or the match operator "matches" to test for whether two sets are the same. [Tom Dick Sue] = [Sue Tom Dick] => ** -- Prerequisites for this teach file ---------------------------------- Before trying these exercises, you should be fluent in the use of the editor and know how to mark ranges in files in order to copy them, load them etc (TEACH * MARK, TEACH * LMR). It will also be useful if you worked through the following: TEACH * RESPOND teaches the use of conditionals. TEACH * LISTS, * DEFINE, * VARS, * MATCHES, * ARROW, The section on lists in TEACH * PERCENT. TEACH * ARITH introduces some "loop" instructions TEACH * RECURSE1 introduces recursion You may also find that TEACH * STACK is useful. -- A first example ---------------------------------------------------- Suppose we want to store information about several people, declaring a variable for each person, and then assigning a list of descriptions to that variable. E.g. if John is male, happy, single, a clerk and tall, then we could do the following. vars john; [male happy single clerk tall] -> john; and similar things for other people. We could then have a list called "people" with the names of all the people, e.g. vars people; [john ethel mary fred] -> people; You could declare variables and construct suitable lists of properties for the other people. (For now, let's not worry about whether this is a GOOD way to represent information about people. Choosing a good representation for a particular task is often very difficult, and there are very many different ways the same informaion can be represented in a computer.) We now have the following situation: the variable "people" contains a list of names of people, and for each person the name is also a variable containing a list of properties of that person. For example, The variable "john" represents information about about John. We should now be able to write programs to examine those lists, in order to answer questions, for example: Does John have the property "tall" ? Does Ethel have the properties "rich" and "clever"? Does Fred have either the property "rich" or the property "clever"? Additional tasks that you might require a program to peform would be such things as "find all the people who are tall" (i.e. make a list of their names), or "find all the people who are tall and either rich or happy". Try thinking about how these tasks break down into sub-tasks. This will help you anticipate some of the problems discussed in the rest of this file. -- Checking whether a person has a property --------------------------- How can a program check whether a person has a property (using the above representation of properties)? We see immediately from the list assigned to "john" that John is tall, but that it is not known whether he is fat. How? When asked, a person might say something like the following: "Well I look at the list, starting at the left hand end and going along the list to the right, until I come across the appropriate word. If I get to the end of the list before finding the word, I know the property isn't in the list." Pop-11 includes a built in procedure member, that can be used to perform this test (try it): member("tall", john) => ** member("fat", john) => ** For now, suppose the procedure member did not exist, and you therefore had to define a procedure called "iselement" that behaved like member. How would you do it? -- Defining iselement with a "for" loop ------------------------------- One easy way is to use the built-in "for" loop mechanism that Pop-11 provides, that enables you to use the format for in do endfor; So you could define a procedure called iselement thus define iselement(item, list) -> result; ;;; take in one item, and a list, and return a truth value, ;;; true if the item is in the list, otherwise false. vars element; for element in list do if element = item then ;;; found it true -> result; return(); ;;; leave the procedure now endif; endfor; ;;; failed to find the item in the list so false -> result; enddefine; (Not all the semi-colons are essential - those immediately before closing brackets like "endif", "endfor" and "enddefine" can be omitted.) Type that in and test it with various combinations of arguments: iselement(3, [1 2 3 4]) => iselement(3, [1 2 3]) => iselement(1, [1 2 3 4]) => iselement(3, []) => ;;; always check the "empty list case vars john; [male happy single clerk tall] -> john; iselement("tall", john) => -- Re-doing iselement as a recursive program -------------------------- Now consider how you might have written the procedure iselement if Pop-11 had not contained any looping constructs, like "for ... endfor" or "while .... endwhile", "until ... enduntil" etc. We are going to assume that "if" is available (as introduced in TEACH RESPOND), and that procedures can be defined and "called", not only by other procedures, but also by themselves. We'll see that, as if by magic, the procedure iselement can call itself, in order to get the same effect as a looping instruction. This is called "recursion". Later you will learn that recursion is useful in some cases where loops would be very clumsy. Before proceding you must know about output local variables (as explained in TEACH DEFINE). You will also need to know how to use the procedure HD to get the first element of a list, and TL to get the "tail" i.e. a list composed of all but the first element, as explained in TEACH LISTS. Thus, if "list" is a variable whose value is a list, and "item" is another variable, whose value is anything (e.g. a word) the expression item = hd(list) is a "boolean" (i.e. true or false) expression that will be true if the first element of list is the same thing as item, otherwise false. (Experienced programmers may like to look at HELP * EQUAL to see how POP-11 deals with different kinds of equality.) Using these ideas, let's work up to a new definition of the procedure iselement. Imagine the procedure iselement is running and that it has input local variables item and list, and an output local variable called result, and the input variables have been given values thus: vars item, list; 3 -> item; [1 2 3 4] -> list; How can we check if item is in the list? We can first see if the following is true item = hd(list) is it? If it is true, then all the procedure iselement has to do is to assign true to its output local variable, and then stop. Otherwise, if it isn't true, then it can do the same test on the tail of the list, i.e. item = hd(tl(list)) If that is true it can assign true to result and stop. If not then it has to check the next thing in the list item = hd(tl(tl(list))) But how many times will it have to do this? Without knowing the lengths of the lists we can't tell. And if we want the procedure to work on lists of ANY length, then we need some way of going on indefinitely. Now, if the procedure we are defining already has an instruction to compare item with hd(list) (as above), then instead of explicitly looking at the head of the tail of the list and then the head of the tail of the tail, etc, it can just call itself "recursively". There's the magic (until you get used to it). Thus we can come up with the following schema (which has a bug, but will work in some cases): define iselement(item, list) -> result; if item = hd(list) then true -> result; elseif iselement(item, tl(list)) then ;;; the item was found (somewhere) in the recursive call, so true -> result; endif; enddefine; Try defining that. Notice that the embedded (recursive) call of iselement has to take TWO arguments, because at the beginning it is being defined as a procedure that takes two arguments. Notice also that the recursive call can succeed (produce TRUE) not just because the very next element in the list is the one It should already be clear that something is wrong, because there is no case in which FALSE is assigned to the result. Nevertheless, you can compaile that procedure then test it: iselement(3, [1 2 3 4]) => iselement(3, [1 3]) => So far so good. You can even see how it works if you trace the procedure and redo those commands. trace iselement; iselement(3, [1 2 3 4]) => etc. But suppose you search for something that isn't in the list - what happens? iselement(99, [1 2 3]) => > iselement 99 [1 2 3] !> iselement 99 [2 3] !!> iselement 99 [3] !!!> iselement 99 [] ;;; MISHAP - NON-EMPTY LIST NEEDED ;;; INVOLVING: [] ;;; DOING : hd iselement iselement iselement iselement ... When iselement failed to find 99 at the head of the original list, it went into recursion and tried on the tail. So it then failed to find it at the head of that, and recursed again. Eventually it tried to compare 99 with the head of the empty list. But hd([]) produced an error (see TEACH LISTS). And the procedure aborted. Clearly the procedure iselement has to be able to cope with the "terminating" case when it gets the empty list as its second argument. You might think that all you need is to add an extra "else" clause to deal with the case where list is the empty list, thus: define iselement(item, list) -> result; if item = hd(list) then true -> result; elseif iselement(item, tl(list)) then true -> result; else false -> result; endif; enddefine; You can try that and test it on the same examples as before, and you will find that it still eventually produces an error when the item is not in the list, because it eventually gets down to the empty list and tries to take the hd of the list. I.e. the "else" clause comes TOO LATE in the procedure - at least too late to handle the empty list, because the error produced by testing hd([]) has already occurred. It is possible to get round this using the matcher, which doesn't produce an error, but can do the test. -- Using the matcher to define iselement recursively ------------------ Instead of if item = hd(list) then we can do if list matches [^item ==] then This condition is true if the list starts with item, no matter what comes after it, even if nothing comes after it. This is a bit wasteful, as we have to build a pattern containing item every time, in order to do the comparison, but never mind that for now. Try the following and test it: define iselement(item, list) -> result; if list matches [^item ==] then true -> result; elseif iselement(item, tl(list)) then true -> result; else false -> result; endif; enddefine; If you try that you will now find that instead of getting an error message from hd saying NON-EMPTY LIST NEEDED you will get it from tl, because of the line containing tl(list). You could overcome that problem by again using the pattern matcher and using an extra variable to dig out the tail of the list, thus define iselement(item, list) -> result; vars rest; ;;; new variable for tail of list if list matches [^item ==] then true -> result; elseif list matches [= ??rest] and iselement(item, rest) then true -> result; else false -> result; endif; enddefine; Notice the use of "and" to join two conditions to form a more complex condition. Now test it as before iselement(99, [1 2 3]) => iselement(3, [1 2 3 4]) => iselement(3, []) => iselement("tall", john) => iselement("fat", john) => It works, but it's pretty clumsy. It needs the extra local variable (rest) for the second call of the matcher and we have to make a special list containing the item for the first test. We can avoid all that if instead of starting by testing the head or the tail of the list, we check whether it has a head or tail. I.e. FIRST see if the list is empty. If it is empty, what should the result of iselement be? -- How to do it ------------------------------------------------------- Try completing the following schema. DEFINE ISELEMENT(ITEM, LIST) -> RESULT; IF THEN false -> result; ELSEIF THEN true -> result; ELSE ENDIF ENDDEFINE; So, at each stage of the process there are three options, either the list is empty, or the item is the first element of the list, or it is not and we have to try the tail! The above schema represents these options, using the IF statement. By putting the test for the empty list near the beginning we make sure that if it IS empty, the procedure stops and does not try testing the hd or the tl of the list. That is one of many possible schemas for the required procedure. If you have managed to define a procedure, test it: iselement(3, [ 1 2 3 4])=> iselement("tall", john) => iselement("fat", john) => These tests should each produce ** or ** If you did not manage to complete that schema for iselement successfully, look at TEACH SETS1. If you find the example hard to follow you could try TEACH RECURSE1 for some more examples and explanation. -- Words need to be quoted to refer to themselves --------------------- The list john included the word "single". So try this john=> iselement(single, john) => You will find that Pop-11 declares "single" as a variable, gives it an "undef object" as a value, and then iselement fails to find the undef object in the list john. So its result is false. In order to get the desired effect, i.e. searching for the word "single" itself in the list, rather than its value as a variable, you have to enclose it in double quots (i.e. WORD quotes, not string quotes): iselement("single", john) => It should now work. Can you explain why iselement("single","john") => does not work? If not, ask for help. The difference between a word and a list is crucial. Compare iselement("single", valof("john")) => That works, because, if "john" is the name of a variable, then valof("john") and john are equivalent Pop-11 expressions denoting the value of the variable. (Actually, if you use sections later on you will find that they are not equivalent in all contexts.) Try experimenting with ISELEMENT, creating several more variables like "john" and checking that the ISELEMENT process does in fact work. E.g. vars mary; [female happy married teacher thin tall] -> mary; iselement("male", mary) => -- Defining the procedure ismale -------------------------------------- If your program frequently needs to check whether something has the property male, then you can define a procedure called "ismale" to do this: define ismale(person) -> result; ;;; Person is a list of words. Return true if the ;;; list contains "male" otherwise return false. iselement("male", person) -> result; enddefine; Test it: ismale(john) => ismale(mary) => Define a similar procedure called "ishappy" which takes a list of words as input and returns true if the list contains "happy" otherwise false. -- A procedure to test for two properties: happyman ------------------- Try writing a procedure called HAPPYMAN which takes a single argument, a list, and returns TRUE if that list represents a happy man. I.e. it should test for whether the list contains both "happy" and "male". The easiest method is to use the two procedures ismale and ishappy, defined in terms of iselement. You could also try defining it directly in terms of iselement. Either way the following schema should work: DEFINE HAPPYMAN(LIST) -> RESULT; IF THEN IF THEN true -> RESULT ELSE false -> RESULT ENDIF ELSE false -> RESULT ENDIF ENDDEFINE Test your procedure: happyman(john)=> happyman([ male old hungry])=> happyman(mary) => Some questions: (a) Why did the above schema need TWO "else"s. (b) The nested IF ... ENDIF expression does a test, and if it is true assigns true to result and if it is false, assigns false to result. Can it be written so that it achieves the same thing without using IF at all? (c) Can you write the whole procedure without using IF at all. Just use "and", and assign the result of "and" to result? Use the fact that if is an expression and is an expression, then and is also an expression with a result that is FALSE if both and have the value false, otherwise it has the same value as . So you can use imperatives of the form and -> result; (If is false, the whole thing is false without being evaluated. This is often useful for preventing an error that will occurr if the preconditions of are not satisfied.) ("OR" is another useful infix "boolean" operator. See HELP * AND, HELP * OR, HELP * BOOLEAN) -- Finding an item with a property ------------------------------------ Suppose we had a list of people thus: vars people, john, ethel, mary, fred; [john ethel mary fred] -> people; [male happy single clerk tall] -> john; [female unhappy single prime_minister ] -> ethel; [female happy married teacher thin tall] -> mary; [male small typist old] -> fred; where each element of the list of people has been declared as an ordinary variable and assigned a list representing the person. A question one might want to ask is "do you know of a happy person?" or "do you know of a clerk?". The answer should be either the name of the person or FALSE. More precisely, suppose we want to define a procedure called "findone" that takes a word, representing a property, and a list of names of people, and then returns true or false, thus: findone("male", people) => ** john findone("old", people) => ** fred findone("female", [john fred]) => ** false The procedure findone must look at each item in the list (its second argument), and can use iselement to see if the property name is in the corresponding list of properties. But how does it get the corresponding list of properties? The items in the list are WORDS, and there are lists of properties corresponding to them as their VALUES. To deal with this you need to know about the built in Pop-11 procedure "valof", which, when given a word that has been declared as a variable, returns its value. -- Using valof -------------------------------------------------------- Assuming that the above declarations and assignments have been compiled, try the following: "john" => john => valof("john") => valof(john) => It should be clear that the second and third of these have the same effect. Now consider vars item; "john" -> item; What will the following print? (Try to work it out before trying): item => valof(item) => valof("item") => VALOF produces an error if you apply it to anything but a word. Now try all these: vars list; [john ethel] -> list; "list" => list => valof("list") => hd(list) => valof(hd(list)) => valof(list) => If you are not thoroughly confused, you should now have the following model (rembering that inside list brackets words are quoted, not replaced by their values -- unless you use "^" or "^^" or percent signs): "list" | +--valof--->[john ethel] | | | +-valof-->[female unhappy single prime_minister] | +-valof-->[male happy single clerk tall] The list called -people- can be thought of in the same way, except that it would require a bigger diagram. This if our procedure findone is to search down a list of words naming people, then test for the corresponding property, it has to use valof on each name to get at the list of properties. -- A schema for FINDONE ----------------------------------------------- Although findone could be defined using "for ... endfor" like the first version of "iselement" above, it is a useful exercise to try using recursion. Below is a schema for it. You should be able to use the procedure iselement defined above, in the ELSEIF line. You will also have to use valof(hd(list)). Don't be worried by the fact both iselement and findall use a local variable "list". Because it is local in both cases, the two uses of "list" will not interact. Each procedure effectively has its own "private" version. Try completing this schema: DEFINE FINDONE(PROP, LIST) -> RESULT; IF THEN false -> result ELSEIF THEN -> result ELSE FINDONE(PROP, TL(LIST) -> RESULT ENDIF ENDDEFINE; -- How to specify a procedure ------------------------------------------ In thinking about designing a procedure like FINDONE you have to have a very clear idea of what you are trying to do. Produce a specification for the procedure before you try writing any Pop-11 code. The specification for a procedure should contain the following items, each set out with its own sub-heading, for clarity. (a) A specification of the type(s) of arguments (if any), e.g. "it takes a list of words and an integer" "it takes a list (of anything) and a procedure of one argument producing an integer as a result" etc. (b) A specification of the type(s) of outputs (if any) e.g. "it produces a list of integers" "it produces either false or an integer" "it produces two results, a list and a boolean (true or false)" etc. NB the specification of types of inputs and outpus should not be concerned with what they look like when typed in or printed out. (E.g. it is not relevant to talk about commas and spaces and list brackets: these are not objects that will be in the machine when the procedure runs). So the arguments should be specified in terms of the kinds of objects (integers, words, lists, lists of lists, etc) that will be put on the stack as input(s) for the procedure, and similarly the outputs. In specifying the TYPES of inputs and outputs you don't say how the output is derived from the input. That comes in the next section when you say WHAT the procedure does. (c) A description of WHAT the procedure does - without saying HOW it does it. This should say how the results are related to the input, without describing the method by which the results are produced. This section should also include a specification of any side-effects the procedure may produce (i.e. effects it has other than taking arguments off the stack or putting results on the stack. Side-effects could include reading something from the terminal, printing something to the terminal, changing a file, changing the value of a global variable, such as "database") - though in general changing global variables is something to be avoided if possible, as it can lead to obscure errors. (d) Some typical test cases, with an indication of what the procedure does with them (not how it does it). The test cases should include "limiting" or "extreme" test cases, e.g. empty lists as input, lists of only one argument, lists where the item searched for is at the beginning, lists where it is at the end, lists that don't contain what is searched for, etc. Selecting a good range of test cases is not always easy, but is crucial to good design. The test cases can be described using simulated program executions, e.g. iselement(99, []) => ** iselement(99, [99]) => ** iselement(99, [a b 3 4 c d]) => ** iselement(99, [a b 96 c 97 d 99]) => ** iselement(99, [a [99] b]) => ** and so on. (e) When you have specify all that you can begin to write a description of HOW the procedure should work in order to meet its specification. E.g. iselement takes a word W and a list L and it searches down the list checking to see if any item in the list is the same as W. If it finds such an item it terminates the search and returns TRUE as its result. If it gets to the end of L without finding W it returns FALSE as its result. Notice that the specification does not say anything about the programming language used, it does not include any bits of code, and it is written using concepts that should be clear to someone who does not know the programming language. -- Specifying FINDONE ------------------------------------------------- (a) Types of arguments FINDONE takes two arguments, a word and a list of words, naming people. The words in the list should already have been declared as Pop-11 variables and their values should be lists of words indicating properties of people. (b) Type of output FINDONE produces either FALSE or a word as its output (c) What the procedure does. Given a word W and a list of people's names P it produces the first name in P that denotes a list containing W. If there is no such name it produces FALSE. It has no side effects. (d) Typical test cases findone("happy", []) => ** ;;; assume lists -john- and -ethel- etc exist as defined above findone("female", [john ethel]) => ** ethel findone("rich", [john ethel]) => ** vars people; [john ethel mary fred] -> people; findone("typist", people) => ** fred findone("angry", people) => ** Now try writing the code for FINDONE, so that it meets this specification, using the SCHEMA presented above, or any other method you like. When finished write down a description of HOW it does it, along the lines illustrated with the description of iselement under (e) in the previous section. -- FINDALL ------------------------------------------------------------ It's harder to answer questions of the form "Find all the people who are clerks", for example: findall("male", people) => ** [john fred] findall("rich", people) => ** [] In order to solve this task you have to know how to traverse a list, testing each element, and make a list of all those that pass the test. -- Digression: findintegers ------------------------------------------- Here, for example, is a procedure that takes a list and returns a list of all the integers in it. We use the built in procedure "isinteger" which produces TRUE if applied to an integer, and FALSE if applied to anything else. (An integer is a "whole" number, positive or negative, e.g. 0 3 999 -5 -392, but not 3.5 or -77.111). define findintegers(list) -> result; vars item; [% for item in list do if isinteger(item) then item endif endfor %] -> result enddefine; Notice that the line if isinteger(item) then item endif puts things on the stack if and only if they are integers, and the list brackets with percents [% ... %] make a list of everything put on the stack, between them. Because the percents are used, the list brackets do not quote all the enclosed text: instead it is treated as program and run. See the list examples in TEACH PERCENT. Try that out findintegers([1 2 3 cat mouse 99 6]) => ** [1 2 3 99 6] findintegers([one 1 two 2]) => ** [1 2] findintegers([one two three]) => ** [] Try some other tests. E.g. what should it do with the empty list? Doing it recursively is a bit more tricky if you have not met the idea previously. Try completing the following schema, replacing the dots and the bits of English in angle brackets, with bits of Pop-11: define findintegers(list) -> result; if list = [] then .... -> result elseif isinteger(hd(list)) then -> result else .... -> result endif enddefine; Notice that there are two recursive calls here. Try completing that schema, and then test it on the above examples, with tracing where necessary to help you understand what is going on. You can do it with only one recursive call if you use an extra nested conditional and modify it thus: define findintegers(list) -> result; if list = [] then .... -> result else findintegers(tl(list)) -> result; if isinteger(hd(list)) then endif endif enddefine; Once again, try completing that schema, and then test it on all the previous examples of findinteger. -- Now let's return to findall ---------------------------------------- If you've forgotten what findall was supposed to do, look back at its definition. You can search back with the VEd command \findall You could try using the "for item in list" format with [% ... %] to make a list of all the things found that satisfy the test of the form if iselement(prop, valof(hd(list))) then as in the iterative definition of findintegers. Alternatively try doing it recursively, on the model of the recursive definition of findintegers. Remember that before you start trying to write the Pop-11 definition for the procedure it is essential to be clear about what you want the procedure to do. Follow the example given above, and write the specation in steps (a) to (e) as was done above for "iselement" (and "findone"). Examine all the test cases carefully and make sure that your plan for it covers all the cases. Make sure that it always returns a list, even if it's the empty list. If you use recursion make sure that you give the recursive call the correct types of inputs (see the specification for the inputs), and also make sure that you use the result of the recursive call properly (i.e. the list it produces). -- A schema for FINDALL ----------------------------------------------- Here is a suitable recursive schema for findall. Remember that you'll have to use iselement as well as the procedure valof on the hd of the list (as in findone, above). DEFINE FINDALL(PROP, LIST) -> RESULT; IF THEN [] -> RESULT; ;;; i.e. empty set is result ELSE -> result; IF THEN ENDIF; ENDIF; ENDDEFINE; Try testing it, with findall("female", people) => and other examples. To see what is going on trace findall and iselement. -- If you need help --------------------------------------------------------- If you didn't succeed, try completing this: DEFINE FINDALL(PROP, LIST) -> RESULT; IF THEN [] -> RESULT; ;;; i.e. empty set is result ELSE FINDALL(PROP, TL(LIST)) -> RESULT; IF THEN [^(HD(LIST)) ^^RESULT] -> RESULT; ENDIF; ENDIF; ENDDEFINE; N.B. VALOF is needed as before in the second condition, so that, for example, ISELEMENT looks in the list associated with "john", not in the word "john" itself. Test your procedure. Try Tracing it, and ISELEMENT, when you do. -- It still doesn't work? --------------------------------------------- If you didn't manage, try typing in this define findall(prop, list) -> result; if list = [] then [] -> result else findall(prop, tl(list)) -> result; ;;; Got all the things from the tail. Check whether the hd ;;; of the list needs to be added to the front of result if iselement(prop, valof(hd(list))) then ;;; create a new result list, using the hd of the list ;;; and the old result, got from the recursive call [^(hd(list)) ^^result] -> result endif; endif; enddefine; Then try: trace findall; findall("happy", people) => findall("female", people) => Try other tests. If you haven't traced ISELEMENT trace it too - though you may get a lot of printout. -- A set of properties: findset --------------------------------------- A yet harder task is to find ALL the people with a set of properties, for example: findset([male happy], people) => In order to do this you will need to define a procedure that takes a list of properties and a person (in the form of a list) and then returns true if the person has all the properties. E.g. suppose we call it has_all, then has_all([male tall], john) => ** has_all([female tall], john) => ** What should we do if the first list is empty? has_all([], john) => It turns out to be convenient to assume that if increasing the list of properties provides a tougher, more restrictive, test, then having an empty list of properties means that everyone passes the test. I.e. has_all([], john) => ** And the same would apply to mary, ethel etc. The only really tricky bit is . You may find it useful to define a procedure called ISSUBSET which checks if every element of one list is in another. It should take two sets and see if all the elements of the first are ISELEMENTs of the second. can then write: ELSEIF ISSUBSET(PROPLIST, VALOF(HD(THINGLIST))) Try defining ISSUBSET iteratively (using "for item in littleset do ...") and recursively. -- Help with recursive issubset --------------------------------------- Here is a recursive schema for ISSUBSET: DEFINE ISSUBSET(LITTLESET, BIGSET) -> RESULT; IF THEN ELSEIF THEN IF THEN ELSE ENDIF ELSE ENDIF ENDDEFINE; Try completing that. Test it, then use it to define FINDSET. Then test FINDSET, tracing everything. -- Some advice on testing --------------------------------------------- An important point: whenever you define a new procedure TEST it. Try to make sure it works on all the limiting cases, e.g. empty lists, lists with "target" items at the beginning or at the end, lists with only one item, etc. Try to make sure you have covered all the significantly different cases in your tests. Plan those tests BEFORE you write the proceudre definition, so that you can use them in thinking about what the procedure has to do, i.e. you can make sure it covers all the different cases that can arise. NEVER type in a long list of procedures and then try testing them all by running one high level procedure that runs all the others. That will ensure that you waste a lot of time searching for obscure bugs. Keep a set of test cases in the file, e.g. in comment brackets /* test1 test2 ..... */ so that you can conveniently re-run the tests if you change the program. ALWAYS re-run them after every change. Make sure that the first tests are SIMPLE ones, so that if procedure doesn't work on them it should be easier to find out why not. By including the test cases in a comment, you can safely load the whole file without running the tests every time. -- Returning to FINDSET ----------------------------------------------- What should we get with an empty list of properties: findset([], people) => It all depends on how you interpret that command. Depending how you had defined your procedures, POP-11 is likely to interpret it to mean find the set of people without any constraint at all, i.e. it will produce a list of ALL the people. This is quite natural if you think of the fact that adding more properties will REDUCE the list of people. E.g. if you give the property "happy" then that selects the subset that are happy. If you give the properties "happy" and "male" then the second property will (usually) select an even smaller subset from the first. So if you have no properties it should not reduce the list at all. The above definition of issubset is consistent with that line of thought. The procedure FINDSET takes a list of properties and a list of things. It is to 'return' a list of those things which have ALL the given properties. Here is a recursive schema for FINDSET: DEFINE FINDSET(PROPLIST, THINGLIST) -> RESULT; IF THEN ELSE ;;; recursively call FINDSET on the tail -> RESULT; ;;; Now use issubset to see if the head of THINGLIST should ;;; be included in the result IF THEN -> RESULT ENDIF ENDIF ENDDEFINE; It may look VERY odd that you have to work on the tail before the head. At first it may appear that the procedure can't do anything because nothing can get added to the result till everything else has been added. But the crucial point is that ultimately the recursion gets down to the empty list [], and this final recursive call returns a result of [], after which it works back through thinglist from right to left as the recursion unwinds, adding items to the result if they satisfy the test. You should be able to see this by tracing the procedure. (The "sum" example in TEACH RECURSE1 is similar.) Try completing that, and test it if you succeed. -- If you needed help ------------------------------------------------- If you didn't manage try: DEFINE FINDSET(PROPLIST, THINGLIST) -> RESULT; IF THEN [] -> RESULT ELSE ;;; recursively call FINDSET on the tail -> RESULT; ;;; Now use issubset to see if the head of THINGLIST should ;;; be included in the result IF ISSUBSET(PROPLIST, VALOF(HD(THINGLIST)) THEN [^(HD(THINGLIST) ^^RESULT] -> RESULT ENDIF ENDIF ENDDEFINE; -- A different findset ------------------------------------------------ Can you understand this alternative definition of FINDSET, which makes use of the previously defined FINDALL: define findset(proplist, thinglist) -> result; if proplist = [] then thinglist -> result else findset(tl(proplist), findall(hd(proplist), thinglist)) -> result endif enddefine; (This assumes that you have already defined FINDALL) Try defining and testing it, tracing everything to see how it works. Incidentally, the Pop-11 system includes a procedure MEMBER, which does what your procedure ISELEMENT does. So you don't really need to define your own. You can change all occurrences of ISELEMENT above to MEMBER (except in the definition of ISELEMENT). -- Further reading ---------------------------------------------------- Several teach files have been mentioned already, including TEACH * LISTS, * RECURSE1 Additional exercises on list processing can be found in TEACH * RECURSE2 TEACH * RECURSE3 TEACH * LISTQUESTIONS TEACH * SETS2 -- takes you through a collection "set" operations. There are additional list processing exercises in the books by Laventhol (e.g. chapter 5) and Barrett, Ramsay and Sloman (e.g. chapter 7) listed in the file HELP *POPREFS --- C.all/teach/sets --- Copyright University of Sussex 1989. All rights reserved. ----------