These notes were written in response to discussions arising out of a workshop on intelligent robotics at BT in April 1998. See
http://www.labs.bt.com/people/mulhaug/docs/currentwork/workshop/workshop.htm

DRAFT

SOME NOTES ON COMBINATORIAL SEARCH AND HOW (NOT?) TO TAME IT

Aaron Sloman
Last modified 5 Feb 2000

CONTENTS:
-- WHAT IS COMBINATORIAL SEARCH?
-- COMBINATORIAL AND NON-COMBINATORIAL SPACES
-- SETS OF RELATED COMBINATORIAL SPACES
-- HOW (NOT?) TO TAME COMBINATORIAL SEARCH
-- -- (1) Direct the search towards the required solution
-- -- (2) Find ways to transform a search space into a smaller one
-- -- (3) Find a new powerful form of representation
-- -- (4) Find a transformation to add new structure enabling hill-climbing
-- -- (5) Find a clever algorithm which computes the result without search
-- -- (6) Use constraint manipulation techniques
-- -- (7) Use stochastic as opposed to deterministic methods
-- -- (8) Weaken the criterion for a solution
-- -- (9) Divide and conquer
-- -- (10) Use domain-specific knowledge
-- -- (11) Use chemical DNA-based machines to achieve enormous parallelism
-- -- (12) Use human or human-like spatial/visual reasoning abilities
-- GENERALITY vs DOMAIN SPECIFICITY (domain specific vs topic neutral)
-- "LAWS OF FORM" MAY DEFEAT COMBINATORICS
-- DIFFERENT TYPES OF LAWS OF FORM
-- INTERACTING TRAJECTORIES IN "DESIGN SPACE" AND "NICHE SPACE"
-- DESIGN SPACE AND NICHE SPACE VS COUPLED FITNESS LANDSCAPES
-- VIRTUAL MACHINES IN THE BIOSPHERE
-- EXAMPLE OF A TRADEOFF: HELPLESS VS COMPETENT INFANTS
-- REFERENCES
-- ACKNOWLEDGEMENTS

-- WHAT IS COMBINATORIAL SEARCH?

The phrase "combinatorial search" refers to search in a space defined by possible combinations of components in a structure (linear or non-linear). I.e. it is search in a combinatorial space. Items within the space may vary according to their structure, or according to the values of components (e.g. weights, parameters) within a structure, or both.

How a space is searched is a separate question from whether it is combinatorial or not. Many search strategies are possible in ANY search space.

-- COMBINATORIAL AND NON-COMBINATORIAL SPACES

A non-combinatorial search would be a linear search through a fixed store of data items, such as a set of book titles. In some cases the fixed store might have the same contents as a combinatorial space, e.g. the set of all book titles of up to six words that can be generated by a simple grammar. However, typically a pre-existing store can be preprocessed and embedded in a mechanism which facilitates search so that answers to questions are found at great speed (e.g. using database technology), depending on the nature of the question. E.g. if the question is "is there an item matching pattern P" the indexing may give a direct and immediate answer, if it was designed to anticipate such patterns. However if the question is "find the item which gives the highest value when function F is applied to it", the index may not help, in which case searching through items in the set will be required. Whether the search has to be exhaustive or not will depend on the structure of the set and the nature of the function.

In general the items in a combinatorial search do not all exist in some database or other indexed structure in advance of the search, but are created during the search. Example tasks are: find a set of prime numbers less than 50 which add up to 100; find a program which discovers maximal cliques in binary graphs.
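To make the first example task concrete, here is a small illustrative sketch in Python (a brute force enumeration; the details are purely illustrative): candidate subsets are created during the search rather than being stored in advance, and the space roughly doubles with every extra prime admitted.

    # Illustrative sketch of the example task above: find sets of prime
    # numbers below 50 which add up to 100, by brute force enumeration of
    # the combinatorial space of subsets (about 2**15 subsets to try here).

    from itertools import combinations

    PRIMES_BELOW_50 = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

    def prime_sets_summing_to(target):
        """Exhaustively enumerate subsets of the primes, smallest subsets first."""
        solutions = []
        for r in range(1, len(PRIMES_BELOW_50) + 1):
            for subset in combinations(PRIMES_BELOW_50, r):
                if sum(subset) == target:
                    solutions.append(subset)
        return solutions

    if __name__ == "__main__":
        sols = prime_sets_summing_to(100)
        print(len(sols), "solutions, e.g.", sols[0])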
Combinatorial search seems to characterise biological evolution. For instance new DNA combinations, new morphologies for organisms, new combinations of behavioural competence, new types of information processing competence, were all produced during biological evolution. Evolution involves a vast combinatorial search through a number of different spaces. (See my discussion of design space and niche space in [1].)

Combinatorial spaces typically have a size which is an exponential (or worse) function of the number of components of the structure (whose size need not be bounded in advance). For example if the solution to a problem requires N decisions involving selection between two items then the size of the space is 2**N. If the size of the solution is not known in advance the actual space through which search processes can travel may be very much larger, or even infinite. E.g. when searching for a proof of a mathematical theorem you do not normally know in advance how long the proof is likely to be, and since proofs (in predicate calculus) can be arbitrarily long the search space is infinite.

Searching for a value in a single continuous dimension involves a different kind of infinite search, since continuous sets are infinite, even in a bounded interval. But (paradoxically?) this kind of search is often (though not always) much easier to deal with because of the nice mathematical properties of the set of real numbers and some of the functions on that set. (E.g. if a function is known to have a single maximum then simple hillclimbing starting anywhere will find it.)

In more complex search contexts a lot of effort has to go into trying to tame the combinatorics. This can be done in several different ways. A first draft list of strategies for taming search is given below. It is certainly not complete and can probably be better structured. Moreover, since it reflects information I have picked up over many years in many contexts, I cannot easily provide references to examples, and would welcome help in producing an annotated bibliography.

-- SETS OF RELATED COMBINATORIAL SPACES

A preliminary consideration is that there are often different ways of looking at a problem involving different search spaces. In general combinatorial spaces are closely related to other combinatorial spaces generated in different ways. E.g. consider a grammar G and lexicon L, and the set S1 of all sentences of at most N items of L generated by G. There is one space consisting of the set S1, and another S2 consisting of the set of selections of rules in G and items in L to generate the sentences in S1. (I.e. the space of sentences and the space of derivations are closely connected.) A structurally ambiguous sentence might be generated in more than one way, e.g. "They saw the man with the telescope" can specify WHICH man was seen (the one with the telescope) or HOW he was seen (with the telescope). Here elements of S1 (sequences of words) and elements of S2 (derivations of sentences) are related, e.g. because each element of S2 might be a tree whose "fringe" is an element of S1. Since different trees can have the same fringe, a search where structure is important may involve a much larger space S2, but where it is unimportant, the search can sometimes be reduced by focusing on S1.

In general there will be many combinations of lexical items which are not legal sentences. So S1 will in general be a subset of S3, the set of all possible N element sequences of items from L.
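A toy illustration of the difference in size between S1 and S3 (the grammar, lexicon and numbers below are invented purely for illustration): enumerating the whole of S3 and filtering out the non-sentences is exactly the sort of wasteful strategy discussed next.

    # Illustrative toy comparison of S1 (grammatical sentences) with S3
    # (all sequences of lexical items of the same length).

    from itertools import product

    LEXICON = {
        "the": "Det", "a": "Det",
        "man": "Noun", "telescope": "Noun", "dog": "Noun",
        "saw": "Verb", "liked": "Verb",
    }

    # S -> NP VP, NP -> Det Noun, VP -> Verb NP gives five-word sentences
    # with this part-of-speech pattern:
    SENTENCE_PATTERN = ["Det", "Noun", "Verb", "Det", "Noun"]

    def is_sentence(words):
        return [LEXICON[w] for w in words] == SENTENCE_PATTERN

    N = len(SENTENCE_PATTERN)
    s3 = list(product(LEXICON, repeat=N))          # all 7**5 = 16807 sequences
    s1 = [seq for seq in s3 if is_sentence(seq)]   # only 2*3*2*2*3 = 72 are sentences

    print(len(s3), "sequences in S3,", len(s1), "in S1")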
Finding a way to focus on S1 without having to explore S3 can be very difficult, except by using S2, which may be smaller than S3, even though it is larger than S1.

Likewise if lexical items are composed of characters from an alphabet (including spaces) then each of S1 and S3 will determine a set S4 of sequences of characters. However if M is the length of the longest character sequence in S4 then there will be many sequences of characters of that length which contain non-words. I.e. if S5 is the set of all sequences of characters of length up to M, then S5 will be a superset of S4.

Given the existence of all these sets, searching for a sentence satisfying some set of conditions could in principle be done in any of these spaces, though it would normally be silly to search in S3, S4, or S5, because of the cost of eliminating large numbers of redundant items by showing that they are not legal according to G and L. Nevertheless the fact that a particular combinatorial space will normally be mathematically related to many other spaces can be the basis of techniques for transforming problems so as to reduce search.

-- HOW (NOT?) TO TAME COMBINATORIAL SEARCH

There are several different types of strategies for taming combinatorics, all generally requiring design creativity. Here are some examples. In general each technique works only for a restricted set of problems. Finding which technique is best for a particular problem or class of problems is itself a very difficult combinatorial search, and there may be no general technique for doing it! (Perhaps there is some kind of "diagonal" proof of impossibility?) Note that the strategies listed here are not all mutually exclusive. Sometimes a particular technique can be viewed in two different ways.

-- -- (1) Direct the search towards the required solution

This is one of the oldest and most widely used ideas in AI (in the form of evaluation functions, search heuristics, evolutionary fitness functions, etc.). For instance, given a road map and the problem of finding a route from town A to town B, one is searching in a space of partial routes. By growing routes from A by always choosing the option closest to the direction towards B it is sometimes possible to find a good route without wasting time on irrelevant roads, using the metrical properties of 2-D space to guide the route. However if there are many dead-ends (e.g. due to impassable mountain ranges, rivers, etc.) then this technique can fail.

In many cases directed search fails because the choices selected as best by the evaluation often lead either to dead ends requiring a great deal of backtracking, or else down an infinite path. It is frequently rediscovered that simply directing a search doesn't make the problem tractable except in special cases, e.g. in search spaces where good solutions have very large basins of attraction, or where good solutions with smaller basins of attraction are densely distributed, etc., or where a useful starting point happens to be chosen by luck. Even where there is a "good" route from some starting point to a solution, without any dead ends requiring backtracking, if the space is large enough then the good route might be too long to be traversed in the time available -- e.g. a route in a very large space spiralling ever closer to a solution might have to traverse most of the points in the space, despite always getting closer to the goal.

So making combinatorial search *directed* is a far from adequate answer except in special cases.
It is sometimes argued (e.g. by Kauffman in [2]) that Darwinian selection mechanisms, which essentially do hill-climbing in a space where neighbours are found by "simple" genetic changes such as mutation and crossover, would have very little chance of achieving organisms which now exist in the time available since the earth was formed, unless additional structuring principles were involved in controlling the search.

Combining directed search with other techniques (e.g. those listed below) often helps, but different techniques are needed in different classes of problems. Moreover finding a good way to direct the search (e.g. finding a good evaluation function or fitness function) itself often requires detailed analysis of the problem and a high degree of creativity.

-- -- (2) Find ways to transform a search space into a smaller one, to reduce the combinatorics.

This may involve any of a variety of techniques:

(a) Folding of one part of the space into another, if structures are repeated, or where differences are irrelevant, or where a solution in one part can easily be used to derive a solution in another part, e.g. using symmetries. (This can include the use of memo-functions which remember re-usable partial solutions for sub-spaces, sometimes also referred to as "structure sharing".)

(b) Pruning the space by selecting branches or sub-spaces which are known to be adequate to the task or which are representative of the sub-space discarded (sometimes called "refinement").

(c) Searching at a higher level of abstraction if the more abstract solution is sufficient for the task. E.g. replacing individual nodes in a search space with sets of nodes which are in some sense equivalent, and treating such a set as a node in a more abstract space, which will therefore be smaller. Instead of using a space of sets containing many nodes (extensional abstraction) it may be possible to use descriptions which are applicable to many nodes (intensional abstraction). Where sets are characterised by common predicates, the former becomes a case of the latter.

(d) Using the result of a more abstract search to guide (prune) the search in the original space (e.g. using proof plans in theorem proving, or using the straight line joining two cities to guide the search in a map for a sequence of roads leading from one to the other; a small sketch of this appears at the end of this subsection).

(e) Reducing the strictness of the requirement on the problem, which may enable a less thorough search to suffice (e.g. looking for a less than optimal solution to a problem). This may, for example, transform a search in a space of points into a search in a space of regions containing points of approximately equal value, where the total number of regions is far smaller than the number of points.

The development of new programming languages in computer science can be seen as an example of switching to more abstract but smaller spaces: the task of searching for a Lisp or C++ program to solve a problem is totally different from the task of searching for a machine code solution, even if the former is eventually translated into the latter by a compiler. Even going from one high level language to another can make a huge difference: as many programmers found in the mid 80s when they first learnt Prolog after trying to use Pascal to do AI, for instance.
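As an illustrative sketch of (1) and of (2)(d), the following toy Python fragment grows routes from A, always expanding whichever town looks closest to the goal by straight-line distance. The map data are invented; on less friendly maps the same strategy can be drawn into dead ends, as noted above.

    # Illustrative greedy best-first route search on an invented toy map,
    # guided by straight-line distance to the goal (cf. (1) and (2)(d)).

    import heapq
    import math

    # Toy map: town -> (x, y) coordinates, and roads between towns.
    COORDS = {"A": (0, 0), "B": (10, 0), "C": (3, 1), "D": (6, -1), "E": (3, 5)}
    ROADS = {"A": ["C", "E"], "C": ["A", "D"], "D": ["C", "B"], "E": ["A"], "B": ["D"]}

    def straight_line(p, q):
        return math.dist(COORDS[p], COORDS[q])

    def greedy_route(start, goal):
        """Frontier ordered only by estimated distance to the goal, so the
        route found need not be optimal, and dead ends can mislead it."""
        frontier = [(straight_line(start, goal), start, [start])]
        visited = set()
        while frontier:
            _, town, route = heapq.heappop(frontier)
            if town == goal:
                return route
            if town in visited:
                continue
            visited.add(town)
            for nxt in ROADS[town]:
                if nxt not in visited:
                    heapq.heappush(frontier, (straight_line(nxt, goal), nxt, route + [nxt]))
        return None

    print(greedy_route("A", "B"))   # ['A', 'C', 'D', 'B'] on this toy map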
-- -- (3) Find a new powerful form of representation, which produces an entirely new search space, or which makes a new strategy possible, e.g. one which avoids search, or reduces search, or allows a more rapid search.

An example is switching from a search in a space of incrementally grown partial solutions (e.g. partial routes) to a space of complete solutions (e.g. complete routes) where directing the search may be easier. (For an example see [3].) The history of mathematics includes many examples of powerful new representations. Many of the other techniques listed here can be based on a change of representation, e.g. moving to a higher level of abstraction requires finding a way to represent solutions, heuristics, paths, subspaces at the new level. Much ongoing work in computer science is concerned with attempts to find new representations and inference systems to reduce search spaces. E.g. linear logic programming, which allows each axiom to be used at most once in a search, can reduce search spaces. Whether such general techniques can have much impact on really intractable problems is another question.

-- -- (4) Find a transformation to add new structure enabling hill-climbing where it wasn't previously feasible.

Sometimes *enlarging* the space does this -- e.g. embedding a discrete space in a continuous one. (See Ref [4] for an example.)

-- -- (5) Find a clever algorithm which computes the result without search

The history of mathematics has many examples of this, e.g. an algorithm to find the highest common factor of two numbers without searching. Sometimes clever algorithms are linked to clever representations. E.g. if you had to form a collection of sets of numbers less than 100 you might consider lists of numbers, one for each set. Alternatively you can use bitstrings of length 100, where the Nth bit is 1 or 0 depending on whether N is in the set. Then operations like finding the union or intersection of two sets can be performed trivially. Some operations can use the fact that bitstrings can themselves be treated as numbers.

-- -- (6) Use constraint manipulation techniques

Many combinatorial spaces are defined by sets of options linked by constraints. A search problem could be to find a selection from each set, where selections from linked sets satisfy all the relevant constraints. This is a case where there are actually two spaces: S1 the space of all possible selections, and S2 the space of selections satisfying the constraints. Working systematically through the options in S1 checking the constraints may be intractable. Sometimes, however, constraint analysis can be used to remove options iteratively from nodes in the constraint network (e.g. using Waltz filtering). This can sometimes remove large subsets from the search space. But beware of phase transitions: in some cases constraint propagation can be more work than exhaustive search, e.g. where there are very many solutions so that most arbitrary selections satisfy the constraints.

(NOTE: Geoff Hinton showed in his PhD thesis (Edinburgh Univ. circa 1976) that where the initial constraint network is noisy, constraint propagation may eliminate "good" options. If such noise is a possibility then the search for a "correct" set of selections needs to be replaced by a search for "the best" or "a good" set of selections, making use of a cost function for violated constraints. Relaxation was used in his thesis.)
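A minimal sketch of the kind of filtering mentioned in (6), on an invented toy colouring problem: options are repeatedly deleted from nodes of the constraint network when they have no compatible partner at a neighbouring node (in the spirit of Waltz filtering / arc consistency).

    # Illustrative constraint filtering on invented toy data. Each variable
    # has a set of options; a constraint says which pairs are compatible.

    def allowed_not_equal(a, b):
        return a != b

    def filter_domains(domains, constraints):
        """Repeatedly delete any option with no compatible partner in a
        linked variable, until nothing more can be removed."""
        changed = True
        while changed:
            changed = False
            for (x, y), ok in constraints.items():
                for vx in list(domains[x]):
                    if not any(ok(vx, vy) for vy in domains[y]):
                        domains[x].remove(vx)
                        changed = True
                for vy in list(domains[y]):
                    if not any(ok(vx, vy) for vx in domains[x]):
                        domains[y].remove(vy)
                        changed = True
        return domains

    # A and B must differ, B and C must differ; C has already been fixed to "red".
    domains = {"A": {"red", "green"}, "B": {"red", "green"}, "C": {"red"}}
    constraints = {("A", "B"): allowed_not_equal,
                   ("B", "C"): allowed_not_equal}

    print(filter_domains(domains, constraints))
    # {'A': {'red'}, 'B': {'green'}, 'C': {'red'}} -- the filtering alone has
    # cut the selection space down to a single consistent selection.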
-- -- (7) Use stochastic as opposed to deterministic methods

(Possibly in combination with the other techniques: e.g. non-deterministic constraint propagation.) This sometimes helps where hill-climbing fails because there are too many hills with unsatisfactory peaks. Randomly sampling hills to climb in parallel may give a good chance of finding a high hill. Examples of non-deterministic methods include: simulated annealing, some neural nets, evolutionary mechanisms, and various optimisation techniques.

Where the search landscape is very rugged, i.e. small changes in the combinations of options cause large changes in the value of the resulting combination, hill-climbing is not possible. It then becomes necessary to do some kind of sampling of the space. If it is too unstructured, nothing short of exhaustive search will do. Where the space is not so rugged but there are large numbers of local maxima, hillclimbing will find a solution if you start anywhere on a "good" hill. So sampling starting points from which to climb may also be a good idea in this case. More generally, we can sometimes find a way to "sample" the search space which is not necessarily random. However a random search is more appropriate when there is not enough knowledge about the structure of the space to guarantee that a deterministic sampling strategy will give representative results.

-- -- (8) Weaken the criterion for a solution.

This has already been mentioned under (e) in section (2) above. For instance instead of looking for the best item look for something that is good enough, or better than some threshold. (H.A. Simon called this "satisficing".) This may enlarge the set of solutions enough to turn an intractable search into a tractable one. Moreover, where the search includes storing unexplored options for later consideration, weakening can allow fewer unexplored options to be saved if it is known that rejected options cannot lead to much better solutions than saved ones, thereby turning a search with impossible storage requirements into a feasible one, as in some variants of A* search. (I learnt this from Brian Logan.)

-- -- (9) Divide and conquer

Replacing a single search in a huge space (searching for a structure with a lot of components) with a large number of parallel searches in much smaller spaces (searching for smaller structures). This can be useful if the search is organised in such a way that a global solution emerges from all the parallel searches. For example if the original problem required K*N binary selections to be made, then the search space is of size 2**(K*N). However if this can be replaced by K separate searches each involving only N binary selections, then the total number of nodes to be searched is K*(2**N), which is typically a very much smaller number (e.g. if K = 1000 and N = 10, it can transform a totally intractable search into something quite feasible). (It makes an even bigger difference if the selections are not binary.) Note: this assumes some relatively cheap mechanism is available for combining the separate results.
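A two-line check of the arithmetic above, using the K and N figures just mentioned:

    # Checking the divide and conquer arithmetic with K = 1000 groups of
    # N = 10 binary selections each.
    K, N = 1000, 10
    print(2 ** (K * N))   # one search over all K*N selections: a 3011-digit number
    print(K * 2 ** N)     # K separate searches of 2**N nodes each: 1,024,000 nodes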
Examples of this technique include the following:
    Neural net architectures using "local" adjustment criteria during searching
    Use of simulated "ants" leaving pheromone trails
    Systems that depend on voting mechanisms
    "Swarming" systems (?)
    Some evolutionary searches (coevolution???)
    Use of models of a "market economy" with large numbers of small individuals attempting to optimise their decision making, using payments in a common currency for services or information, in the expectation that this will optimise the global system.

Sometimes it doesn't, as we've often seen in real market economies! (How many big companies which recommend this for a whole country would replace their own corporate planning with an internal market economy?)

Many physical systems can be seen as solving problems this way. E.g. water poured into a container of arbitrary shape solves the problem of arranging molecules so as to minimise total potential energy subject to various constraints, some determined by the shape of the container. Folding of complex molecules is another case. The use of a network made of pieces of string to find the shortest route between two locations is another.

This sort of approach can work well on computers in problems which are inherently decomposable (or nearly decomposable) into separate sub-problems, and where a relatively cheap way of combining solutions exists. However it will fail where the individual search spaces remain large (or infinite, as in the case of searching for sub-proofs in a proof) and where there are very strong interactions between sub-problems. E.g. which options are best for component 256 may depend on which options are best for component 243, which may depend on which options are best for component 575, etc.

Many people seem to think animal brains (including human brains) work entirely by the divide and conquer method because so many complex problems are solved so quickly and the brain clearly uses highly parallel mechanisms. An alternative hypothesis is that brains have large numbers of highly specific mechanisms which use a great deal of pre-compiled information (e.g. about the environment, or the animal's body), with each mechanism able to function well (often without search) in a special class of problems. When a truly novel problem turns up combinatorial search is required. How that's done depends on how much expertise and knowledge the person has in the field (see next para.)

One of the reasons divide and conquer with a utility measure or some sort of market economy or voting system often fails is that the attempt to introduce a uniform currency as the sole or main method of communication between components is inadequate because it loses information. E.g. using a numeric evaluation function in A* search doesn't help you understand why the search fails when it fails. If instead of simply getting a numerical fitness or utility value back you got a structured description of inadequate solutions, you might be in a better position to change your requirements and start again, perhaps building on some of the better failures. (This is frequently rediscovered by people who try to solve complex problems using utility measures, fitness measures, evaluation functions, and then fail. For some recent work on how to use a non-numeric multi-dimensional evaluation function see [5] and [6].)

-- -- (10) Use domain-specific knowledge.

An approach which often works when others fail is to use domain specific knowledge to tailor a representation, algorithm or architecture to the properties of a domain. Animal visual systems are an example, closely tailored to the structure of 2-D images and 3-D structure and motion. Many commercial expert systems are highly knowledge-based. There are many different ways domain knowledge can be stored and used. E.g. large collections of well indexed condition-action rules or logical axioms are common in expert systems. By contrast some neural nets store large numbers of points in a uniformly structured solution space with a convenient metric, along with fast parallel methods for finding the nearest known neighbour to a new problem point. In some cases an association mechanism can use interpolation to generate good solutions for new intermediate points.
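An illustrative sketch (with invented toy data) of the associative idea just described: stored problem/solution points, a simple metric, nearest-neighbour recall, and interpolation for new intermediate points. This is a serial sketch of the idea only, not a model of the parallel mechanisms mentioned above.

    # Illustrative associative memory: known (problem, solution) points in a
    # space with a metric; new problems answered from the nearest neighbours.

    import math

    # Toy memory: problem described by two numeric features -> stored solution value.
    MEMORY = [((0.0, 0.0), 1.0), ((1.0, 0.0), 3.0), ((0.0, 1.0), 5.0), ((1.0, 1.0), 7.0)]

    def recall(problem, k=2):
        """Distance-weighted interpolation over the k nearest stored points."""
        ranked = sorted(MEMORY, key=lambda item: math.dist(problem, item[0]))[:k]
        if math.dist(problem, ranked[0][0]) == 0:
            return ranked[0][1]                      # an old problem, stored exactly
        weights = [1.0 / math.dist(problem, p) for p, _ in ranked]
        return sum(w * s for w, (_, s) in zip(weights, ranked)) / sum(weights)

    print(recall((0.0, 0.0)))   # 1.0: an old problem, recalled directly
    print(recall((0.5, 0.0)))   # 2.0: a new intermediate problem, interpolated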
In general, however, even considerable expertise and prior knowledge does not rule out the need for creative searching in a combinatorial space. I am sure Mozart, Beethoven, Shakespeare, Newton, Einstein, etc. all did that some of the time. Beethoven's drafts for some of his compositions show that he did.

-- -- (11) Use chemical DNA-based machines to achieve enormous parallelism

The August 1998 issue of Scientific American has an article [7] showing how chemical mechanisms based on reproducing, joining and splicing DNA sequences can be used to solve certain combinatoric problems in a reasonable time by making use of very large numbers of parallel processes in a chemical soup. No doubt this technique will be developed for use on more and more complex problems, as it promises to handle within days or weeks problems that might take months or years on computers. The really hard task is to transform a problem into a form which can be addressed in this way.

-- -- (12) Use human or human-like spatial/visual reasoning abilities

An ant finding its way back to its nest in long grass has to use either local clues or a record of its path in order to select a direction in which to move at each moment in time. A human who has a view of the region from a high vantage point, or who has a good map, can normally take in the structure of the terrain, see where the main obstacles are, which valleys, paths or roads could be joined to form a good route, and often, with hardly any searching at all, select a good path.

This human capability, though very powerful, is subject to real limits. If the terrain consists of a complex maze (as in maze puzzles in children's books) it may be impossible to see a good route without a considerable amount of searching and backtracking.

The fact that humans have powerful spatial reasoning abilities has often been noticed. For many interesting examples in mathematical reasoning see Nelsen's book [8]. It is often claimed that work in AI is hampered by the fact that as yet machines do not have such abilities, and many people have attempted to implement subsets. (I made this claim at my first AI conference in 1971, but later found that it had been said before. The idea is frequently reinvented.) For some discussion of what the claim means and some examples see [9] and [10]. For an interesting recent attempt to automate diagrammatic reasoning in connection with arithmetic see [11].

It has so far proved very difficult to give machines human-like spatial reasoning capabilities, for reasons that are obviously closely related to the fact that machine vision is currently nowhere near human or animal vision in power and flexibility, despite successes in particular classes of visual tasks (controlling a device, recognising faces, visual diagnosis of skin lesions). The reasons for these limitations are too complex to be discussed here.

However, it should be remembered that even if we managed to analyse and replicate this human spatial ability its limitations may remain in machine form. I.e. there may be deep reasons why beyond a certain type of complexity, solutions to mazes and other visually presented problems cannot simply be "seen", so that some form of searching will then have to be used. As suggested by the history of mathematics there may be special cases where this can be avoided by relocating the problem in a more abstract space where the structure is more visually perspicuous. (An example, involving switching between metrical and topological representations, is discussed in [10].)
-- GENERALITY vs DOMAIN SPECIFICITY (domain specific vs topic neutral)

During the first 30 or so years of the life of AI, and especially during the 70s and early 80s, there was an enduring tension between those who sought *general purpose* solutions to problems and those who believed that only by using *specific* knowledge about particular problem domains could truly powerful systems be produced (e.g. planners, problem solvers, language understanders, image interpreters -- all of which are, in different ways, confronted with search problems). (Remember GPS, the "General Problem Solver", by Newell, Shaw and Simon, and all the general purpose planners and logic based problem solvers, etc.?)

For instance, for some time the Stanford AI department, largely influenced by John McCarthy, was associated with the search for general methods, including general theorem provers and logic based planners. Likewise, the work by Allen Newell and Herbert Simon at Carnegie Mellon University (e.g. work by Newell, Shaw and Simon on the "General Problem Solver") addressed general issues, at least in the 1960's and 1970's. By contrast the MIT AI department, led by Marvin Minsky and Seymour Papert, sought power in domain specific knowledge. There were similar divisions in the UK (e.g. Edinburgh and Imperial College vs Sussex University and others). Of course the geographical divisions were not rigid: Feigenbaum at Stanford liked the slogan "The power is in the knowledge" and there were people at MIT who worked on logic.

The contrast was also associated with a division between "neats" and "scruffies". The neat people (logicians, generalists) hoped for powerful and elegant general methods, forms of representation, mechanisms. The scruffies (including Roger Schank's team, first at Yale, then at Northwestern University) thought that AI systems, forms of representation, architectures etc. would have to be complex and messy to have enough power to deal with the real world, as opposed to toy problems.

I believe Chris Mellish, either during or shortly after working on his PhD in Edinburgh, coined a phrase which is very relevant. He noticed that when a domain specific technique has been found to work, there is some general re-usable structure in the problem and the solution which can be abstracted from the particular subject matter. That structure could then be described as "domain specific but topic neutral". (I am writing this from memory. I don't have any references.)

This seems to me to be very important: however scruffy effective domain specific knowledge is, what makes it work is something about the ontology of the domain and the structure of the problems, and it should be possible to express the relationship between the problem solving knowledge, the domain to which it applies, and the problems it is useful for, in a manner which at least in principle forms a pattern which could be applied elsewhere.

Eventually everyone, on both sides of the debate, came to realise that there was a tradeoff between generality and power. Given any really general technique for solving a class of problems, it was often, if not always, the case that for any subclass with common features there existed a more specific technique which was better, e.g. it worked faster, or required less memory, or was more accurate, or found better solutions, etc.
The relevance to the present discussion is this: ALL of the following are essentially very general methods:
    logical theorem proving
    neural nets
    constraint-manipulation
    genetic algorithms
    classifier systems
    etc.

In order to make any of these techniques work well on a particular hard problem domain it is normally important to analyse the intricacies of that domain and tailor the method to the domain, e.g. by finding a good representation that captures important features of the domain. It may then turn out that once that work has been done, whatever was learnt about the domain can equally well be used in connection with other methods. So it may be that proponents of the different methods have to re-invent each other's wheels to make their methods work well in certain domains.

That suggests that it may be useful to seek ways of classifying types of problems or problem domains in a manner which is independent of whether you plan to use theorem provers, neural nets, or genetic algorithms to solve the problems. If that can't be done, that will be an important fact about how knowledge has to be structured. It may be that even though there is domain-specific but topic-neutral knowledge, it is not method-neutral.

A related point is that parts of the world may, as a matter of fact, have constraints that restrict the problems that can arise and the possible solutions to those problems. If those constraints can be discovered and understood they could be deployed in specialised problem-solving machinery or search strategies. For instance if a visual system is for use in a world where most physical objects do NOT have surfaces covered everywhere with vast discontinuities, then it can exploit the fact that most surfaces are mostly continuous in controlling its search for ways of segmenting and analysing images. Without that constraint the space of possible interpretations of any particular image might be many orders of magnitude larger. Similarly a planner working in a state of uncertainty about aspects of the environment can use some very general assumptions, including assumptions about frequencies or probabilities of occurrences of certain classes of events, to consider a vastly reduced set of contingencies.

In some previous papers (e.g. [12]) I have discussed dimensions in which an environment could be more or less cognitively friendly. If an organism or robot can detect and make use of that friendliness, it could benefit enormously in reduced computational requirements. Evolution seems to be able to find and use many kinds of cognitive friendliness suited to different organisms. In some cases it actually produces cognitive friendliness, through co-evolution, e.g. features of flowers which help pollinating insects to find and land on them. How evolution finds the mechanisms which can make use of cognitive friendliness is another matter.

-- "LAWS OF FORM" MAY DEFEAT COMBINATORICS

As far as evolutionary search is concerned, I conjecture that there is a deep point, which is hard to make clear, which pervades work by Stuart Kauffman [2], two recent books by Ian Stewart and Jack Cohen ([13] [14]), as well as earlier work by the biologist Brian Goodwin. Brian Goodwin (whom I used to know and talk to at Sussex University) proposed long ago that there are "laws of form" which constrain what can evolve and how organisms can develop. I think he thought of these as mostly arising out of the properties of space and time (e.g. what kinds of folding of tissue are possible in 3-D space).
(I'll try to find a reference to published work by Goodwin. See http://www.gn.apc.org/schumachercollege ) These eternal (mathematical) laws of form may constrain possible organisms to a far higher degree than the dynamics of natural selection operating on randomly generated genetic structures. The conjecture is that within the space of possible trajectories available for evolutionary and developmental processes there are constraints which limit the combinatorics to such an extent that the normal emphasis on repeated selection among randomly produced items as the explanation of how complex biological organisms developed is misleading, since that alone could never have defeated the combinatorics. This idea has been generalised by Kauffman and others (e.g. Cohen and Stewart) in the idea that "order for free" arises inevitably out of the combinatorics of certain large classes of structures. Many of the examples Kauffman gives arise from connectivity properties of graphs. E.g. he claims that a very high (surprisingly high) proportion of a certain class of randomly generated dynamical systems turn out to have a small number of basins of attraction which exhaust nearly all of the state space. In that case finding and comparing all the attractors could be relatively easy. (Though convergence rates are also important.) This leads to the notion that trajectories that are possible in biological evolution are all subject to "laws of form" which provide "order for free". These laws can have different forms. E.g. some determine which structures and processes are possible or impossible, some determine probability distributions in classes of structures based on particular kinds of building blocks. I am particularly interested in those which determine the tradeoffs relevant to different designs which fit the same niche. (More on this below.) Such considerations led Kauffman ([2], page 150) to the claim that although most biologists believe that "selection is the only source of order", this view is "deeply inadequate" though it has "much to commend it". Where such laws of form constrain what is possible (or probable) within a combinatorial space it may be the case that a directed search using a selection mechanism suffices to ensure that good solutions can be found in a reasonable time. (What "reasonable" means will be context dependent.) However this may not be as simple as it seems, for instance if different selection criteria and different selection mechanisms are needed in different parts of the space. Natural evolution may achieve this in ways that are not easy to replicate in artificial evolution, for instance if, as I suspect, the process depends crucially on coevolution of many different species of organisms. For instance this may be a requirement both for guidance of the search (different species "guide" one another through increasingly difficult problems, just as a good teacher guides a pupil), and for adequate sampling of the space of possible evolutionary trajectories in a reasonable time. (More on this below.) -- DIFFERENT TYPES OF LAWS OF FORM It is worth noting that there are different kinds of laws of form of different degrees of generality and abstractness. Some very general laws relate to the properties of systems with certain classes of "small" building blocks which can be assembled in a very wide range of structures. 
Examples include: Kauffman's arguments about the high probability of autocatalytic networks given some very general assumptions about a type of chemistry; the probabilities associated with mendelian inheritance; the incompleteness and complexity/incompressibility results concerning very general classes of computations.

I conjecture that there is also a vast collection of more specific laws of form concerning "intermediate level" more specialised architectures and processes waiting to be studied, some of which I expect we'll find have already been "discovered" by evolution. Examples might be laws about tradeoffs between different designs in the contexts of different sorts of niches. (AI researchers are beginning to understand tradeoffs between reactive and deliberative architectures, and conditions under which both separately, or some combination of the two, can be useful.) These tradeoffs are closely related to what first rate engineers learn from experience: e.g. when a software engineer learns that for a certain type of problem there is a choice between a solution which maximises speed but is rigid and hard to extend and a solution which runs more slowly but is much easier to maintain and extend. Although such things may be learnt through experience, once fully analysed they can sometimes be seen to have a deep mathematical basis. (Complexity theory is an example.)

Other laws might be concerned with mechanisms of evolvability, i.e. ways in which large strides can be made in a search space. Many engineers have discovered the importance of reuse: i.e. copying a solution to an old problem and modifying it to solve a new one versus trying to build the new solution from scratch. It seems that at some point biological evolution also "discovered" this. Was that an inevitable or highly probable feature of any physically possible evolutionary process in a certain type of environment, or was it just a lucky result of some accidental co-occurrence of events and circumstances?

As Kauffman repeatedly notes, there may be constraints on biological evolution and development which are not just empirical generalisations about how biological mechanisms work on earth, but follow from mathematical principles (some of which we shall not understand properly until we have developed new kinds of mathematics).

-- INTERACTING TRAJECTORIES IN "DESIGN SPACE" AND "NICHE SPACE"

I conjecture that there are similar interesting constraints on interacting trajectories in design space and niche space in systems where there are collections of co-evolving organisms (natural or artificial). See [1], and related papers referred to therein.

Design space is a space of possible designs for behaving systems. Its structure is very rich, complex, full of large and small discontinuities, and extremely non-homogeneous: the topology of a neighbourhood can vary enormously from one part of design space to another, or from one dimension to another. Niche space is less familiar: a niche is a set of requirements or constraints which may or may not be satisfied to varying degrees by different designs. The notion of niche space is an attempt to unify some ideas from biology and from engineering.

Trajectories are possible both in niche space and in design space.
There are
    i-trajectories, possible for adaptive or self-modifying individuals,
    e-trajectories, possible only through evolutionary change across generations of individuals,
    r-trajectories, possible only when some external agent "repairs" or modifies a design (by changing an instance of that design),
    c-trajectories, which depend on cultural mechanisms,
and no doubt many more classes of trajectories waiting to be discovered and analysed. The mechanisms guiding trajectories, and the possible structures of trajectories, are different in different parts of the spaces: as stated above, the spaces are not homogeneous.

It is very important that the mapping between a design and a niche is typically not a single fitness value but something more like a vector of values (hence the importance of tradeoffs between different values). This may be a basis for bifurcation of trajectories, e.g. where different paths improve different components of the fitness vector. In coevolution, organisms which change alter the niches for other organisms. Hence there can be all sorts of complex positive and negative feedback loops between trajectories in design space and in niche space.

-- DESIGN SPACE AND NICHE SPACE VS COUPLED FITNESS LANDSCAPES

After working on these ideas for some time I was directed to Kauffman's work by a research student at Birmingham, Ian Millington, who helped me to understand the relationships to Kauffman's work. He pointed out that Kauffman's discussion of co-evolution (e.g. [2], page 221ff) involves "coupled fitness landscapes", where the fitness landscape for each species is continually deformed by changes in others. This assumes that each design has one fitness value, which may be continually changing. Ian suggests that the analysis of collections of trajectories in static niche spaces and design spaces with mappings between them (e.g. defined by fitness functions whose results are vectors of fitness values) may prove more fruitful, or at least easier, than the attempt to understand mutual interactions between continually deforming fitness landscapes. Whether this is so remains to be seen.

One important point, if I understand Kauffman, is that on his view the only things that can change the fitness landscape for an organism are external changes. However, if you think of an organism developing or evolving from one sort of design to another, then that in itself can change its requirements, and therefore its niche. E.g. a new behaviour can change energy requirements. This relates to the fact that an organism's niche is not determined solely by its environment: two organisms in the same physical location can have quite different niches. Rather some aspects of a design determine requirements for other aspects. I.e. given an environment (including other organisms) what determines a niche is the organism itself. This is related to the analysis of the concept of a "need". Among the things we need to understand are what sorts of discontinuities there are in niche space and design space when organisms first begin to understand their own needs (How does that happen?) and so appreciate at least some of the relationships between their designs and their niches.

-- VIRTUAL MACHINES IN THE BIOSPHERE

I conjecture that we can usefully think of such trajectories in niche space and design space as defining possible causal interactions at different scales and different levels of abstraction in virtual machines in the biosphere.
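Returning to the point above that the mapping between a design and a niche is typically a vector of values rather than a single fitness number, a small invented sketch shows why this matters: with vectors some pairs of designs are simply incomparable, each better on a different component, which is where tradeoffs (and bifurcating improvement paths) come from.

    # Illustrative sketch: comparing designs by a vector of values for a
    # niche instead of a single scalar fitness. The component names and
    # numbers are invented for illustration only.

    def dominates(a, b):
        """Design a dominates design b if it is at least as good on every
        component of the requirement vector and strictly better on one."""
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    # Invented requirement components: (speed, robustness, extendability)
    designs = {
        "fast_but_rigid":   (9, 4, 2),
        "slow_but_modular": (3, 5, 8),
        "poor_all_round":   (2, 3, 1),
    }

    for name_a, vec_a in designs.items():
        for name_b, vec_b in designs.items():
            if dominates(vec_a, vec_b):
                print(name_a, "dominates", name_b)
    # Both of the first two designs dominate "poor_all_round", but neither
    # dominates the other: a single scalar fitness would hide that tradeoff.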
I suspect that when we understand all this, including the evolution of mechanisms which influence evolvability, it may help us to see that biological evolution is in some ways far more like a process of design than either darwinians or creationists ever supposed. (Compare Dennett's comments in [15].) (Sexual selection in organisms with explicit knowledge of their own needs and how to serve them is one obvious mechanism for this, and cultural influences can provide another, but I think there are more subtle and older processes, e.g. constraints on the development and use of evolutionary mechanisms that allowed a sub-mechanism to be copied then modified. Compare the importance of reuse in engineering design.)

Anyhow, within this framework, as in Kauffman's, natural evolution is NOT a blind combinatorial search, nor even a search directed simply by selection, but is highly constrained by "laws of form" which determine which trajectories are possible and which types of interactions between trajectories are possible (or probable).

-- EXAMPLE OF A TRADEOFF: HELPLESS VS COMPETENT INFANTS

Some mammals which need to be able to run with the herd very soon after birth (horses, deer) are born with considerable perceptual and motor competence providing independence, presumably largely genetically determined (though maybe with some fine tuning prior to and immediately after birth?). Many others (e.g. hunting animals, monkeys, primates, humans) are born in a very helpless form; they can take days, weeks or months in which processes of groping around, feeding, playing, fighting, competing for food, etc. apparently produce or aid the development of competence.

Perhaps there is a very important tradeoff between (a) the kind of grasp of space and motion which can be encoded in genes and (b) the kind which can be developed through a process of learning and adaptation by an articulated individual exploring the vast array of structures in the environment and movements of which it is capable and which it can observe. It may be that because of this, horses, deer and other animals born able to run with the herd can never perceive and understand the same intricate variety of spatial structures and motions as those whose brains work things out through interacting with the environment during early (helpless) infancy.

I predict that a similar difference can be found between altricial and precocial types of birds. The former are hatched very young, physically immature and dependent on parents. The latter are hatched at a later stage, when they are more physically mature, and are less dependent immediately after hatching. The prediction is that the former will have a better grasp of spatial structure and motion and use this in performing more complex tasks than the latter, e.g. building nests in trees, flying through trees, catching fast moving insects, animals or fish, etc. The latter would mainly feed by pecking at static readily available food, live on the ground or mainly on water, etc.

Understanding such tradeoffs may give us a better insight into why humans (and many other animals) are born helpless with underdeveloped brains, than the notion that bigger neonate brains would require a larger female pelvis. (An elephant pelvis is pretty big.)

There are many other important tradeoffs, e.g. those involved in the development of general deliberative capabilities (which require smaller memory stores and perhaps shorter periods of evolution than purely reactive capabilities using large collections of stored behaviours).
Another interesting tradeoff involves the development of types of internal self-monitoring mechanisms. Another is the tradeoff between distributed control and centralised global control. When we have understood how these tradeoffs work in different contexts we'll see that debates about which options are "better" are silly (e.g. debates about "traditional" vs "new" AI).

(To be continued, perhaps, or at least revised and improved.)

-- REFERENCES

[1] A draft paper (in postscript format) is available in the Cognition and Affect project directory (see also [1a]):
Aaron Sloman, Design Spaces, Niche Spaces and the ``Hard'' Problem,
http://www.cs.bham.ac.uk/research/cogaff/Sloman.design.and.niche.spaces.ps
(NOTE: this paper is likely to be revised and extended.)

[1a] A more recent paper, extending the ideas:
A. Sloman, The ``Semantics'' of Evolution: Trajectories and Trade-offs in Design Space and Niche Space, in Ed. Helder Coelho, Progress in Artificial Intelligence, 6th Iberoamerican Conference on AI (IBERAMIA), Lisbon, October 1998, pp. 27--38, Springer, Lecture Notes in Artificial Intelligence.
Also at: http://www.cs.bham.ac.uk/research/cogaff/Sloman_iberamia.ps (and .pdf)

[2] Stuart Kauffman, At home in the universe: The search for laws of complexity, Penguin Books, London, 1995.

[3] Brian Logan and Riccardo Poli, `Route planning in the space of complete plans', in Proceedings of the 15th Workshop of the UK Planning and Scheduling Special Interest Group, 21-22 November 1996, Liverpool John Moores University, Liverpool, pp 233-240 (also available as University of Birmingham School of Computer Science technical report CSRP-97-18).
Also Brian Logan and Riccardo Poli, `Route planning with GA*', CSRP-97-17, University of Birmingham School of Computer Science, 1997.
Both the above are available in ftp://ftp.cs.bham.ac.uk/pub/tech-reports/1997/

[4] Riccardo Poli, Mark Ryan and Aaron Sloman, A New Continuous Propositional Logic, in Progress in Artificial Intelligence, eds. C. Pinto-Ferreira and N. J. Mamede, Lecture Notes in Artificial Intelligence 990, Springer Verlag, 1995.

[5] Brian Logan and Aaron Sloman, Qualitative Decision Support using Prioritised Soft Constraints, CSRP-98-14, University of Birmingham School of Computer Science, 1998.
ftp://ftp.cs.bham.ac.uk/pub/tech-reports/1998/CSRP-98-14.ps.gz

[6] B. S. Logan and N. Alechina, A* with bounded costs, in Proceedings of the 15th National Conference on Artificial Intelligence (AAAI-98), Madison, USA, 1998. Also University of Birmingham School of Computer Science technical report CSRP-98-09.

[7] Leonard M. Adleman, Computing with DNA, Scientific American, August 1998, 279, 2, pp. 34--41.

[8] Roger B. Nelsen, Proofs without Words: Exercises in Visual Thinking, Mathematical Association of America, Washington DC, 1993.

[9] Aaron Sloman, Musings on the roles of logical and non-logical representations in intelligence, in eds Janice Glasgow, Hari Narayanan and Chandrasekaran, Diagrammatic Reasoning: Computational and Cognitive Perspectives, MIT Press, 1995, pp. 7--33.
The paper is accessible at http://www.cs.bham.ac.uk/research/cogaff/Aaron.Sloman_musings.ps.gz

[10] Aaron Sloman, Diagrams in the mind? Invited paper, Thinking With Diagrams Conference (TW98), Aberystwyth, August 1998, eds T. Addis, N. Birch, A. Blackwell, P. Brna, P. Cheng, R. Cox, D. Gooding, P. Olivier, M. Scaife, K. Stenning, K. van Rijsbergen.
The paper is accessible at http://www.cs.bham.ac.uk/research/cogaff/Sloman.twd98.ps (also pdf version)

[11] Mateja Jamnik, Alan Bundy and Ian Green, On Automating Diagrammatic Proofs of Arithmetic Arguments, in Proceedings of the International Joint Conference on AI, 1997.

[12] A. Sloman, On designing a visual system (Towards a Gibsonian computational model of vision), Journal of Experimental and Theoretical AI, 1, 4, pp. 289--337, 1989.
The paper is accessible at http://www.cs.bham.ac.uk/research/cogaff/Aaron.Sloman_vision.design.ps.gz

[13] Jack Cohen and Ian Stewart, The collapse of chaos, Penguin Books, New York, 1994.

[14] Ian Stewart and Jack Cohen, Figments of Reality: The Evolution of the Curious Mind, Cambridge University Press, Cambridge, 1997.

[15] Daniel C. Dennett, Darwin's Dangerous Idea: Evolution and the meanings of life, Penguin Books, London, 1995. See chapter 6, and the section on Kauffman in chapter 8.

-- ACKNOWLEDGEMENTS

My thinking on these topics has benefited from interactions with colleagues and students at Birmingham: Riccardo Poli, Mark Ryan, Brian Logan, Manfred Kerber, William Langton, Chris Complin, Ian Millington, and others. Paul Marrow at BT drew my attention to the distinction between altricial and precocial species.

After writing the above I became aware of partly similar thinking in a paper by Steve Grand (Cyberlife). See
http://www.cyberlife.co.uk/steve/battling_with_ga-joe.htm