These notes were written in response to discussions arising out of a
workshop on intelligent robotics at BT in April 1998. See
http://www.labs.bt.com/people/mulhaug/docs/currentwork/workshop/workshop.htm
DRAFT
SOME NOTES ON COMBINATORIAL SEARCH AND HOW (NOT?) TO TAME IT
Aaron Sloman
Last modified 5 Feb 2000
CONTENTS:
-- WHAT IS COMBINATORIAL SEARCH?
-- COMBINATORIAL AND NON-COMBINATORIAL SPACES
-- SETS OF RELATED COMBINATORIAL SPACES
-- HOW (NOT?) TO TAME COMBINATORIAL SEARCH
-- -- (1) Direct the search towards the required solution
-- -- (2) Find ways to transform a search space into a smaller one,
-- -- (3) Find a new powerful form of representation,
-- -- (4) Find a transformation to add new structure enabling hill-climbing
-- -- (5) Find a clever algorithm which computes the result without search
-- -- (6) Use constraint manipulation techniques
-- -- (7) Use stochastic as opposed to deterministic methods
-- -- (8) Weaken the criterion for a solution.
-- -- (9) Divide and conquer
-- -- (10) Use domain-specific knowledge.
-- -- (11) Use chemical DNA-based machines to achieve enormous parallelism
-- -- (12) Use human or human-like spatial/visual reasoning abilities
-- GENERALITY vs DOMAIN SPECIFICITY (domain specific vs topic neutral)
-- "LAWS OF FORM" MAY DEFEAT COMBINATORICS
-- DIFFERENT TYPES OF LAWS OF FORM
-- INTERACTING TRAJECTORIES IN "DESIGN SPACE" AND "NICHE SPACE"
-- DESIGN SPACE AND NICHE SPACE VS COUPLED FITNESS LANDSCAPES
-- VIRTUAL MACHINES IN THE BIOSPHERE
-- EXAMPLE OF A TRADEOFF: HELPLESS VS COMPETENT INFANTS
-- REFERENCES
-- ACKNOWLEDGEMENTS
-- WHAT IS COMBINATORIAL SEARCH?
The phrase "combinatorial search" refers to search in a space defined by
possible combinations of components in a structure (linear or
non-linear). I.e. it is search in a combinatorial space. Items within
the space may vary according to their structure, or according to the
values of components (e.g. weights, parameters) within a structure, or
both.
How a space is searched is a separate question from whether it is
combinatorial or not. Many search strategies are possible in ANY search
space.
-- COMBINATORIAL AND NON-COMBINATORIAL SPACES
A non-combinatorial search would be a linear search through a fixed
store of data items, such as a set of book titles.
In some cases the fixed store might have the same contents as a
combinatorial space, e.g. the set of all book titles of up to six words
that can be generated by a simple grammar. However, typically a
pre-existing store can be preprocessed and embedded in a mechanism which
facilitates search so that answers to questions are found at great speed
(e.g. using database technology), depending on the nature of the
question. E.g. if the question is "is there an item matching pattern P"
the indexing may give a direct and immediate answer, if it was designed
to anticipate such patterns. However, if the question is "find the item
which gives the highest value when function F is applied to it", the
index may not help, in which case searching through items in the set
will be required. Whether the search has to be exhaustive will
depend on the structure of the set and the nature of the function.
In general the items in a combinatorial search do not all exist in some
database or other indexed structure in advance of the search; they are
created during the search. Example tasks are:
find a set of prime numbers less than 50 which add up to 100,
find a program which discovers maximal cliques in binary
graphs.
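For instance, the first task can be attacked by brute-force
enumeration. A minimal Python sketch follows (the 15 primes below 50
give 2**15 = 32768 candidate subsets, so exhaustive search is still
feasible here; it quickly ceases to be as the bound grows):

    from itertools import combinations

    primes = [p for p in range(2, 50)
              if all(p % d for d in range(2, p))]   # the 15 primes below 50

    # brute-force search through all subsets, smallest first
    solution = next(s for r in range(1, len(primes) + 1)
                    for s in combinations(primes, r)
                    if sum(s) == 100)
    print(solution)    # e.g. (3, 7, 43, 47)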
Combinatorial search seems to characterise biological evolution.
For instance new DNA combinations, new morphologies for organisms, new
combinations of behavioural competence, new types of information
processing competence, were all produced during biological evolution.
Evolution involves a vast combinatorial search through a number of
different spaces. (See my discussion of design space and niche space in
[1].)
Combinatorial spaces typically have a size which is an exponential (or
worse) function of the number of components of the structure (whose size
need not be bounded in advance). For example if the solution to a
problem requires N decisions involving selection between two items then
the size of the space is 2**N. If the size of the solution is not known
in advance the actual space through which search processes can travel
may be very much larger, or even infinite. E.g. when searching for a
proof of a mathematical theorem you do not normally know in advance how
long the proof is likely to be, and since proofs (in predicate calculus)
can be arbitrarily long the search space is infinite.
Searching for a value in a single continuous dimension involves a
different kind of infinite search, since continuous sets are infinite,
even in a bounded interval. But (paradoxically?) this kind of search is
often (though not always) much easier to deal with because of the nice
mathematical properties of the set of real numbers and some of the
functions on that set. (E.g. if a function is known to have a single
maximum then simple hillclimbing starting anywhere will find it.)
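As an illustration, here is a minimal sketch of such hillclimbing for a
function of one variable, assuming a single maximum (the function and
the step-halving schedule are arbitrary choices for the example):

    def hill_climb(f, x=0.0, step=1.0, tol=1e-6):
        # move uphill; when neither direction improves, halve the step
        while step > tol:
            if f(x + step) > f(x):
                x += step
            elif f(x - step) > f(x):
                x -= step
            else:
                step /= 2
        return x

    print(hill_climb(lambda x: -(x - 3.0) ** 2))   # converges near 3.0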
In more complex search contexts a lot of effort has to go into trying to
tame the combinatorics. This can be done in several different ways. A
first draft list of strategies for taming search is given below. It is
certainly not complete and can probably be better structured. Moreover,
since it reflects information I have picked up over many years in many
contexts, I cannot easily provide references to examples, and would
welcome help in producing an annotated bibliography.
-- SETS OF RELATED COMBINATORIAL SPACES
A preliminary consideration is that there are often different ways of
looking at a problem involving different search spaces.
In general combinatorial spaces are closely related to other
combinatorial spaces generated in different ways.
E.g. consider a grammar G and lexicon L, and the set S1 of all
sentences of at most N items of L generated by G.
There is one space consisting of the set S1, and another S2 consisting
of the set of selections of rules in G and items in L to generate the
sentences in S1. (I.e. the space of sentences and the space of
derivations are closely connected).
A structurally ambiguous sentence might be generated in more than one
way, e.g. "They saw the man with the telescope" can specify WHICH man
was seen (the one with the telescope) or HOW he was seen (with the
telescope). Here elements of S1 (sequences of words) and elements of S2
(derivations of sentences) are related, e.g. because each element of S2
might be a tree whose "fringe" is an element of S1. Since different
trees can have the same fringe, a search where structure is important
may involve a much larger space S2, but where it is unimportant, the
search can sometimes be reduced by focusing on S1.
In general there will be many combinations of lexical items which are
not legal sentences. So S1 will in general be a subset of S3, the set of
all possible N element sequences of items from L. Finding a way to focus
on S1 without having to explore S3 can be very difficult, except by
using S2, which may be smaller than S3, even though it is larger than
S1.
Likewise if lexical items are composed of characters from an alphabet
(including spaces) then each of S1 and S3 will determine a set S4 of
sequences of characters. However if N is the length of the longest
sequence in S4 then
there will be many sequences of characters of that length which contain
non-words. I.e. if S5 is the set of sequences of characters of length N,
then S5 will be a superset of S4.
Given the existence of all these sets, searching for a sentence
satisfying some set of conditions could in principle be done in any of
these spaces, though it would normally be silly to search in S3, S4, or
S5, because of the cost of eliminating large numbers of redundant items
by showing that they are not legal according to G and L.
Nevertheless the fact that a particular combinatorial space will
normally be mathematically related to many other spaces can be the basis
of techniques for transforming problems so as to reduce search.
-- HOW (NOT?) TO TAME COMBINATORIAL SEARCH
There are several different types of strategies for taming
combinatorics, all generally requiring design creativity. Here are some
examples. In general each technique works only for a restricted set of
problems. Finding which technique is best for a particular problem or
class of problems is itself a very difficult combinatorial search, and
there may be no general technique for doing it! (Perhaps there is some
kind of "diagonal" proof of impossibility?)
Note that the strategies listed here are not all mutually exclusive.
Sometimes a particular technique can be viewed in two different ways.
-- -- (1) Direct the search towards the required solution
This is one of the oldest and most widely used ideas in AI (in the form
of evaluation functions, search heuristics, evolutionary fitness
functions, etc.).
For instance, given a road map and the problem of finding a route from
town A to town B, one is searching in a space of partial routes. By
growing routes from A by always choosing the option closest to the
direction towards B it is sometimes possible to find a good route
without wasting time on irrelevant roads, using the metrical properties
of 2-D space to guide the route.
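Here is a minimal sketch of such directed (greedy best-first) search,
on an invented map where each town has 2-D coordinates and partial
routes are expanded in order of straight-line distance to the goal:

    import heapq

    # hypothetical map: coordinates of towns and the roads between them
    coords = {'A': (0, 0), 'C': (1, 1), 'D': (1, -1), 'B': (2, 0)}
    roads = {'A': ['C', 'D'], 'C': ['A', 'B'],
             'D': ['A', 'B'], 'B': ['C', 'D']}

    def dist(a, b):
        (x1, y1), (x2, y2) = coords[a], coords[b]
        return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5

    def greedy_route(start, goal):
        # always extend the partial route whose end is closest to the goal
        frontier = [(dist(start, goal), start, [start])]
        visited = set()
        while frontier:
            _, town, route = heapq.heappop(frontier)
            if town == goal:
                return route
            if town in visited:
                continue
            visited.add(town)
            for nxt in roads[town]:
                if nxt not in visited:
                    heapq.heappush(frontier,
                                   (dist(nxt, goal), nxt, route + [nxt]))
        return None   # no route exists

    print(greedy_route('A', 'B'))   # ['A', 'C', 'B']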
However if there are many dead-ends (e.g. due to impassable mountain
ranges, rivers, etc.) then this technique can fail. In many cases
directed search fails because the choices selected as best by the
evaluation often lead either to dead ends requiring a great deal of
backtracking, or else down an infinite path.
It is frequently rediscovered that simply directing a search doesn't
make the problem tractable except in special cases, e.g. in search spaces
where good solutions have very large basins of attraction, or where good
solutions with smaller basins of attraction are densely distributed,
etc., or where a useful starting point happens to be chosen by luck.
Even where there is a "good" route from some starting point to a solution,
without any dead ends requiring backtracking, if the space is large
enough then the good route might be too long to be traversed in the time
available -- e.g. a route in a very large space spiralling ever closer
to a solution might have to traverse most of the points in the space,
despite always getting closer to the goal.
So making combinatorial search *directed* is far from an adequate answer
except in special cases. It is sometimes argued (e.g. by Kauffman in
[2]) that Darwinian selection mechanisms, which essentially do
hill-climbing in a space where neighbours are found by "simple" genetic
changes such as mutation and crossover, would have very little chance of
achieving organisms which now exist in the time available since the
earth was formed, unless additional structuring principles were involved
in controlling the search.
Combining directed search with other techniques (e.g. those listed
below) often helps, but different techniques are needed in different
classes of problems. Moreover finding a good way to direct the search
(e.g. finding a good evaluation function or fitness function) itself
often requires detailed analysis of the problem and a high degree of
creativity.
-- -- (2) Find ways to transform a search space into a smaller one,
to reduce the combinatorics.
This may involve any of a variety of techniques:
(a) Folding of one part of the space into another, if structures
are repeated, or where differences are irrelevant, or where a
solution in one part can easily be used to derive a solution in
another part, e.g. using symmetries. (This can include the use of
memo-functions which remember re-usable partial solutions for
sub-spaces, sometimes also referred to as "structure sharing"; a
minimal sketch follows this list).
(b) Pruning the space by selecting branches or sub-spaces which are
known to be adequate to the task or which are representative of the
sub-space discarded (sometimes called "refinement"),
(c) Searching at a higher level of abstraction if the more abstract
solution is sufficient for the task. E.g. replacing individual
nodes in a search space with sets of nodes which are in some sense
equivalent, and treating such a set as a node in a more abstract
space, which will therefore be smaller. Instead of using a
space of sets containing many nodes (extensional abstraction) it may
be possible to use descriptions which are applicable to many nodes
(intensional abstraction). Where sets are characterised by
common predicates, the former becomes a case of the latter.
(d) Using the result of a more abstract search to guide (prune) the
search in the original space (e.g. using proof plans in theorem
proving, or using the straight line joining two cities to guide
the search in a map for a sequence of roads leading from one to
the other.)
(e) Reducing the strictness of the requirement on the problem, which
may enable a less thorough search to suffice (e.g. looking for a
less than optimal solution to a problem). This may, for example,
transform a search in a space of points into a search in a space of
regions containing points of approximately equal value, where the
total number of regions is far smaller than the number of points.
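To illustrate (a): a minimal sketch of a memo-function, using the toy
problem of counting monotonic routes through a grid, where the same
sub-grids recur many times and remembering their counts folds the
repeated structure:

    from functools import lru_cache

    @lru_cache(maxsize=None)     # the memo: remembers each sub-problem
    def routes(m, n):
        # number of monotonic lattice paths from (0, 0) to (m, n)
        if m == 0 or n == 0:
            return 1
        return routes(m - 1, n) + routes(m, n - 1)

    print(routes(10, 10))   # 184756; without the memo this recursion
                            # recomputes the same sub-grids exponentially often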
The development of new programming languages in computer science can be
seen as an example of switching to more abstract but smaller spaces: the
task of searching for a Lisp or C++ program to solve a problem is
totally different from the task of searching for a machine code
solution, even if the former is eventually translated into the latter by
a compiler.
Even going from one high level language to another can make a huge
difference: as many programmers found in the mid 80s when they first
learnt Prolog after trying to use Pascal to do AI, for instance.
-- -- (3) Find a new powerful form of representation,
which produces an entirely new search space, or which makes a new
strategy possible, e.g. one which avoids search, or reduces search, or
allows a more rapid search.
An example is switching from a search in a space of incrementally grown
partial solutions (e.g. partial routes) to a space of complete solutions
(e.g. complete routes) where directing the search may be easier. (For an
example see [3].) The history of mathematics includes many examples of
powerful new representations.
Many of the other techniques listed here can be based on a change of
representation, e.g. moving to a higher level of abstraction requires
finding a way to represent solutions, heuristics, paths, subspaces at
the new level.
Much ongoing work in computer science is concerned with attempts to find
new representations and inference systems to reduce search spaces. E.g.
linear logic programming which allows each axiom to be used at most once
in a search can reduce search spaces. Whether such general techniques
can have much impact on really intractable problems is another question.
-- -- (4) Find a transformation to add new structure enabling hill-climbing
where it wasn't previously feasible.
Sometimes *enlarging* the space does this -- e.g. embedding a discrete
space in a continuous one. (See Ref [4] for an example.)
-- -- (5) Find a clever algorithm which computes the result without search
The history of mathematics has many examples of this, e.g. an algorithm
to find the highest common factor of two numbers without searching.
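That algorithm is Euclid's; a minimal sketch:

    def hcf(a, b):
        # Euclid's algorithm: repeated remainders, no search over divisors
        while b:
            a, b = b, a % b
        return a

    print(hcf(1071, 462))   # 21

Each step shrinks the problem; no space of candidate divisors is ever
enumerated.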
Sometimes clever algorithms are linked to clever representations. E.g.
if you had to form a collection of sets of numbers less than 100 you
might consider lists of numbers, one for each set. Alternatively you can
use bitstrings of length 100, where the Nth bit is 1 or 0 depending
whether N is in the set. Then operations like finding the union or
intersection of two sets can be performed trivially. Some operations can
use the fact that bitstrings can themselves be treated as numbers.
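A minimal sketch of that representation (Python integers serve as
bitstrings of arbitrary length):

    def to_bits(nums):
        # encode a set of numbers below 100 as an integer
        # whose Nth bit is set iff N is in the set
        s = 0
        for n in nums:
            s |= 1 << n
        return s

    evens = to_bits(range(0, 100, 2))
    squares = to_bits(n * n for n in range(10))
    union = evens | squares          # set union is bitwise OR
    common = evens & squares         # set intersection is bitwise AND
    print(bin(common).count('1'))    # 5 even squares below 100: 0, 4, 16, 36, 64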
-- -- (6) Use constraint manipulation techniques
Many combinatorial spaces are defined by sets of options linked by
constraints. A search problem could be to find a selection from each
set, where selections from linked sets satisfy all the relevant
constraints.
This is a case where there are actually two spaces: S1 the space of all
possible selections, and S2 the space of selections satisfying the
constraints.
Working systematically through the options in S1 checking the
constraints may be intractable. Sometimes, however, constraint analysis
can be used to remove options iteratively from nodes in the constraint
network (e.g. using Waltz filtering).
This can sometimes remove large subsets from the search space. But
beware of phase transitions: in some cases constraint propagation can be
more work than exhaustive search, e.g. where there are very many
solutions so that most arbitrary selections satisfy the constraints.
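Here is a minimal sketch of this kind of iterative filtering, in the
style of arc consistency rather than Waltz's actual algorithm; the
three-variable problem is invented for the example:

    def revise(domains, x, y, allowed):
        # drop values of x that have no supporting value in y
        before = len(domains[x])
        domains[x] = {vx for vx in domains[x]
                      if any(allowed(vx, vy) for vy in domains[y])}
        return len(domains[x]) < before

    def filter_domains(domains, constraints):
        # constraints maps an ordered pair (x, y) to a predicate allowed(vx, vy)
        queue = list(constraints)
        while queue:
            x, y = queue.pop()
            if revise(domains, x, y, constraints[(x, y)]):
                # x's domain shrank, so arcs pointing at x must be re-checked
                queue.extend(arc for arc in constraints if arc[1] == x)
        return domains

    domains = {v: {1, 2, 3} for v in 'ABC'}
    constraints = {('A', 'B'): lambda a, b: a < b,
                   ('B', 'A'): lambda b, a: b > a,
                   ('B', 'C'): lambda b, c: b < c,
                   ('C', 'B'): lambda c, b: c > b}
    print(filter_domains(domains, constraints))
    # {'A': {1}, 'B': {2}, 'C': {3}}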
(NOTE: Geoff Hinton showed in his PhD thesis (Edinburgh Univ. circa
1976) that where the initial constraint network is noisy, constraint
propagation may eliminate "good" options. If such noise is a possibility
then the search for a "correct" set of selections needs to be replaced
by a search for "the best" or "a good" set of selections, making use of
a cost function for violated constraints. Relaxation was used in his
thesis.)
-- -- (7) Use stochastic as opposed to deterministic methods
(possibly in combination with the other techniques: e.g.
non-deterministic constraint propagation).
This sometimes helps where hill-climbing fails because there are too
many hills with unsatisfactory peaks. Randomly sampling hills to climb
in parallel may give a good chance of finding a high hill. Examples of
non-deterministic methods include:
simulated annealing,
some neural nets,
evolutionary mechanisms,
various optimisation techniques
Where the search landscape is very rugged, i.e. small changes in the
combinations of options cause large changes in the value of the
resulting combination, hill-climbing is not possible. It then becomes
necessary to do some kind of sampling of the space. If it is too
unstructured, nothing short of exhaustive search will do.
Where the space is not so rugged but there are large numbers of local
maxima, hillclimbing will find a solution if you start anywhere on a
"good" hill. So sampling starting points from which to climb may also be
a good idea in this case.
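A minimal sketch of that strategy (random restarts feeding a simple
hillclimber; the rugged one-dimensional landscape is an arbitrary
choice for the example):

    import math, random

    def hill_climb(x, f, lo=0, hi=999):
        # climb by unit steps until no neighbour is better
        while True:
            best = max((n for n in (x - 1, x + 1) if lo <= n <= hi), key=f)
            if f(best) <= f(x):
                return x           # a local maximum
            x = best

    f = lambda x: math.sin(x / 10.0) * x   # many local maxima on 0..999
    starts = [random.randrange(1000) for _ in range(50)]   # sampled starts
    best = max((hill_climb(s, f) for s in starts), key=f)
    print(best, round(f(best), 1))   # usually the global peak near x = 958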
More generally, we can sometimes find a way to "sample" the search space
which is not necessarily random. However a random search is more
appropriate when there is not enough knowledge about the structure of
the space to guarantee that a deterministic sampling strategy will give
representative results.
-- -- (8) Weaken the criterion for a solution.
This has already been mentioned under (e) in section (2) above.
For instance instead of looking for the best item look for something
that is good enough, or better than some threshold. (H.A. Simon called
this "satisficing".) This may enlarge the set of solutions enough to
turn an intractable search into a tractable one.
Moreover, where the search includes storing unexplored options for later
consideration, weakening can allow fewer unexplored options to be saved
if it is known that rejected options cannot lead to much better
solutions than saved ones, thereby turning a search with impossible
storage requirements to a feasible one, as in some variants of A*
search. (I learnt this from Brian Logan.)
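As a generic illustration of bounded storage (this is a plain beam
search, not the specific A* variant referred to above), here is a
sketch which keeps only a fixed number of the most promising unexplored
options at each step:

    import heapq

    def beam_search(start, expand, score, is_goal, width=2):
        # keep only the `width` best partial solutions at each step
        frontier = [start]
        while frontier:
            for cand in frontier:
                if is_goal(cand):
                    return cand
            children = [c for cand in frontier for c in expand(cand)]
            frontier = heapq.nlargest(width, children, key=score)
        return None

    # toy problem: build a 10-character string of 1s
    expand = lambda s: [s + '0', s + '1'] if len(s) < 10 else []
    is_goal = lambda s: s == '1' * 10
    print(beam_search('', expand, score=lambda s: s.count('1'),
                      is_goal=is_goal))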
-- -- (9) Divide and conquer
Replacing a single search in a huge space (searching for a structure
with a lot of components) with a large number of parallel searches in
much smaller spaces (searching for smaller structures). This can be
useful if the search is organised in such a way that a global solution
emerges from all the parallel searches.
For example if the original problem required K*N binary selections to be
made, then the search space is of size 2**(K*N). However if this can be
replaced by K separate searches each involving only N binary selections,
then the total number of nodes to be searched is K*(2**N), which is
typically a very much smaller number (e.g. if K = 1000 and N = 10 this
can transform a totally intractable search into something quite
feasible). It makes an even bigger difference if the selections are not
binary.
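The difference is easy to check directly:

    K, N = 1000, 10
    joint = 2 ** (K * N)     # one search over all K*N binary choices together
    split = K * (2 ** N)     # K independent searches of N binary choices each
    print(len(str(joint)))   # 3011: the joint space's size has over 3000 digits
    print(split)             # 1024000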
Note: this assumes some relatively cheap mechanism is available for
combining the separate results.
Examples of this technique include the following:
Neural net architectures using "local" adjustment criteria
during searching
Use of simulated "ants" leaving pheromone trails
Systems that depend on voting mechanisms
"Swarming" systems (?)
Some evolutionary searches (coevolution???)
Use of models of a "market economy" with large numbers of
small individuals attempting to optimise their decision making,
using payments in a common currency for services or information,
in the expectation that this will optimise the global system.
Sometimes it doesn't, as we've often seen in real market
economies! (How many big companies which recommend this for
a whole country would replace their own corporate planning with
an internal market economy?)
Many physical systems can be seen as solving problems this way. E.g.
water poured into a container of arbitrary shape solves the problem of
arranging molecules so as to minimise total potential energy subject to
various constraints, some determined by the shape of the container.
Folding of complex molecules is another case. The use of a network made
of pieces of string to find the shortest route between two locations is
another.
This sort of approach can work well on computers in problems which are
inherently decomposable (or nearly decomposable) into separate
sub-problems, and where a relatively cheap way of combining
solutions exists. However it will fail where the individual search
spaces remain large (or infinite as in the case of searching for
sub-proofs in a proof) and where there are very strong interactions
between sub-problems.
E.g. which options are best for component 256 may depend on which
options are best for component 243, which may depend on which options
are best for component 575, etc.
Many people seem to think animal brains (including human brains) work
entirely by the divide and conquer method because so many complex
problems are solved so quickly and the brain clearly uses highly
parallel mechanisms.
An alternative hypothesis is that brains have large numbers of highly
specific mechanisms which use a great deal of pre-compiled information
(e.g. about the environment, or the animal's body), with each mechanism
able to function well (often without search) in a special class of
problems. When a truly novel problem turns up combinatorial search is
required. How that's done depends on how much expertise and knowledge
the person has in the field (see the next paragraph).
One of the reasons divide and conquer with a utility measure or some
sort of market economy or voting system often fails is that the attempt
to introduce a uniform currency as the sole or main method of
communication between components is inadequate because it loses
information. E.g. using a numeric evaluation function in A* search
doesn't help you understand why the search fails when it fails. If
instead of simply getting a numerical fitness or utility value back you
got a structured description of inadequate solutions, you might be in a
better position to change your requirements and start again, perhaps
building on some of the better failures.
(This is frequently rediscovered by people who try to solve complex
problems using utility measures, fitness measures, evaluation functions,
and then fail. For some recent work on how to use a non-numeric
multi-dimensional evaluation function see [5] and [6].)
-- -- (10) Use domain-specific knowledge.
An approach which often works when others fail is to use domain
specific knowledge to tailor a representation, algorithm or architecture
to the properties of a domain. Animal visual systems are an example,
closely tailored to the structure of 2-D images and 3-D structure and
motion. Many commercial expert systems are highly knowledge-based.
There are many different ways domain knowledge can be stored and used.
E.g. large collections of well indexed condition-action rules or logical
axioms are common in expert systems. By contrast some neural nets store
large numbers of points in a uniformly structured solution space with a
convenient metric, along with fast parallel methods for finding the
nearest known neighbour to a new problem point. In some cases an
association mechanism can use interpolation to generate good solutions
for new intermediate points.
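A minimal sketch of such an association mechanism, assuming stored
problem points in a 2-D space with known solution values, found by
nearest-neighbour lookup and blended by inverse-distance interpolation
(all details invented for the example):

    import math

    known = [((0, 0), 1.0), ((1, 0), 2.0), ((0, 1), 3.0), ((1, 1), 4.0)]

    def interpolate(points, q, k=3):
        # inverse-distance weighted average of the k nearest stored values
        nearest = sorted(points, key=lambda pv: math.dist(pv[0], q))[:k]
        total = weight = 0.0
        for p, v in nearest:
            d = math.dist(p, q)
            if d == 0:
                return v         # exact match: reuse the stored solution
            total += v / d
            weight += 1 / d
        return total / weight

    print(interpolate(known, (0.5, 0.5)))   # 2.0, blending the three
                                            # nearest stored points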
In general, however, even considerable expertise and prior knowledge
does not rule out the need for creative searching in a combinatorial
space. I am sure Mozart, Beethoven, Shakespeare, Newton, Einstein, etc.
all did that some of the time. Beethoven's drafts for some of his
compositions show that he did.
-- -- (11) Use chemical DNA-based machines to achieve enormous parallelism
The August 1998 issue of Scientific American has an article [7] showing
how chemical mechanisms based on reproducing, joining and splicing DNA
sequences can be used to solve certain combinatoric problems in a
reasonable time by making use of very large numbers of parallel
processes in a chemical soup. No doubt this technique will be developed
for use on more and more complex problems, as it promises to handle
within days or weeks problems that might take months or years on
computers.
The really hard task is to transform a problem into a form which can be
addressed in this way.
-- -- (12) Use human or human-like spatial/visual reasoning abilities
An ant finding its way back to its nest in long grass has to use either
local clues or a record of its path in order to select a direction in
which to move at each moment in time. A human who has a view of the
region from a high vantage point, or who has a good map, can normally
take in the structure of the terrain, see where the main obstacles are,
which valleys, paths or roads could be joined to form a good route, and
often, with hardly any searching at all, select a good path. This human
capability, though very powerful, is subject to real limits. If the
terrain consists of a complex maze (as in maze puzzles in children's
books) it may be impossible to see a good route without a considerable
amount of searching and backtracking.
The fact that humans have powerful spatial reasoning abilities has often
been noticed. For many interesting examples in mathematical reasoning
see Nelsen's book [8]. It is often claimed that work in AI is hampered
by the fact that as yet machines do not have such abilities, and many
people have attempted to implement subsets of them. (I made this claim
at my first AI conference in 1971, but later found that it had been
said before. The idea is frequently reinvented.) For some discussion of
what the claim means and some examples see [9] and [10]. For an
interesting recent attempt to automate diagrammatic reasoning in
connection with arithmetic see [11].
It has so far proved very difficult to give machines human-like spatial
reasoning capabilities for reasons that are obviously closely related to
the fact that machine vision is currently nowhere near human or animal
vision in power and flexibility despite successes in particular classes
of visual tasks (controlling a device, recognising faces, visual
diagnosis of skin lesions). The reasons for these limitations are too
complex to be discussed here. However, it should be remembered that even
if we managed to analyse and replicate this spatial human ability its
limitations may remain in machine form. I.e. there may be deep reasons
why beyond a certain type of complexity, solutions to mazes and other
visually presented problems cannot simply be "seen", so that some form
of searching will then have to be used.
As suggested by the history of mathematics there may be special cases
where this can be avoided by relocating the problem in a more abstract
space where the structure is more visually perspicuous. (An example,
involving switching between metrical and topological representations is
discussed in [10]).
-- GENERALITY vs DOMAIN SPECIFICITY (domain specific vs topic neutral)
During the first 30 or so years of the life of AI, and especially
during the 70s and early 80s, there was an enduring tension between those
who seek *general purpose* solutions to problems and those who believe
that only by using *specific* knowledge about particular problem domains
can truly powerful systems be produced (e.g. planners, problem solvers,
language understanders, image interpreters -- all of which are, in
different ways, confronted with search problems).
(Remember GPS, the "General Problem Solver" of Newell, Shaw and Simon,
and all the general purpose planners and logic based problem solvers,
etc.?)
For instance, for some time the Stanford AI department, largely
influenced by John McCarthy, was associated with the search for general
methods, including general theorem provers and logic based planners.
Likewise, the work by Allan Newell and Herbert Simon at Carnegie
Mellon University (e.g. work by Newell, Shaw and Simon on the "General
Problem Solver") addressed general issues, at least in the 1960's and
and 1970's. By contrast the MIT AI department, led by Marvin Minsky and
Seymour Papert, sought power in domain specific knowledge.
There were similar divisions in the UK (E.g. Edinburgh and Imperial
College vs Sussex University and others.)
Of course the geographical divisions were not rigid: Feigenbaum at
Stanford liked the slogan "The power is in the knowledge" and there were
people at MIT who worked on logic.
The contrast was also associated with a division between "neats" and
"scruffies". The neat people (logicians, generalists) hoped for powerful
and elegant general methods, forms of representations, mechanisms. The
scruffies (including Roger Schank's team, first at Yale, then at
Northwestern University) thought that AI systems, forms of
representation, architectures etc. would have to be complex and messy to
have enough power to deal with the real world, as opposed to toy
problems.
I believe Chris Mellish, either during or shortly after working on his
PhD in Edinburgh, coined a phrase which is very relevant. He noticed that
when a domain specific technique has been found to work, there is some
general re-usable structure in the problem and the solution which can be
abstracted from the particular subject matter. That structure could then
be described as "domain specific but topic neutral". (I am writing this
from memory. I don't have any references.)
This seems to me to be very important: however scruffy effective
domain specific knowledge is, what makes it work is something about the
ontology of the domain and the structure of the problems. So it should
be possible to express the relationship between the problem solving
knowledge, the domain to which it applies, and the problems it is
useful for, in a manner which at least in principle forms a pattern
that could be applied elsewhere.
Eventually everyone, on both sides of the debate, came to realise that
there was a tradeoff between generality and power. Given any really
general technique for solving a class of problems, it was often, if
not always, the case that for any subclass with common features there
existed a more specific technique which was better, e.g. it worked
faster, or required less memory, or was more accurate, or found better
solutions, etc.
The relevance to the present discussion is this: ALL of the following
are essentially very general methods:
logical theorem proving
neural nets
constraint-manipulation
genetic algorithms
classifier systems
etc.
In order to make any of these techniques work well on a particular hard
problem domain it is normally important to analyse the intricacies of
that domain and tailor the method to the domain, e.g. by finding a good
representation that captures important features of the domain.
It may then turn out that once that work has been done, whatever was
learnt about the domain can equally well be used in connection with
other methods. So it may be that proponents of the different methods
have to re-invent each other's wheels to make their methods work well in
certain domains.
That suggests that it may be useful to seek ways of classifying types of
problems or problem domains in a manner which is independent of whether
you plan to use theorem provers, neural nets, or genetic algorithms to
solve the problems. If that can't be done, that will be an important
fact about how knowledge has to be structured. It may be that even
though there is domain-specific but topic-neutral knowledge, it is not
method-neutral.
A related point is that parts of the world may, as a matter of fact,
have constraints that restrict the problems that can arise and the
possible solutions to those problems. If those constraints can be
discovered and understood they could be deployed in specialised
problem-solving machinery or search strategies. For instance, if a
visual system is for use in a world where most physical objects do NOT
have surfaces covered everywhere with vast discontinuities, then it can
exploit the fact that most surfaces are mostly continuous in
controlling its search for ways of segmenting and
analysing images. Without that constraint the space of possible
interpretations of any particular image might be many orders of
magnitude larger. Similarly a planner working in a state of uncertainty
about aspects of the environment can use some very general assumptions,
including assumptions about frequencies or probabilities of occurrences
of certain classes of events, to consider a vastly reduced set of
contingencies.
In some previous papers (e.g. [12]) I have discussed dimensions in which
an environment could be more or less cognitively friendly. If an
organism or robot can detect and make use of that friendliness, it could
benefit enormously in reduced computational requirements. Evolution
seems to be able to find and use many kinds of cognitive friendliness
suited to different organisms. In some cases it actually produces
cognitive friendliness, through co-evolution, e.g. features of flowers
which help pollinating insects to find and land on them. How evolution
finds the mechanisms which can make use of cognitive friendliness is
another matter.
-- "LAWS OF FORM" MAY DEFEAT COMBINATORICS
As far as evolutionary search is concerned, I conjecture that there is a
deep point, which is hard to make clear, which pervades work by Stuart
Kauffman [2], two recent books by Ian Stewart and Jack Cohen ([13] [14]),
as well as earlier work by the biologist Brian Goodwin.
Brian Goodwin (whom I used to know and talk to at Sussex University)
proposed long ago that there are "laws of form" which constrain what can
evolve and how organisms can develop. I think he thought of these as
mostly arising out of the properties of space and time (e.g. what kinds
of folding of tissue are possible in 3-D space).
(I'll try to find a reference to published work by Goodwin. See
http://www.gn.apc.org/schumachercollege )
These eternal (mathematical) laws of form may constrain possible
organisms to a far higher degree than the dynamics of natural selection
operating on randomly generated genetic structures.
The conjecture is that within the space of possible trajectories
available for evolutionary and developmental processes there are
constraints which limit the combinatorics to such an extent that the
normal emphasis on repeated selection among randomly produced items as
the explanation of how complex biological organisms developed is
misleading, since that alone could never have defeated the
combinatorics.
This idea has been generalised by Kauffman and others (e.g. Cohen and
Stewart) in the idea that "order for free" arises inevitably out of the
combinatorics of certain large classes of structures. Many of the
examples Kauffman gives arise from connectivity properties of graphs.
E.g. he claims that a very high (surprisingly high) proportion of a
certain class of randomly generated dynamical systems turn out to have a
small number of basins of attraction which exhaust nearly all of the
state space. In that case finding and comparing all the attractors could
be relatively easy. (Though convergence rates are also important.)
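A minimal sketch of the kind of experiment involved, using a small
randomly wired boolean network in the style of Kauffman's NK systems
(the parameters are invented for the example; exhaustively following
every state to its attractor is feasible only because the network is
tiny):

    import itertools, random

    random.seed(1)
    N, K = 12, 2               # 12 nodes, each driven by 2 random inputs
    inputs = [random.sample(range(N), K) for _ in range(N)]
    tables = [[random.randint(0, 1) for _ in range(2 ** K)]
              for _ in range(N)]

    def step(state):
        # synchronous update: each node applies its random boolean function
        return tuple(tables[i][sum(state[j] << b
                                   for b, j in enumerate(inputs[i]))]
                     for i in range(N))

    basin = {}                 # state -> representative of its attractor
    for start in itertools.product((0, 1), repeat=N):
        path, seen, s = [], set(), start
        while s not in basin and s not in seen:
            seen.add(s)
            path.append(s)
            s = step(s)
        rep = basin[s] if s in basin else min(path[path.index(s):])
        for p in path:
            basin[p] = rep
    print(len(set(basin.values())), "attractors among", 2 ** N, "states")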
This leads to the notion that trajectories that are possible in
biological evolution are all subject to "laws of form" which provide
"order for free".
These laws can have different forms. E.g. some determine which
structures and processes are possible or impossible, some determine
probability distributions in classes of structures based on
particular kinds of building blocks. I am particularly interested in
those which determine the tradeoffs relevant to different designs
which fit the same niche. (More on this below.)
Such considerations led Kauffman ([2], page 150) to the claim that
although most biologists believe that "selection is the only source of
order", this view is "deeply inadequate" though it has "much to commend
it".
Where such laws of form constrain what is possible (or probable) within
a combinatorial space it may be the case that a directed search using a
selection mechanism suffices to ensure that good solutions can be found
in a reasonable time. (What "reasonable" means will be context
dependent.)
However this may not be as simple as it seems, for instance if different
selection criteria and different selection mechanisms are needed in
different parts of the space. Natural evolution may achieve this in ways
that are not easy to replicate in artificial evolution, for instance if,
as I suspect, the process depends crucially on coevolution of many
different species of organisms. For instance this may be a requirement
both for guidance of the search (different species "guide" one another
through increasingly difficult problems, just as a good teacher guides a
pupil), and for adequate sampling of the space of possible evolutionary
trajectories in a reasonable time.
(More on this below.)
-- DIFFERENT TYPES OF LAWS OF FORM
It is worth noting that there are different kinds of laws of form of
different degrees of generality and abstractness.
Some very general laws relate to the properties of systems with certain
classes of "small" building blocks which can be assembled in a very
wide range of structures. Examples include: Kauffman's arguments about
the high probability of autocatalytic networks given some very general
assumptions about a type of chemistry; the probabilities associated with
mendelian inheritance; the incompleteness and
complexity/incompressibility results concerning very general classes
of computations.
I conjecture that there is also a vast collection of more specific laws
of form concerning "intermediate level" more specialised architectures
and processes waiting to be studied, some of which I expect we'll find
have already been "discovered" by evolution.
Examples might be laws about tradeoffs between different designs in the
contexts of different sorts of niches. (AI researchers are beginning to
understand tradeoffs between reactive and deliberative architectures,
and conditions under which both separately, or some combination of the
two, can be useful.)
These tradeoffs are closely related to what first rate engineers learn
from experience: e.g. when a software engineer learns that for a
certain type of problem there is a choice between a solution which
maximises speed but is rigid and hard to extend and a solution which
runs more slowly but is much easier to maintain and extend.
Although such things may be learnt through experience, once
fully analysed they can sometimes be seen to have a deep mathematical
basis. (Complexity theory is an example.)
Other laws might be concerned with mechanisms of evolvability, i.e. ways
in which large strides can be made in a search space. Many engineers
have discovered the importance of reuse: i.e. copying a solution to an
old problem and modifying it to solve a new one versus trying to build
the new solution from scratch. It seems that at some point biological
evolution also "discovered" this. Was that an inevitable or highly
probable feature of any physically possible evolutionary process in a
certain type of environment, or was it just a lucky result of some
accidental co-occurrence of events and circumstances?
As Kauffman repeatedly notes, there may be constraints on biological
evolution and development which are not just empirical generalisations
about how biological mechanism work on earth, but follow from
mathematical principles (some of which we shall not understand properly
until we have developed new kinds of mathematics).
-- INTERACTING TRAJECTORIES IN "DESIGN SPACE" AND "NICHE SPACE"
I conjecture that there are similar interesting constraints on
interacting trajectories in design space and niche space in systems
where there are collections of co-evolving organisms (natural or
artificial). See [1], and related papers referred to therein.
Design space is a space of possible designs for behaving systems. Its
structure is very rich, complex, full of large and small
discontinuities, and extremely non-homogeneous: the topology of a
neighbourhood can vary enormously from one part of design space to
another, or from one dimension to another.
Niche space is less familiar: a niche is a set of requirements or
constraints which may or may not be satisfied to varying degrees by
different designs. The notion of niche space is an attempt to unify some
ideas from biology and from engineering.
Trajectories are possible both in niche space and in design space.
There are i-trajectories, possible for adaptive or self-modifying
individuals; e-trajectories, possible only through evolutionary change
across generations of individuals; r-trajectories, possible only when
some external agent "repairs" or modifies a design (by changing an
instance of that design); c-trajectories, which depend on cultural
mechanisms; and no doubt many more classes of trajectories waiting to
be discovered and analysed.
The mechanisms guiding trajectories, and the possible structures of
trajectories, are different in different parts of the spaces: as stated
above, the spaces are not homogeneous.
It is very important that the mapping between a design and a niche is
typically not a single fitness value but something more like a vector of
values (hence the importance of tradeoffs between different values.)
This may be a basis for bifurcation of trajectories, e.g. where
different paths improve different components of the fitness vector.
In coevolution, organisms which change alter the niches for other
organisms. Hence there can be all sorts of complex positive and negative
feedback loops between trajectories in design space and in niche space.
-- DESIGN SPACE AND NICHE SPACE VS COUPLED FITNESS LANDSCAPES
After working on these ideas for some time I was directed to Kauffman's
work by a research student at Birmingham, Ian Millington, who helped me
to understand the relationships to Kauffman's work.
He pointed out that Kauffman's discussion of co-evolution (e.g. [2],
page 221ff) involves "coupled fitness landscapes", where the fitness
landscape for each species is continually deformed by changes in others.
This assumes that each design has one fitness value, which may be
continually changing. Ian suggests that the analysis of collections of
trajectories in static niche spaces and design spaces with mappings
between them (e.g. defined by fitness functions whose results
are vectors of fitness values) may prove more fruitful, or at least
easier, than the attempt to understand mutual interactions between
continually deforming fitness landscapes. Whether this is so remains to
be seen.
One important point, if I understand Kauffman, is that on his view the
fitness landscape for an organism can be changed only by external changes.
However, if you think of an organism developing or evolving from one
sort of design to another, then that in itself can change its
requirements, and therefore its niche.
E.g. a new behaviour can change energy requirements. This relates to the
fact that an organism's niche is not determined solely by its
environment: two organisms in the same physical location can have quite
different niches. Rather some aspects of a design determine requirements
for other aspects. I.e. given an environment (including other organisms)
what determines a niche is the organism itself.
This is related to the analysis of the concept of a "need". Among the
things we need to understand are what sorts of discontinuities there
are in niche space and design space when organisms first begin to
understand their own needs (How does that happen?) and so appreciate at
least some of the relationships between their designs and their niches.
-- VIRTUAL MACHINES IN THE BIOSPHERE
I conjecture that we can usefully think of such trajectories in niche
space and design space as defining possible causal interactions at
different scales and different levels of abstraction in virtual machines
in the biosphere.
I suspect that when we understand all this, including the evolution of
mechanisms which influence evolvability, it may help us to see that
biological evolution is in some ways far more like a process of design
than either darwinians or creationists ever supposed. (Compare Dennett's
comments in [15]).
(Sexual selection in organisms with explicit knowledge of their own
needs and how to serve them is one obvious mechanism for this, and
cultural influences can provide another, but I think there are more
subtle and older processes e.g. constraints on the development and use
of evolutionary mechanisms that allowed a sub-mechanism to be copied
then modified. Compare the importance of reuse in engineering design.)
Anyhow, within this framework, as in Kauffman's, natural evolution is
NOT a blind combinatorial search, nor even a search directed simply by
selection, but is highly constrained by "laws of form" which determine
which trajectories are possible and which types of interactions between
trajectories are possible (or probable).
-- EXAMPLE OF A TRADEOFF: HELPLESS VS COMPETENT INFANTS
Some mammals which need to be able to run with the herd very soon after
birth (horses, deer) are born with considerable perceptual and motor
competence providing independence, presumably largely genetically
determined (though maybe with some fine tuning prior to and immediately
after birth?).
Many others (e.g. hunting animals, monkeys and other primates, humans)
are born in a very helpless form; over days, weeks or months, processes
of groping around, feeding, playing, fighting, competing for food, etc.
apparently produce or aid the development of competence.
Perhaps there is a very important tradeoff between (a) the kind of grasp
of space and motion which can be encoded in genes and (b) the kind which
can be developed through a process of learning and adaptation by an
articulated individual exploring the vast array of structures in the
environment and movements of which it is capable and which it can
observe.
It may be that because of this, horses, deer and other animals born able
to run with the herd can never perceive and understand the same
intricate variety of spatial structures and motions as those whose
brains work things out through interacting with the environment during
early (helpless) infancy.
I predict that a similar difference can be found between altricial and
precocial types of birds. The former are hatched at an early stage of
development, physically immature and dependent on parents. The latter
are hatched at a later stage, more physically mature and less dependent
immediately
after hatching. The prediction is that the former will have a better
grasp of spatial structure and motion and use this in performing more
complex tasks than the latter, e.g. building nests in trees, flying
through trees, catching fast moving insects, animals or fish, etc. The
latter would mainly feed by pecking at static readily available food,
live on the ground or mainly on water, etc.
Understanding such tradeoffs may give us a better insight into why
humans (and many other animals) are born helpless with underdeveloped
brains, than the notion that bigger neonate brains would require a
larger female pelvis. (An elephant pelvis is pretty big.)
There are many other important tradeoffs, e.g. those involved in the
development of general deliberative capabilities (which require smaller
memory stores and perhaps shorter periods of evolution than purely
reactive capabilities using large collections of stored behaviours).
Another interesting tradeoff involves the development of types
of internal self-monitoring mechanisms. Another is the tradeoff between
distributed control and centralised global control.
When we have understood how these tradeoffs work in different contexts
we'll see that debates about which options are "better" are silly
(e.g. debates about "traditional" vs "new" AI).
(To be continued, perhaps, or at least revised and improved).
-- REFERENCES
[1] A draft paper (in postscript format) is available in the Cognition
and Affect project directory (see also [1a]):
Aaron Sloman, Design Spaces, Niche Spaces and the ``Hard'' Problem,
http://www.cs.bham.ac.uk/research/cogaff/Sloman.design.and.niche.spaces.ps
(NOTE: this paper is likely to be revised and extended.)
[1a] A more recent paper, extending the ideas:
A. Sloman, The ``Semantics'' of Evolution: Trajectories and
Trade-offs in Design Space and Niche Space,
in Ed. Helder Coelho, Progress in Artificial Intelligence,
6th Iberoamerican Conference on AI (IBERAMIA), Lisbon,
October, 1998, pp. 27--38,
Springer, Lecture Notes in Artificial Intelligence
Also at:
http://www.cs.bham.ac.uk/research/cogaff/Sloman_iberamia.ps
(and .pdf)
[2] Stuart Kauffman, At home in the universe: The search for laws of
complexity, Penguin Books, London, 1995,
[3] Brian Logan and Riccardo Poli, `Route planning in the space of
complete plans', in Proceedings of the 15th Workshop of the UK Planning
and Scheduling Special Interest Group, 21-22 November 1996, Liverpool
John Moores University, Liverpool, pp 233-240 (also available as
University of Birmingham School of Computer Science technical report
CSRP-97-18).
Also
Brian Logan and Riccardo Poli, `Route planning with GA*', CSRP-97-17,
University of Birmingham School of Computer Science, 1997.
Both the above are available in
ftp://ftp.cs.bham.ac.uk/pub/tech-reports/1997/
[4] Riccardo Poli, Mark Ryan and Aaron Sloman,
A New Continuous Propositional Logic, in Progress in Artificial
Intelligence, eds. C. Pinto-Ferreira and N. J. Mamede, Lecture Notes in
Artificial Intelligence 990, Springer Verlag 1995.
[5] Brian Logan and Aaron Sloman,
Qualitative Decision Support using Prioritised Soft Constraints,
CSRP-98-14, University of Birmingham School of Computer Science,
1998.
ftp://ftp.cs.bham.ac.uk/pub/tech-reports/1998/CSRP-98-14.ps.gz
[6] B. S. Logan and N. Alechina, A* with bounded costs,
Proceedings of the 15th National Conference on Artificial Intelligence--AAAI-98,
1998, Madison, USA. Also Birmingham School of Computer Science technical
report CSRP-98-09
[7] Leonard M. Adleman, Computing with DNA, Scientific American,
August, 1998, 279, 2, pp. 34--41,
[8] Roger B. Nelsen, Proofs without Words: Exercises in Visual Thinking,
Mathematical Association of America, Washington DC, 1993,
[9] Aaron Sloman, Musings on the roles of logical and non-logical
representations in intelligence, in
eds Janice Glasgow, Hari Narayanan and B. Chandrasekaran,
Diagrammatic Reasoning: Computational and Cognitive Perspectives,
MIT Press, 1995,
pp. 7--33
The paper is accessible at
http://www.cs.bham.ac.uk/research/cogaff/Aaron.Sloman_musings.ps.gz
[10] Aaron Sloman, Diagrams in the mind? Invited paper,
Thinking With Diagrams Conference (TW98) Aberystwyth,
August 1998, eds T.Addis, N.Birch, A. Blackwell, P. Brna, P.Cheng,
R.Cox, D.Gooding, P.Olivier, M.Scaife, K.Stenning, K van Rijsbergen.
The paper is accessible at
http://www.cs.bham.ac.uk/research/cogaff/Sloman.twd98.ps
(also pdf version)
[11] Mateja Jamnik, Alan Bundy, Ian Green,
On Automating Diagrammatic Proofs of Arithmetic Arguments, in
Proceedings of International Joint Conference on AI, 1997
[12] A. Sloman, On designing a visual system (Towards a Gibsonian
computational model of vision), in Journal of Experimental and
Theoretical AI, 1, 4, pp. 289--337 1989,
The paper is accessible at
http://www.cs.bham.ac.uk/research/cogaff/Aaron.Sloman_vision.design.ps.gz
[13] Jack Cohen and Ian Stewart, The collapse of chaos, Penguin Books,
New York, 1994
[14] Ian Stewart and Jack Cohen, Figments of Reality: The Evolution of
the Curious Mind, Cambridge University Press, Cambridge, 1997,
[15] Daniel C Dennett, Darwin's Dangerous Idea: Evolution and the
meanings of life, Penguin Books, London, 1995
See chapter 6, and the section on Kauffman in chapter 8.
-- ACKNOWLEDGEMENTS
My thinking on these topics has benefited from interactions with
colleagues and students at Birmingham: Riccardo Poli, Mark Ryan, Brian
Logan, Manfred Kerber, William Langton, Chris Complin, Ian Millington,
and others. Paul Marrow at BT drew my attention to the distinction
between altricial and precocial species.
After writing the above I became aware of partly similar thinking in a
paper by Steve Grand (Cyberlife). See
http://www.cyberlife.co.uk/steve/battling_with_ga-joe.htm