Algorithm Induction, Modularity and Complexity

Created by W.Langdon from gp-bibliography.bib Revision:1.4496

  author =       "John R. Woodward",
  title =        "Algorithm Induction, Modularity and Complexity",
  school =       "School of Computer Science, The University of
  year =         "2004",
  address =      "UK",
  month =        sep,
  email =        "",
  keywords =     "genetic algorithms, genetic programming",
  URL =          "",
  size =         "149 pages",
  abstract =     "We are concerned with the induction of a rule from a
                 set of observations, the goal being to succinctly
                 describe the observed data, but more importantly to
                 place us in a position to make predictions about future
                 data which we have not previously seen. One approach to
                 rule induction is to form a hypothesis space which
                 consists of potential rules or hypotheses, and then
                 search this space for a rule which is consistent with
                 the observed data. One formulation of this approach is
                 Genetic Programming, where the hypothesis spaces
                 consists of computer programs and the search of this
                 space is conducted using biologically inspired search
                 operators and a fitness function.

                 The well known No Free Lunch theorems are central to
                 search and essentially says all search algorithms
                 perform equally, under a number of assumptions. We
                 examine these assumptions and show that they are
                 invalidated when the hypothesis space contains
                 hypotheses which represent functions with different
                 frequencies, as is the case with Genetic Programming
                 and a number of other Machine Learning paradigms. The
                 Conservation of Generalisation, which is related to the
                 No Free Lunch theorems, implies that generalisation is
                 impossible. This is contrary to Occam's razor. The
                 Conservation of Generalisation theorems and Occam's
                 razor are consistent if we restrict ourselves to
                 representations which do not compress the observed

                 We define the representational complexity of a function
                 to be the minimum size of a given representation which
                 can express the function. Given a primitive set, and
                 the operation of composition, new functions can be
                 constructed, which is the representation standard
                 Genetic Programming uses to express functions. However
                 the complexity of a function under this type of
                 representation will depend on the primitive set. We
                 prove that if a representation is capable of expressing
                 modules (i.e. reuse of component parts of the
                 representation), then the complexity of a function is
                 independent of the primitive set (up to a constant
                 which depends on the primitive set but not on the
                 function being represented).

                 We then conduct a number of experiments related to the
                 evolution of Turing Complete representations. We argue
                 that, if a representation can address the general case
                 of a variable sized problem then the average number of
                 evaluations to find a solution is independent of the
                 problem size. In Genetic Programming, a fitness
                 function is used to drive the evolutionary process and
                 promote programs which are more likely to lead to
                 solutions. We experiment with a fitness function which
                 includes information about whether or not a program had
                 to be aborted during it's evaluation and demonstrate
                 that a 10 fold reduction in the number of evaluations
                 can be achieved on an arithmetic problem. Finally, we
                 experiment with a crossover operator which is inspired
                 by the theory of recursive functions and reduced the
                 probability of producing a program which has to be
                 terminated during its evaluation.",

Genetic Programming entries for John R Woodward