Dynamics and Performance of a Linear Genetic Programming System

Created by W.Langdon from gp-bibliography.bib Revision:1.4202

  author =       "Frank D. Francone",
  title =        "Dynamics and Performance of a Linear Genetic
                 Programming System",
  school =       "Department of Energy and Environment, Division of
                 Physical Resource Theory, Chalmers University of
  year =         "2009",
  type =         "Licensiate",
  address =      "SE-412 96 Goteborg, Sweden",
  keywords =     "genetic algorithms, genetic programming, bloat,
                 mutation, crossover, homologous crossover,
                 machine-learning, UXO discrimination",
  annote =       "The Pennsylvania State University CiteSeerX Archives",
  bibsource =    "OAI-PMH server at citeseerx.ist.psu.edu",
  language =     "en",
  oai =          "oai:CiteSeerX.psu:",
  rights =       "Metadata may be used without restrictions as long as
                 the oai identifier remains attached to it.",
  URL =          "http://citeseerx.ist.psu.edu/viewdoc/summary?doi=",
  broken =       "http://www.tradingsystemlab.com/files/Frank D.
                 Francone Licensiate Thesis.pdf",
  size =         "77 pages",
  abstract =     "Genetic Programming GP is a machine-learning
                 algorithm. Typically, GP is a supervised learning
                 algorithm, which trains on labelled training examples
                 provided by the user. The solution output by GP maps
                 known attributes to the known labels.

                 GP is distinctive from other machine-learning
                 algorithms in that its output is typically a computer
                 program: hence Genetic Programming. The GP system
                 documented here conducts learning with a series of very
                 simple selection and transformation steps modelled
                 loosely on biological evolution repeated over-and-over
                 on a population of evolving computer programs. The
                 selection step attempts to mimic natural selection. The
                 transformation steps, crossover and mutation, loosely
                 mimic biological eukaryotic reproduction. Although the
                 individual steps are simple, the dynamics of a GP run
                 are complex.

                 This thesis traces key research elements in the design
                 of a widely-used GP system. It also presents empirical
                 comparisons of the GP system that resulted from these
                 design elements to other leading machine-learning
                 algorithms. Each of the issues addressed in this thesis
                 touches on what was, at the time of publication, a key,
                 and not well understood, issue regarding the dynamics
                 and behaviour of GP runs. In particular, the design
                 issues addressed here are threefold: (1) The emergence
                 in GP runs of introns or code bloat. Introns in GP are
                 segments of code that have no effect on the output of
                 the program in which they appear. Introns are an
                 emergent phenomenon in GP. This thesis reports results
                 that support the hypothesis that introns emerge as a
                 means of protecting evolving programs against the
                 destructive effect of the traditional GP crossover
                 transformation operator. (2) Mutation in biological
                 reproduction is rare and usually destructive. However,
                 we present results which establish that, in GP, using
                 the mutation transformation operator with high
                 probability, generates better and more robust evolved
                 programs than using the mutation transformation
                 operator at the low levels found in biological
                 reproduction. (3) Finally, we return to the GP
                 crossover operator and present results that suggest
                 that a homologous crossover operator produces better
                 and more robust results than the traditional GP
                 crossover operator.

                 The GP system that resulted from the above research has
                 been publicly available since 1998. It has been
                 extensively tested and compared to other
                 machine-learning paradigms. This thesis presents
                 results that suggest the resulting GP system produces
                 consistently high-quality and robust solutions when
                 compared to Vapnick statistical regression, decision
                 trees, and neural networks over a wide range of problem
                 domains and problem types.",

Genetic Programming entries for Frank D Francone