A statistical learning approach to bloat and universal consistency in genetic programming

Created by W.Langdon from gp-bibliography.bib Revision:1.3872

  author =       "Olivier Teytaud and Marc Schoenauer and 
                 Sylvain Gelly and Nicolas Bredeche",
  title =        "A statistical learning approach to bloat and universal
                 consistency in genetic programming",
  year =         "2005",
  abstract =     "Universal Consistency, the convergence to the minimum
                 possible error rate in learning through genetic
                 programming (GP), and Code bloat, the excessive
                 increase of code size, are important issues in GP. This
                 paper proposes a theoretical analysis of universal
                 consistency and code bloat in the framework of symbolic
                 regression in GP, from the viewpoint of Statistical
                 Learning Theory, a well grounded mathematical toolbox
                 for Machine Learning. Two kinds of bloat must be
                 distinguished in that context, depending whether the
                 target function has finite description length or not.
                 Then, the Vapnik-Chervonenkis dimension of programs is
                 computed, and we prove that a parsimonious fitness
                 ensures Universal Consistency (i.e. the fact that the
                 solution minimising the empirical error does converge
                 to the best possible error when the number of examples
                 goes to infinity). However, it is proved that the
                 standard method consisting in choosing a maximal
                 program size depending on the number of examples might
                 still result in programs of infinitely increasing size
                 with their accuracy; a fitness biased by parsimony
                 pressure is proposed. This fitness avoids unnecessary
                 bloat while nevertheless preserving the Universal
  bibsource =    "OAI-PMH server at eprints.pascal-network.org",
  oai =          "oai:eprints.pascal-network.org:1586",
  URL =          "http://hal.archives-ouvertes.fr/docs/00/04/19/72/PDF/antibloatGecco2005_long_version.pdf",
  URL =          "http://eprints.pascal-network.org/archive/00001586/",
  URL =          "http://eprints.pascal-network.org/archive/00001586/01/eabloat.pdf",
  keywords =     "genetic algorithms, genetic programming, VC,
                 Learning/Statistics \& Optimisation",
  size =         "8 pages",
  notes =        "pascal-network.org URLs appear broken June 2015. See
                 also GECCO 2005 \cite{1068309}. Also known as

Genetic Programming entries for Olivier Teytaud Marc Schoenauer Sylvain Gelly Nicolas Bredeche