A Statistical Learning Theory Approach of Bloat

Created by W.Langdon from gp-bibliography.bib Revision:1.4420

  author =       "Sylvain Gelly and Olivier Teytaud and 
                 Nicolas Bredeche and Marc Schoenauer",
  title =        "A Statistical Learning Theory Approach of Bloat",
  howpublished = "www",
  year =         "2005",
  keywords =     "genetic algorithms, genetic programming,
                 Vapnik-Chervonenkis, VC dimension, bloat",
  URL =          "http://www.lri.fr/~teytaud/longBloat.pdf",
  URL =          "http://www.lri.fr/~gelly/paper/antibloatGecco2005_long_version.pdf",
  size =         "8 pages",
  abstract =     "Code bloat, the excessive increase of code size, is an
                 important issue in Genetic Programming (GP). This paper
                 proposes a theoretical analysis of code bloat in the
                 framework of symbolic regression in GP, from the
                 viewpoint of Statistical Learning Theory, a
                 well-grounded mathematical toolbox for Machine Learning.
                 Two kinds of bloat must be distinguished in that context,
                 depending on whether the target function lies in the
                 search space or not. Then, important mathematical
                 results are proved using classical results from
                 Statistical Learning. Namely, the Vapnik-Chervonenkis
                 dimension of programs is computed, and further results
                 from Statistical Learning allow us to prove that a
                 parsimonious fitness ensures Universal Consistency (the
                 solution minimising the empirical error does converge
                 to the best possible error when the number of examples
                 goes to infinity). However, it is proved that the
                 standard method of choosing a maximal program size
                 depending on the number of examples might still result
                 in programs whose size grows without bound as their
                 accuracy improves; a more complicated modification of
                 the fitness is proposed that theoretically avoids
                 unnecessary bloat while nevertheless preserving the
                 Universal Consistency.",
  notes =        "cited by \cite{1068309} Replaced by

                 Equipe TAO - INRIA Futurs LRI, Bat. 490, University
                 Paris-Sud 91405 Orsay Cedex. France",
