Measuring bloat, overfitting and functional complexity in genetic programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Leonardo Vanneschi and Mauro Castelli and Sara Silva",
  title =        "Measuring bloat, overfitting and functional complexity
                 in genetic programming",
  booktitle =    "GECCO '10: Proceedings of the 12th annual conference
                 on Genetic and evolutionary computation",
  year =         "2010",
  editor =       "Juergen Branke and Martin Pelikan and Enrique Alba and 
                 Dirk V. Arnold and Josh Bongard and 
                 Anthony Brabazon and Martin V. Butz and 
                 Jeff Clune and Myra Cohen and Kalyanmoy Deb and 
                 Andries P Engelbrecht and Natalio Krasnogor and 
                 Julian F. Miller and Michael O'Neill and Kumara Sastry and 
                 Dirk Thierens and Jano {van Hemert} and Leonardo Vanneschi and 
                 Carsten Witt",
  isbn13 =       "978-1-4503-0072-8",
  pages =        "877--884",
  keywords =     "genetic algorithms, genetic programming",
  month =        "7-11 " # jul,
  organisation = "SIGEVO",
  address =      "Portland, Oregon, USA",
  DOI =          "10.1145/1830483.1830643",
  publisher =    "ACM",
  publisher_address = "New York, NY, USA",
  abstract =     "Recent contributions clearly show that eliminating
                 bloat in a genetic programming system does not
                 necessarily eliminate overfitting and vice-versa. This
                 fact seems to contradict a common agreement of many
                 researchers known as the minimum description length
                 principle, which states that the best model is the one
                 that minimises the amount of information needed to
                 encode it. Another common agreement is that
                 overfitting should be, in some sense, related to the
                 functional complexity of the model. The goal of this
                 paper is to define three measures to respectively
                 quantify bloat, overfitting and functional complexity
                 of solutions and show their suitability on a set of
                 test problems including a simple bidimensional symbolic
                 regression test function and two real-life
                 multidimensional regression problems. The experimental
                 results are encouraging and should pave the way to
                 further investigation. Advantages and drawbacks of the
                 proposed measures are discussed, and ways to improve
                 them are suggested. In the future, these measures
                 should be useful to study and better understand the
                 relationship between bloat, overfitting and functional
                 complexity of solutions.",
  notes =        "Also known as \cite{1830643} GECCO-2010 A joint
                 meeting of the nineteenth international conference on
                 genetic algorithms (ICGA-2010) and the fifteenth annual
                 genetic programming conference (GP-2010)",
