Variance based selection to improve test set performance in genetic programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4420

  author =       "R. Muhammad Atif Azad and Conor Ryan",
  title =        "Variance based selection to improve test set
                 performance in genetic programming",
  booktitle =    "GECCO '11: Proceedings of the 13th annual conference
                 on Genetic and evolutionary computation",
  year =         "2011",
  editor =       "Natalio Krasnogor and Pier Luca Lanzi and 
                 Andries Engelbrecht and David Pelta and Carlos Gershenson and 
                 Giovanni Squillero and Alex Freitas and 
                 Marylyn Ritchie and Mike Preuss and Christian Gagne and 
                 Yew Soon Ong and Guenther Raidl and Marcus Gallagher and 
                 Jose Lozano and Carlos Coello-Coello and Dario Landa Silva and 
                 Nikolaus Hansen and Silja Meyer-Nieberg and 
                 Jim Smith and Gus Eiben and Ester Bernado-Mansilla and 
                 Will Browne and Lee Spector and Tina Yu and Jeff Clune and 
                 Greg Hornby and Man-Leung Wong and Pierre Collet and 
                 Steve Gustafson and Jean-Paul Watson and 
                 Moshe Sipper and Simon Poulding and Gabriela Ochoa and 
                 Marc Schoenauer and Carsten Witt and Anne Auger",
  isbn13 =       "978-1-4503-0557-0",
  pages =        "1315--1322",
  keywords =     "genetic algorithms, genetic programming",
  month =        "12-16 " # jul,
  organisation = "SIGEVO",
  address =      "Dublin, Ireland",
  DOI =          "doi:10.1145/2001576.2001754",
  publisher =    "ACM",
  publisher_address = "New York, NY, USA",
  abstract =     "This paper proposes to improve the performance of
                 Genetic Programming (GP) over unseen data by minimizing
                 the variance of the output values of evolving models
                 along with reducing error on the training data. Variance
                 is a well understood, simple and inexpensive
                 statistical measure; it is easy to integrate into a GP
                 implementation and can be computed over arbitrary input
                 values even when the target output is not known.

                 Moreover, we propose a simple variance based selection
                 scheme to decide between two models (individuals). The
                 scheme is simple because, although it uses bi-objective
                 criteria to differentiate between two competing models,
                 it does not rely on a multi-objective optimisation
                 algorithm. In fact, standard multi-objective algorithms
                 can also employ this scheme to identify good trade-offs
                 such as those located around the knee of the Pareto
                 front.

                 The results indicate that, despite some limitations,
                 these proposals significantly improve the performance
                 of GP over a selection of high dimensional
                 (multi-variate) problems from the domain of symbolic
                 regression. This improvement is manifested by superior
                 results over test sets in three out of four problems,
                 and by the fact that performance over the test sets
                 does not degrade as often witnessed with standard GP;
                 neither is this performance ever inferior to that on
                 the training set. As with some earlier studies, these
                 results do not find a link between expressions of small
                 sizes and their ability to generalise to unseen data.",
  notes =        "Also known as \cite{2001754} GECCO-2011 A joint
                 meeting of the twentieth international conference on
                 genetic algorithms (ICGA-2011) and the sixteenth annual
                 genetic programming conference (GP-2011)",
