Model Selection and Overfitting in Genetic Programming: Empirical Study

Created by W.Langdon from gp-bibliography.bib Revision:1.4192

  author =       "Jan Zegklitz and Petr Posik",
  title =        "Model Selection and Overfitting in Genetic
                 Programming: Empirical Study",
  booktitle =    "GECCO Companion '15: Proceedings of the Companion
                 Publication of the 2015 Annual Conference on Genetic
                 and Evolutionary Computation",
  year =         "2015",
  editor =       "Sara Silva and Anna I Esparcia-Alcazar and 
                 Manuel Lopez-Ibanez and Sanaz Mostaghim and Jon Timmis and 
                 Christine Zarges and Luis Correia and Terence Soule and 
                 Mario Giacobini and Ryan Urbanowicz and 
                 Youhei Akimoto and Tobias Glasmachers and 
                 Francisco {Fernandez de Vega} and Amy Hoover and Pedro Larranaga and 
                 Marta Soto and Carlos Cotta and Francisco B. Pereira and 
                 Julia Handl and Jan Koutnik and Antonio Gaspar-Cunha and 
                 Heike Trautmann and Jean-Baptiste Mouret and 
                 Sebastian Risi and Ernesto Costa and Oliver Schuetze and 
                 Krzysztof Krawiec and Alberto Moraglio and 
                 Julian F. Miller and Pawel Widera and Stefano Cagnoni and 
                 JJ Merelo and Emma Hart and Leonardo Trujillo and 
                 Marouane Kessentini and Gabriela Ochoa and Francisco Chicano and 
                 Carola Doerr",
  isbn13 =       "978-1-4503-3488-4",
  keywords =     "genetic algorithms, genetic programming: Poster",
  pages =        "1527--1528",
  month =        "11-15 " # jul,
  organisation = "SIGEVO",
  address =      "Madrid, Spain",
  URL =          "",
  DOI =          "doi:10.1145/2739482.2764678",
  publisher =    "ACM",
  publisher_address = "New York, NY, USA",
  abstract =     "Genetic Programming has been very successful in
                 solving a wide range of problems, but its use as a
                 machine learning algorithm has been limited so far. One
                 of the reasons is the problem of overfitting, which
                 cannot be solved or suppressed as easily as in more
                 traditional approaches. Another problem, closely
                 related to overfitting, is the selection of the final
                 model from the population.

                 In this article we present our research that addresses
                 both problems: overfitting and model selection. We
                 compare several ways of dealing with overfitting,
                 based on the Random Sampling Technique (RST) and on using a
                 validation set, all with an emphasis on model
                 selection. We subject each approach to thorough
                 testing on artificial and real-world datasets and
                 compare them with the standard approach, which uses the
                 full training data, as a baseline.",
  notes =        "Also known as \cite{2764678} Distributed at
