Predicting problem difficulty for genetic programming applied to data classification

Created by W.Langdon from gp-bibliography.bib Revision:1.4420

  author =       "Leonardo Trujillo and Yuliana Martinez and 
                 Edgar Galvan-Lopez and Pierrick Legrand",
  title =        "Predicting problem difficulty for genetic programming
                 applied to data classification",
  booktitle =    "GECCO '11: Proceedings of the 13th annual conference
                 on Genetic and evolutionary computation",
  year =         "2011",
  editor =       "Natalio Krasnogor and Pier Luca Lanzi and 
                 Andries Engelbrecht and David Pelta and Carlos Gershenson and 
                 Giovanni Squillero and Alex Freitas and 
                 Marylyn Ritchie and Mike Preuss and Christian Gagne and 
                 Yew Soon Ong and Guenther Raidl and Marcus Gallagher and 
                 Jose Lozano and Carlos Coello-Coello and Dario Landa Silva and 
                 Nikolaus Hansen and Silja Meyer-Nieberg and 
                 Jim Smith and Gus Eiben and Ester Bernado-Mansilla and 
                 Will Browne and Lee Spector and Tina Yu and Jeff Clune and 
                 Greg Hornby and Man-Leung Wong and Pierre Collet and 
                 Steve Gustafson and Jean-Paul Watson and 
                 Moshe Sipper and Simon Poulding and Gabriela Ochoa and 
                 Marc Schoenauer and Carsten Witt and Anne Auger",
  isbn13 =       "978-1-4503-0557-0",
  pages =        "1355--1362",
  keywords =     "genetic algorithms, genetic programming",
  month =        "12-16 " # jul,
  organisation = "SIGEVO",
  address =      "Dublin, Ireland",
  DOI =          "10.1145/2001576.2001759",
  publisher =    "ACM",
  publisher_address = "New York, NY, USA",
  abstract =     "During the development of applied systems, an
                 important problem that must be addressed is that of
                 choosing the correct tools for a given domain or
                 scenario. This general task has been addressed by the
                 genetic programming (GP) community by attempting to
                 determine the intrinsic difficulty that a problem poses
                 for a GP search. This paper presents an approach to
                 predict the performance of GP applied to data
                 classification, one of the most common problems in
                 computer science. The novelty of the proposal is to
                 extract statistical descriptors and complexity
                 descriptors of the problem data, and from these
                 estimate the expected performance of a GP classifier.
                 We derive two types of predictive models: linear
                 regression models and symbolic regression models
                 evolved with GP. The experimental results show that
                 both approaches provide good estimates of classifier
                 performance, using synthetic and real-world problems
                 for validation. In conclusion, this paper shows that it
                 is possible to accurately predict the expected
                 performance of a GP classifier using a set of
                 descriptors that characterize the problem data.",
  notes =        "Also known as \cite{2001759}. GECCO-2011: a joint
                 meeting of the twentieth international conference on
                 genetic algorithms (ICGA-2011) and the sixteenth annual
                 genetic programming conference (GP-2011)",
