Better Solutions Faster: Soft Evolution of Robust Regression Models In Pareto genetic programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4420

  author =       "Ekaterina Vladislavleva and Guido Smits and 
                 Mark Kotanchek",
  title =        "Better Solutions Faster: Soft Evolution of Robust
                 Regression Models In Pareto genetic programming",
  booktitle =    "Genetic Programming Theory and Practice {V}",
  year =         "2007",
  editor =       "Rick L. Riolo and Terence Soule and Bill Worzel",
  series =       "Genetic and Evolutionary Computation",
  chapter =      "2",
  pages =        "13--32",
  address =      "Ann Arbor",
  month =        "17-19" # may,
  publisher =    "Springer",
  keywords =     "genetic algorithms, genetic programming",
  isbn13 =       "978-0-387-76308-8",
  DOI =          "doi:10.1007/978-0-387-76308-8_2",
  size =         "19 pages",
  abstract =     "Better solutions faster is the reality of the
                 industrial modelling world, now more than ever.
                 Efficiency requirements, market pressures, and ever
                 changing data force us to use symbolic regression via
                 genetic programming (GP) in a highly automated fashion.
                 This is why we want our GP system to produce simple
                 solutions of the highest possible quality with the
                 lowest computational effort, and a high consistency in
                 the results of independent GP runs. In this chapter, we
                 show that genetic programming with a focus on ranking
                 in combination with goal softening is a very powerful
                 way to improve the efficiency and effectiveness of the
                 evolutionary search. Our strategy consists of partial
                 fitness evaluations of individuals on random subsets of
                 the original data set, with a gradual increase in the
                 subset size in consecutive generations. From a series
                 of experiments performed on three test problems, we
                 observed that those evolutions that started from the
                 smallest subset sizes (10 percent) consistently led to
                 results that are superior in terms of the goodness of
                 fit, consistency between independent runs, and
                 computational effort. Our experience indicates that
                 solutions obtained using this approach are also less
                 complex and more robust against over-fitting. We find
                 that the near-optimal strategy of allocating
                 computational budget over a GP run is to evenly
                 distribute it over all generations. This implies that
                 initially, more individuals can be evaluated using
                 small subset sizes, promoting better exploration.
                 Exploitation becomes more important towards the end of
                 the run, when all individuals are evaluated using the
                 full data set with correspondingly smaller population
  notes =        "part of \cite{Riolo:2007:GPTP} published 2008",
  affiliation =  "Tilburg University, Tilburg, The Netherlands",
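
The strategy summarised in the abstract (fitness evaluation on random data subsets that grow over the generations, with the computational budget spread evenly over the run) can be illustrated with a minimal sketch. It is not the authors' implementation. The linear growth schedule, the definition of budget as population size times subset size, and all function and parameter names below are assumptions made for illustration only.

```python
import random


def subset_schedule(n_generations, n_records, start_fraction=0.10):
    """Subset size per generation, growing from start_fraction to the full set.

    Linear growth is an assumption; the abstract only says 'gradual increase'.
    """
    sizes = []
    for g in range(n_generations):
        t = g / (n_generations - 1) if n_generations > 1 else 1.0
        frac = start_fraction + (1.0 - start_fraction) * t
        sizes.append(min(n_records, max(1, round(frac * n_records))))
    return sizes


def population_schedule(subset_sizes, budget_per_generation):
    """Population size per generation under an even per-generation budget.

    Budget is assumed to be (individuals evaluated) x (records per evaluation),
    so small subsets allow more individuals to be evaluated.
    """
    return [max(1, budget_per_generation // s) for s in subset_sizes]


def draw_subset(data, size, rng):
    """Draw a random subset of the data (without replacement) for one generation."""
    return rng.sample(data, size)


if __name__ == "__main__":
    rng = random.Random(0)
    data = list(range(100))  # stand-in for 100 data records
    sizes = subset_schedule(n_generations=10, n_records=len(data))
    pops = population_schedule(sizes, budget_per_generation=1000)
    for g, (s, p) in enumerate(zip(sizes, pops)):
        subset = draw_subset(data, s, rng)
        print(f"generation {g}: subset size {len(subset)}, population size {p}")
```

Under these assumptions, early generations evaluate many individuals on small subsets (exploration), and the final generation evaluates a smaller population on the full data set (exploitation), which mirrors the trade-off the abstract describes.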
