Pareto front genetic programming parameter selection based on design of experiments and industrial data

Created by W. Langdon from gp-bibliography.bib Revision:1.4524

  author =       "Flor Castillo and Arthur Kordon and Guido Smits and 
                 Ben Christenson and Dee Dickerson",
  title =        "Pareto front genetic programming parameter selection
                 based on design of experiments and industrial data",
  booktitle =    "{GECCO 2006:} Proceedings of the 8th annual conference
                 on Genetic and evolutionary computation",
  year =         "2006",
  editor =       "Maarten Keijzer and Mike Cattolico and Dirk Arnold and 
                 Vladan Babovic and Christian Blum and Peter Bosman and 
                 Martin V. Butz and Carlos {Coello Coello} and 
                 Dipankar Dasgupta and Sevan G. Ficici and James Foster and 
                 Arturo Hernandez-Aguirre and Greg Hornby and 
                 Hod Lipson and Phil McMinn and Jason Moore and Guenther Raidl and 
                 Franz Rothlauf and Conor Ryan and Dirk Thierens",
  volume =       "2",
  ISBN =         "1-59593-186-4",
  pages =        "1613--1620",
  address =      "Seattle, Washington, USA",
  URL =          "",
  DOI =          "doi:10.1145/1143997.1144264",
  publisher =    "ACM Press",
  publisher_address = "New York, NY, 10286-1405, USA",
  month =        "8-12 " # jul,
  organisation = "ACM SIGEVO (formerly ISGEC)",
  keywords =     "genetic algorithms, genetic programming, Real-World
                 Applications, industrial applications, Pareto front,
                 statistical design of experiments, symbolic
  size =         "8 pages",
  abstract =     "Symbolic regression based on Pareto Front GP is the
                 key approach for generating high-performance
                 parsimonious empirical models acceptable for industrial
                 applications. The paper addresses the issue of finding
                 the optimal parameter settings of Pareto Front GP which
                 direct the simulated evolution toward simple models
                 with acceptable prediction error. A generic methodology
                 based on statistical design of experiments is proposed.
                 It includes statistical determination of the number of
                 replicates by half-width confidence intervals,
                 determination of the significant inputs by fractional
                 factorial design of experiments, approaching the
                 optimum by steepest ascent/descent, and local
                 exploration around the optimum by Box Behnken or by
                 central composite design of experiments. The results
                 from implementing the proposed methodology to a
                 small-sized industrial data set show that the
                 statistically significant factors for symbolic
                 regression, based on Pareto Front GP, are the number of
                 cascades, the number of generations, and the population
                 size. A second-order regression model with high $R^2$ of
                 0.97 includes the three parameters and their optimal
                 values have been defined. The optimal parameter
                 settings were validated with a separate small sized
                 industrial data set. The optimal settings are
                 recommended for symbolic regression applications using
                 data sets with up to 5 inputs and up to 50 data
  notes =        "GECCO-2006 A joint meeting of the fifteenth
                 international conference on genetic algorithms
                 (ICGA-2006) and the eleventh annual genetic programming
                 conference (GP-2006).

                 ACM Order Number 910060",

Genetic Programming entries for Flor A Castillo, Arthur K Kordon, Guido F Smits, Ben Christenson, Dee Dickerson