Trustable symbolic regression models: using ensembles, interval arithmetic and Pareto fronts to develop robust and trust-aware models

Created by W.Langdon from gp-bibliography.bib Revision:1.3973

@InCollection{Kotanchek:2007:GPTP,
  author =       "Mark Kotanchek and Guido Smits and 
                 Ekaterina Vladislavleva",
  title =        "Trustable symbolic regression models: using ensembles,
                 interval arithmetic and Pareto fronts to develop robust
                 and trust-aware models",
  booktitle =    "Genetic Programming Theory and Practice {V}",
  year =         "2007",
  editor =       "Rick L. Riolo and Terence Soule and Bill Worzel",
  series =       "Genetic and Evolutionary Computation",
  chapter =      "12",
  pages =        "201--220",
  address =      "Ann Arbor",
  month =        "17-19" # may,
  publisher =    "Springer",
  keywords =     "genetic algorithms, genetic programming",
  isbn13 =       "978-0-387-76308-8",
  URL =          "http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.457.5272",
  URL =          "http://www.evolved-analytics.com/sites/EA_Documents/Publications/GPTP07/GPTP07_TrustableModels_Preprint.pdf",
  DOI =          "10.1007/978-0-387-76308-8_12",
  size =         "19 pages",
  abstract =     "Trust is a major issue with deploying empirical models
                 in the real world since changes in the underlying
                 system or use of the model in new regions of parameter
                 space can produce (potentially dangerous) incorrect
                 predictions. The trepidation involved with model usage
                 can be mitigated by assembling ensembles of diverse
                 models and using their consensus as a trust metric,
                 since these models will be constrained to agree in the
                 data region used for model development and also
                 constrained to disagree outside that region. The
                 problem is to define an appropriate model complexity
                 (since the ensemble should consist of models of similar
                 complexity), as well as to identify diverse models from
                 the candidate model set. In this chapter we discuss
                 strategies for the development and selection of robust
                 models and model ensembles and demonstrate those
                 strategies against industrial data sets. An important
                 benefit of this approach is that all available data may
                 be used in the model development rather than a
                 partition into training, test and validation subsets.
                 The result is constituent models are more accurate
                 without risk of over-fitting, the ensemble predictions
                 are more accurate and the ensemble predictions have a
                 meaningful trust metric.",
  notes =        "part of \cite{Riolo:2007:GPTP} published 2008",
}
