Sensitivity-like Analysis for Feature Selection in Genetic Programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Grant Dick",
  title =        "Sensitivity-like Analysis for Feature Selection in
                 Genetic Programming",
  booktitle =    "Proceedings of the Genetic and Evolutionary
                 Computation Conference",
  series =       "GECCO '17",
  year =         "2017",
  isbn13 =       "978-1-4503-4920-8",
  address =      "Berlin, Germany",
  pages =        "401--408",
  size =         "8 pages",
  URL =          "",
  DOI =          "doi:10.1145/3071178.3071338",
  acmid =        "3071338",
  publisher =    "ACM",
  publisher_address = "New York, NY, USA",
  keywords =     "genetic algorithms, genetic programming, CART, feature
                 selection, random forests, symbolic regression,
                 variable importance",
  month =        "15-19 " # jul,
  abstract =     "Feature selection is an important process within
                 machine learning problems. Through pressures imposed on
                 models during evolution, genetic programming performs
                 basic feature selection, and so analysis of the evolved
                 models can provide some insights into the utility of
                 input features. Previous work has tended towards a
                 presence model of feature selection, where the
                 frequency of a feature appearing within evolved models
                 is a metric for its utility. In this paper, we identify
                 some drawbacks with using this approach, and instead
                 propose the integration of importance measures for
                 feature selection that measure the influence of a
                 feature within a model. Using sensitivity-like analysis
                 methods inspired by importance measures used in random
                 forest regression, we demonstrate that genetic
                 programming introduces many features into evolved
                 models that have little impact on a given model's
                 behaviour, and this can mask the true importance of
                 salient features. The paper concludes by exploring
                 bloat control methods and adaptive terminal selection
                 methods to influence the identification of useful
                 features within the search performed by genetic
                 programming, with results suggesting that a combination
                 of adaptive terminal selection and bloat control may
                 help to improve generalisation performance.",
  notes =        "Also known as \cite{Dick:2017:SAF:3071178.3071338}
                 GECCO-2017 A Recombination of the 26th International
                 Conference on Genetic Algorithms (ICGA-2017) and the
                 22nd Annual Genetic Programming Conference (GP-2017)",

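The abstract contrasts a presence model (counting how often a feature appears in evolved models) with importance measures that capture a feature's influence on model behaviour. The following is a minimal sketch of that contrast, not the paper's procedure: it assumes a permutation-style importance in the spirit of random forest variable importance, and the expression tree `MODEL`, the feature names `x0`..`x2`, and all helper functions are hypothetical names introduced only for illustration.

```python
import random

# Hypothetical evolved model as a nested-tuple expression tree (an assumption
# for illustration, not taken from the paper): 2*x0 + (x1 - x1) + 0*x2
MODEL = ("+", ("+", ("*", 2.0, "x0"), ("-", "x1", "x1")), ("*", 0.0, "x2"))

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}


def evaluate(node, row):
    """Evaluate the expression tree on one data row (dict: feature -> value)."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, row), evaluate(right, row))
    if isinstance(node, str):
        return row[node]  # terminal: input feature
    return node  # terminal: numeric constant


def presence(node, counts=None):
    """Presence model: count how often each feature appears in the tree."""
    counts = {} if counts is None else counts
    if isinstance(node, tuple):
        for child in node[1:]:
            presence(child, counts)
    elif isinstance(node, str):
        counts[node] = counts.get(node, 0) + 1
    return counts


def mse(model, rows, targets):
    """Mean squared error of the model over a data set."""
    return sum((evaluate(model, r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, features, rng):
    """Increase in MSE when one feature's values are shuffled across rows."""
    base = mse(model, rows, targets)
    scores = {}
    for f in features:
        shuffled = [r[f] for r in rows]
        rng.shuffle(shuffled)
        permuted = [dict(r, **{f: v}) for r, v in zip(rows, shuffled)]
        scores[f] = mse(model, permuted, targets) - base
    return scores


rng = random.Random(0)
FEATURES = ["x0", "x1", "x2"]
rows = [{f: rng.uniform(-1, 1) for f in FEATURES} for _ in range(200)]
targets = [2.0 * r["x0"] for r in rows]  # synthetic target: only x0 matters

counts = presence(MODEL)
importance = permutation_importance(MODEL, rows, targets, FEATURES, rng)
print("presence:  ", counts)
print("importance:", importance)
```

In this toy model, `x1` is the most frequently occurring feature, yet shuffling it leaves the error unchanged because `x1 - x1` cancels. Only `x0` shows a positive importance score, which is the kind of masking effect the abstract attributes to presence-based measures.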