A General Feature Engineering Wrapper for Machine Learning Using epsilon-Lexicase Survival

Created by W.Langdon from gp-bibliography.bib Revision:1.4524

  author =       "William {La Cava} and Jason Moore",
  title =        "A General Feature Engineering Wrapper for Machine
                 Learning Using epsilon-Lexicase Survival",
  booktitle =    "EuroGP 2017: Proceedings of the 20th European
                 Conference on Genetic Programming",
  year =         "2017",
  month =        "19-21 " # apr,
  editor =       "Mauro Castelli and James McDermott and 
                 Lukas Sekanina",
  series =       "LNCS",
  volume =       "10196",
  publisher =    "Springer Verlag",
  address =      "Amsterdam",
  pages =        "80--95",
  organisation = "species",
  keywords =     "genetic algorithms, genetic programming",
  DOI =          "doi:10.1007/978-3-319-55696-3_6",
  abstract =     "We propose a general wrapper for feature learning that
                 interfaces with other machine learning methods to
                 compose effective data representations. The proposed
                 feature engineering wrapper (FEW) uses genetic
                 programming to represent and evolve individual features
                 tailored to the machine learning method with which it
                 is paired. In order to maintain feature diversity,
                 epsilon-lexicase survival is introduced, a method based on
                 epsilon-lexicase selection. This survival method
                 preserves semantically unique individuals in the
                 population based on their ability to solve difficult
                 subsets of training cases, thereby yielding a
                 population of uncorrelated features. We demonstrate FEW
                 with five different off-the-shelf machine learning
                 methods and test it on a set of real-world and
                 synthetic regression problems with dimensions varying
                 across three orders of magnitude. The results show that
                 FEW is able to improve model test predictions across
                 problems for several ML methods. We discuss and test
                 the scalability of FEW in comparison to other feature
                 composition strategies, most notably polynomial feature
  notes =        "Part of \cite{Castelli:2017:GP} EuroGP'2017 held
                 in conjunction with EvoCOP2017, EvoMusArt2017 and

Genetic Programming entries for William La Cava, Jason H Moore
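
The abstract describes a survival step that keeps individuals according to how well they handle difficult subsets of training cases. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a precomputed matrix errors[i][t] holding the error of individual i on training case t, and it assumes epsilon is set per case to the median absolute deviation of the population's errors on that case. The function name, the error matrix layout, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
import random
import statistics


def epsilon_lexicase_survival(errors, n_survivors, rng=None):
    """Pick n_survivors row indices from errors (lower error is better).

    errors[i][t] is the error of individual i on training case t.
    This layout is an assumption made for this sketch.
    """
    rng = rng or random.Random()
    n_cases = len(errors[0])

    # Assumed epsilon: median absolute deviation of errors on each case.
    eps = []
    for t in range(n_cases):
        column = [row[t] for row in errors]
        med = statistics.median(column)
        eps.append(statistics.median([abs(e - med) for e in column]))

    remaining = list(range(len(errors)))
    survivors = []
    while remaining and len(survivors) < n_survivors:
        pool = remaining[:]
        cases = list(range(n_cases))
        rng.shuffle(cases)  # a fresh random case ordering per survivor
        for t in cases:
            if len(pool) == 1:
                break
            best = min(errors[i][t] for i in pool)
            # Keep only individuals within epsilon of the best on this case.
            pool = [i for i in pool if errors[i][t] <= best + eps[t]]
        winner = rng.choice(pool)  # ties broken at random (an assumption)
        survivors.append(winner)
        remaining.remove(winner)  # survival draws without replacement
    return survivors


# Individuals 0 and 1 each solve a different case; individual 2 solves neither.
errors = [[0.0, 5.0], [5.0, 0.0], [5.0, 5.0]]
print(sorted(epsilon_lexicase_survival(errors, 2, random.Random(0))))
```

Removing each winner from the pool before the next draw is what makes this a survival step rather than parent selection: the same individual cannot fill two slots, so specialists on different cases are kept side by side.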