Using Feature Clustering for GP-Based Feature Construction on High-Dimensional Data

Created by W.Langdon from gp-bibliography.bib Revision:1.4496

  author =       "Binh Tran and Bing Xue and Mengjie Zhang",
  title =        "Using Feature Clustering for GP-Based Feature
                 Construction on High-Dimensional Data",
  booktitle =    "EuroGP 2017: Proceedings of the 20th European
                 Conference on Genetic Programming",
  year =         "2017",
  month =        "19-21 " # apr,
  editor =       "Mauro Castelli and James McDermott and 
                 Lukas Sekanina",
  series =       "LNCS",
  volume =       "10196",
  publisher =    "Springer Verlag",
  address =      "Amsterdam",
  pages =        "210--226",
  organisation = "species",
  keywords =     "genetic algorithms, genetic programming",
  DOI =          "doi:10.1007/978-3-319-55696-3_14",
  abstract =     "Feature construction is a pre-processing technique to
                 create new features with better discriminating ability
                 from the original features. Genetic programming (GP)
                 has been shown to be a prominent technique for this
                 task. However, applying GP to high-dimensional data is
                 still challenging due to the large search space.
                 Feature clustering groups similar features into
                 clusters, which can be used for dimensionality
                 reduction by choosing representative features from each
                 cluster to form the feature subset. Feature clustering
                 has been shown promising in feature selection; but has
                 not been investigated in feature construction for
                 classification. This paper presents the first work of
                 using feature clustering in this area. We propose a
                 cluster-based GP feature construction method called
                 CGPFC which uses feature clustering to improve the
                 performance of GP for feature construction on
                 high-dimensional data. Results on eight
                 high-dimensional datasets with varying difficulties
                 show that the CGPFC constructed features perform better
                 than the original full feature set and features
                 constructed by the standard GP constructor based on the
                 whole feature set.",
  notes =        "Part of \cite{Castelli:2017:GP} EuroGP'2017 held
                 in conjunction with EvoCOP2017, EvoMusArt2017 and
