Multiple feature construction in classification on high-dimensional data using GP

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Binh Tran and Mengjie Zhang and Bing Xue",
  booktitle =    "2016 IEEE Symposium Series on Computational
                 Intelligence (SSCI)",
  title =        "Multiple feature construction in classification on
                 high-dimensional data using GP",
  year =         "2016",
  abstract =     "Feature construction and feature selection are common
                 pre-processing techniques to obtain smaller but better
                 discriminating feature sets than the original ones.
                 These two techniques are essential in high-dimensional
                 data with thousands or tens of thousands of features
                 where there may exist many irrelevant and redundant
                 features. Genetic programming (GP) is a powerful
                 technique that has shown promising results in feature
                 construction and feature selection. However,
                 constructing multiple features for high-dimensional
                 data is still challenging due to its large search
                 space. In this paper, we propose a GP-based method that
                 simultaneously performs multiple feature construction
                 and feature selection to automatically transform
                 high-dimensional datasets into much smaller ones.
                 Experimental results on six datasets show that the size
                 of the generated feature set is less than 4 percent of
                 the original feature set size and it significantly
                 improves the performance of K-Nearest Neighbour, Naive
                 Bayes and Decision Tree algorithms on 15 out of 18
                 comparisons. Compared with the single feature
                 construction method using GP, the proposed method has
                 better performance in half of the cases and similar
                 performance in the other half. Comparisons between the
                 constructed features, the selected features and the
                 combination of both constructed and selected features
                 by the proposed method reveal different preferences of
                 the three learning algorithms on these feature sets.",
  keywords =     "genetic algorithms, genetic programming",
  DOI =          "10.1109/SSCI.2016.7850130",
  month =        dec,
  notes =        "Also known as \cite{7850130}",
