New Representations in Genetic Programming for Feature Construction in k-Means Clustering

Created by W.Langdon from gp-bibliography.bib Revision:1.4504

  author =       "Andrew Lensen and Bing Xue and Mengjie Zhang",
  title =        "New Representations in Genetic Programming for Feature
                 Construction in k-Means Clustering",
  booktitle =    "Proceedings of the 11th International Conference on
                 Simulated Evolution and Learning, SEAL-2017",
  year =         "2017",
  editor =       "Yuhui Shi and Kay Chen Tan and Mengjie Zhang and 
                 Ke Tang and Xiaodong Li and Qingfu Zhang and Ying Tan and 
                 Martin Middendorf and Yaochu Jin",
  volume =       "10593",
  series =       "Lecture Notes in Computer Science",
  pages =        "543--555",
  address =      "Shenzhen, China",
  month =        nov # " 10-13",
  publisher =    "Springer",
  keywords =     "genetic algorithms, genetic programming, Cluster
                 analysis, Feature construction, k-means, Evolutionary
  isbn13 =       "978-3-319-68759-9",
  URL =          "",
  DOI =          "doi:10.1007/978-3-319-68759-9_44",
  abstract =     "k-means is one of the fundamental and most well-known
                 algorithms in data mining. It has been widely used in
                 clustering tasks, but suffers from a number of
                 limitations on large or complex datasets. Genetic
                 Programming (GP) has been used to improve performance
                 of data mining algorithms by performing feature
                 construction, the process of combining multiple
                 attributes (features) of a dataset together to produce
                 more powerful constructed features. In this paper, we
                 propose novel representations for using GP to perform
                 feature construction to improve the clustering
                 performance of the k-means algorithm. Our experiments
                 show significant performance improvement compared to
                 k-means across a variety of difficult datasets. Several
                 GP programs are also analysed to provide insight into
                 how feature construction is able to improve clustering

Genetic Programming entries for Andrew Lensen, Bing Xue, Mengjie Zhang