Multiple Imputation and Genetic Programming for Classification with Incomplete Data

Created by W.Langdon from gp-bibliography.bib Revision:1.4524

  author =       "Cao Truong Tran and Mengjie Zhang and 
                 Peter Andreae and Bing Xue",
  title =        "Multiple Imputation and Genetic Programming for
                 Classification with Incomplete Data",
  booktitle =    "Proceedings of the Genetic and Evolutionary
                 Computation Conference",
  series =       "GECCO '17",
  year =         "2017",
  isbn13 =       "978-1-4503-4920-8",
  address =      "Berlin, Germany",
  pages =        "521--528",
  size =         "8 pages",
  URL =          "",
  DOI =          "doi:10.1145/3071178.3071181",
  acmid =        "3071181",
  publisher =    "ACM",
  publisher_address = "New York, NY, USA",
  keywords =     "genetic algorithms, genetic programming,
                 classification, incomplete data, missing data, multiple
                 imputation",
  month =        "15-19 " # jul,
  abstract =     "Many industrial and research datasets suffer from an
                 unavoidable issue of missing values. One of the most
                 common approaches to solving classification with
                 incomplete data is to use an imputation method to fill
                 missing values with plausible values before applying
                 classification algorithms. Multiple imputation is a
                 powerful approach to estimating missing values, but it
                 is very expensive to use multiple imputation to
                 estimate missing values for a single instance that
                 needs to be classified. Genetic programming (GP) has
                 been widely used to construct classifiers for complete
                 data, but it seldom has been used for incomplete data.
                 This paper proposes an approach to combining multiple
                 imputation and GP to evolve classifiers for incomplete
                 data. The proposed method uses multiple imputation to
                 provide high-quality training data. It also searches
                 for common patterns of missing values, and uses GP to
                 build a classifier for each pattern of missing values.
                 Therefore, the proposed method generates a set of
                 classifiers that can be used to directly classify any
                 new incomplete instance without requiring imputation.
                 Experimental results show that the proposed method not
                 only can be faster than other common methods for
                 classification with incomplete data but also can
                 achieve better classification accuracy.",
  notes =        "Also known as \cite{Tran:2017:MIG:3071178.3071181}
                 GECCO-2017 A Recombination of the 26th International
                 Conference on Genetic Algorithms (ICGA-2017) and the
                 22nd Annual Genetic Programming Conference (GP-2017)",
