Reusing Genetic Programming for Ensemble Selection in Classification of Unbalanced Data

Created by W.Langdon from gp-bibliography.bib Revision:1.4524

  author =       "Urvesh Bhowan and Mark Johnston and Mengjie Zhang and 
                 Xin Yao",
  title =        "Reusing Genetic Programming for Ensemble Selection in
                 Classification of Unbalanced Data",
  journal =      "IEEE Transactions on Evolutionary Computation",
  year =         "2014",
  volume =       "18",
  number =       "6",
  pages =        "893--908",
  month =        dec,
  keywords =     "genetic algorithms, genetic programming",
  ISSN =         "1089-778X",
  DOI =          "doi:10.1109/TEVC.2013.2293393",
  size =         "16 pages",
  abstract =     "Classification algorithms can suffer from performance
                 degradation when the class distribution is unbalanced.
                 This paper develops a two-step approach to evolving
                 ensembles using genetic programming (GP) for unbalanced
                 data. The first step uses multi-objective (MO) GP to
                 evolve a Pareto-approximated front of GP classifiers to
                 form the ensemble by trading-off the minority and the
                 majority class against each other during learning. The
                 MO component alleviates the reliance on sampling to
                 artificially re-balance the data. The second step,
                 which is the focus of this paper, proposes a novel
                 ensemble selection approach using GP to automatically
                 find/choose the best individuals for the ensemble. This
                 new GP approach combines multiple Pareto-approximated
                 front members into a single composite genetic program
                 solution to represent the (optimised) ensemble. This
                 ensemble representation has two main
                 advantages/novelties over traditional genetic algorithm
                 (GA) approaches. Firstly, by limiting the depth of the
                 composite solution trees, we use selection pressure
                 during evolution to find small highly-cooperative
                 groups of individuals for the ensemble. This means that
                 ensemble sizes are not fixed a priori (as in GA), but
                 vary depending on the strength of the base learners.
                 Secondly, we compare different function set operators
                 in the composite solution trees to explore new ways to
                 aggregate the member outputs and thus control how the
                 ensemble computes its output. We show that the proposed
                 GP approach evolves smaller, more diverse ensembles
                 compared to an established ensemble selection
                 algorithm, while still performing as well as, or better
                 than, the established approach. The evolved GP ensembles
                 also perform well compared to other bagging and
                 boosting approaches, particularly on tasks with high
                 levels of class imbalance.",
  notes =        "Also known as \cite{6677603}",
