On the Impact of Class Imbalance in GP Streaming Classification with Label Budgets

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Sara Khanchi and Malcolm Iain Heywood and 
                 Nur Zincir-Heywood",
  title =        "On the Impact of Class Imbalance in GP Streaming
                 Classification with Label Budgets",
  booktitle =    "EuroGP 2016: Proceedings of the 19th European
                 Conference on Genetic Programming",
  year =         "2016",
  month =        "30 " # mar # "--1 " # apr,
  editor =       "Malcolm I. Heywood and James McDermott and 
                 Mauro Castelli and Ernesto Costa and Kevin Sim",
  series =       "LNCS",
  volume =       "9594",
  publisher =    "Springer Verlag",
  address =      "Porto, Portugal",
  pages =        "35--50",
  organisation = "EvoStar",
  keywords =     "genetic algorithms, genetic programming",
  isbn13 =       "978-3-319-30668-1",
  DOI =          "doi:10.1007/978-3-319-30668-1_3",
  abstract =     "Streaming data scenarios introduce a set of
                 requirements that do not exist under supervised
                 learning paradigms typically employed for
                 classification. Specific examples include anytime
                 operation, non-stationary processes, and limited label
                 budgets. From the perspective of class imbalance, this
                 implies that it is not even possible to guarantee that
                 all classes are present in the samples of data used to
                 construct a model. Moreover, when decisions are made
                 regarding what subset of data to sample, no label
                 information is available. Only after sampling is label
                 information provided. This represents a more
                 challenging task than that encountered under
                 non-streaming (offline) scenarios, where the training
                 partition contains label information. In this work, we
                 investigate the utility of different protocols for
                 sampling from the stream under the above constraints.
                 Adopting a uniform sampling protocol was previously
                 shown to be reasonably effective under both
                 evolutionary and non-evolutionary streaming
                 classifiers. In this work, we introduce a scheme for
                 using the current champion classifier to bias the
                 sampling of training instances \textit{during} the
                 course of the stream. The resulting streaming framework
                 for genetic programming is more effective at sampling
                 minor classes and therefore reacting to changes in the
                 underlying process responsible for generating the data.",
  notes =        "Part of \cite{Heywood:2016:GP} EuroGP'2016 held in
                 conjunction with EvoCOP2016, EvoMusArt2016 and
