Tapped Delay Lines for GP Streaming Data Classification with Label Budgets

Created by W.Langdon from gp-bibliography.bib Revision:1.4524

  author =       "Ali Vahdat and Jillian Morgan and 
                 Andrew R. McIntyre and Malcolm I. Heywood and A. Nur Zincir-Heywood",
  title =        "Tapped Delay Lines for {GP} Streaming Data
                 Classification with Label Budgets",
  booktitle =    "18th European Conference on Genetic Programming",
  year =         "2015",
  editor =       "Penousal Machado and Malcolm I. Heywood and 
                 James McDermott and Mauro Castelli and 
                 Pablo Garcia-Sanchez and Paolo Burelli and Sebastian Risi and Kevin Sim",
  series =       "LNCS",
  volume =       "9025",
  publisher =    "Springer",
  pages =        "126--138",
  address =      "Copenhagen",
  month =        "8-10 " # apr,
  organisation = "EvoStar",
  keywords =     "genetic algorithms, genetic programming, Streaming
                 data classification, Non-stationary, Class imbalance",
  isbn13 =       "978-3-319-16500-4",
  DOI =          "doi:10.1007/978-3-319-16501-1_11",
  abstract =     "Streaming data classification requires that a model be
                 available for classifying stream content while
                 simultaneously detecting and reacting to changes to the
                 underlying process generating the data. Given that only
                 a fraction of the stream is visible at any point in
                 time (i.e. some form of window interface) then it is
                 difficult to place any guarantee on a classifier
                 encountering a well mixed distribution of classes
                 across the stream. Moreover, streaming data classifiers
                 are also required to operate under a limited label
                 budget (labelling all the data is too expensive). We
                 take these requirements to motivate the use of an
                 active learning strategy for decoupling genetic
                 programming training epochs from stream throughput. The
                 content of a data subset is controlled by a combination
                 of Pareto archiving and stochastic sampling. In
                 addition, a significant benefit is attributed to
                 support for a tapped delay line (TDL) interface to the
                 stream, but this also increases the dimensionality of
                 the task. We demonstrate that the benefits of assuming
                 the TDL can be maintained through the use of
                 oversampling without recourse to additional label
                 information. Benchmarking on 4 datasets demonstrates
                 that the approach is particularly effective when
                 reacting to shifts in the underlying properties of the
                 stream. Moreover, an online formulation for class-wise
                 detection rate is assumed, where this is able to
                 robustly characterise classifier performance throughout
                 the stream.",
  notes =        "Part of \cite{Machado:2015:GP} EuroGP'2015 held in
                 conjunction with EvoCOP2015, EvoMusArt2015 and
