Handling Different Categories of Concept Drifts in Data Streams using Distributed GP

Created by W.Langdon from gp-bibliography.bib Revision:1.4216

  author =       "Gianluigi Folino and Giuseppe Papuzzo",
  title =        "Handling Different Categories of Concept Drifts in
                 Data Streams using Distributed GP",
  booktitle =    "Proceedings of the 13th European Conference on Genetic
                 Programming, EuroGP 2010",
  year =         "2010",
  editor =       "Anna Isabel Esparcia-Alcazar and Aniko Ekart and
                 Sara Silva and Stephen Dignum and A. Sima Uyar",
  volume =       "6021",
  series =       "LNCS",
  pages =        "74--85",
  address =      "Istanbul",
  month =        "7-9 " # apr,
  organisation = "EvoStar",
  publisher =    "Springer",
  keywords =     "genetic algorithms, genetic programming",
  isbn13 =       "978-3-642-12147-0",
  DOI =          "10.1007/978-3-642-12148-7_7",
  abstract =     "Using Genetic Programming (GP) for classifying data
                 streams is problematic as GP is slow compared with
                 traditional single solution techniques. However, the
                 availability of cheaper and better-performing
                 distributed and parallel architectures makes it possible
                 to deal with complex problems that previously could
                 hardly be solved owing to the large amount of time
                 necessary. This work
                 presents a general framework based on a distributed GP
                 ensemble algorithm for coping with different types of
                 concept drift for the task of classification of large
                 data streams. The framework is able to detect changes
                 in a very efficient way using only a detection function
                 based on the incoming unclassified data. Thus, a
                 distributed GP algorithm is run to improve
                 classification accuracy only if a change is detected,
                 which limits the overhead associated with the use of
                 a population-based method. Real-world data streams may
                 present drifts of different types. The introduced
                 detection function, based on the self-similarity
                 fractal dimension, makes it possible to cope in a very
                 short time with the main types of drift, as
                 demonstrated by the first experiments performed on some
                 artificial datasets. Furthermore, having an adequate
                 number of resources, distributed GP can handle very
                 frequent concept drifts.",
  notes =        "BoostCGPC, cellular GP, island model, AdaBoost,
                 Fractal dimension FD3, cloud computing, Minku Part of
                 \cite{Esparcia-Alcazar:2010:GP} EuroGP'2010 held in
                 conjunction with EvoCOP2010 EvoBIO2010 and
