Evolving meta-ensemble of classifiers for handling incomplete and unbalanced datasets in the cyber security domain

Created by W.Langdon from gp-bibliography.bib Revision:1.3872

@Article{Folino:2016:ASC,
  author =       "G. Folino and F. S. Pisani",
  title =        "Evolving meta-ensemble of classifiers for handling
                 incomplete and unbalanced datasets in the cyber
                 security domain",
  journal =      "Applied Soft Computing",
  volume =       "47",
  pages =        "179--190",
  year =         "2016",
  ISSN =         "1568-4946",
  DOI =          "10.1016/j.asoc.2016.05.044",
  URL =          "http://www.sciencedirect.com/science/article/pii/S156849461630254X",
  abstract =     "Cyber security classification algorithms usually
                 operate with datasets presenting many missing features
                 and strongly unbalanced classes. In order to cope with
                 these issues, we designed a distributed genetic
                 programming (GP) framework, named CAGE-MetaCombiner,
                 which adopts a meta-ensemble model to operate
                 efficiently with missing data. Each ensemble evolves a
                 function for combining the classifiers, which does not
                 need any extra phase of training on the original
                 data. Therefore, in the case of changes in the data,
                 the function can be recomputed in an incremental way,
                 with a moderate computational effort; this aspect
                 together with the advantages of running on
                 parallel/distributed architectures makes the algorithm
                 suitable to operate with the real time constraints
                 typical of a cyber security problem. In addition, an
                 important cyber security problem that concerns the
                 classification of the users or the employers of an
                 e-payment system is illustrated, in order to show the
                 relevance of the case in which entire sources of data
                 or groups of features are missing. Finally, the
                 capacity of the approach in handling groups of missing
                 features and unbalanced datasets is validated on many
                 artificial datasets and on two real datasets, and it is
                 compared with some similar approaches.",
  keywords =     "genetic algorithms, genetic programming, Ensemble,
                 Data mining, Cyber security, Missing features",
}

Genetic Programming entries for Gianluigi Folino, Francesco Sergio Pisani
