Evolving decision trees using oracle guides

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Ulf Johansson and Lars Niklasson",
  title =        "Evolving decision trees using oracle guides",
  booktitle =    "IEEE Symposium on Computational Intelligence and Data
                 Mining, CIDM '09",
  year =         "2009",
  month =        "30 " # mar # "-2 " # apr,
  pages =        "238--244",
  keywords =     "genetic algorithms, genetic programming, data mining,
                 decision trees, high-accuracy techniques, human
                 inspection, neural network ensemble, opaque models,
                 oracle guides, predictive models, rule extraction,
                 transparent models",
  DOI =          "doi:10.1109/CIDM.2009.4938655",
  abstract =     "Some data mining problems require predictive models to
                 be not only accurate but also comprehensible.
                 Comprehensibility enables human inspection and
                 understanding of the model, making it possible to trace
                 why individual predictions are made. Since most
                 high-accuracy techniques produce opaque models,
                 accuracy is, in practice, regularly sacrificed for
                 comprehensibility. One frequently studied technique,
                 often able to reduce this accuracy vs.
                 comprehensibility tradeoff, is rule extraction, i.e.,
                 the activity where another, transparent, model is
                 generated from the opaque model. In this paper, it is argued
                 that techniques producing transparent models, either
                 directly from the dataset, or from an opaque model,
                 could benefit from using an oracle guide. In the
                 experiments, genetic programming is used to evolve
                 decision trees, and a neural network ensemble is used
                 as the oracle guide. More specifically, the datasets
                 used by the genetic programming when evolving the
                 decision trees consist of several different
                 combinations of the original training data and 'oracle
                 data', i.e., training or test data instances, together
                 with corresponding predictions from the oracle. In
                 total, seven different ways of combining regular
                 training data with oracle data were evaluated, and the
                 results, obtained on 26 UCI datasets, clearly show that
                 the use of an oracle guide improved the performance. As
                 a matter of fact, trees evolved using training data
                 only had the worst test set accuracy of all setups
                 evaluated. Furthermore, statistical tests show that two
                 setups, both using the oracle guide, produced
                 significantly more accurate trees, compared to the
                 setup using training data only.",
  notes =        "Also known as \cite{4938655}",
