Induction of Decision Trees via Evolutionary Programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4202

  author =       "Robert Kirk DeLisle and Steven L. Dixon",
  title =        "Induction of Decision Trees via Evolutionary Programming",
  journal =      "Journal of Chemical Information and Modeling",
  year =         "2004",
  volume =       "44",
  number =       "3",
  pages =        "862--870",
  keywords =     "genetic algorithms, genetic programming, EP, EPTree",
  DOI =          "doi:10.1021/ci034188s",
  abstract =     "Decision trees have been used extensively in
                 cheminformatics for modelling various biochemical
                 endpoints including receptor-ligand binding, ADME
                 properties, environmental impact, and toxicity. The
                 traditional approach to inducing decision trees based
                 upon a given training set of data involves recursive
                 partitioning which selects partitioning variables and
                 their values in a greedy manner to optimise a given
                 measure of purity. This methodology has numerous
                 benefits including classifier interpretability and the
                 capability of modeling nonlinear relationships. The
                 greedy nature of induction, however, may fail to
                 elucidate underlying relationships between the data and
                 endpoints. Using evolutionary programming, decision
                 trees are induced which are significantly more accurate
                 than trees induced by recursive partitioning.
                 Furthermore, when assessed on previously unseen data in
                 a 10-fold cross-validated manner, evolutionary
                 programming induced trees exhibit a significantly
                 higher accuracy on previously unseen data. This
                 methodology is compared to single-tree and
                 multiple-tree recursive partitioning in two domains
                 (aerobic biodegradability and hepatotoxicity) and shown
                 to produce less complex classifiers with average
                 increases in predictive accuracy of 5-10\% over the
                 traditional method.",
  notes =        "American Chemical Society, ACS Publications Division.
                 Department of Molecular Modeling, Pharmacopeia, P.O.
                 Box 5350, Princeton, New Jersey 08543-5350, and
                 Schrodinger, 120 West 45th Street, 32nd Floor, New
                 York, New York",

