Evolving accurate and compact classification rules with gene expression programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4192

  author =       "Chi Zhou and Weimin Xiao and Thomas M. Tirpak and 
                 Peter C. Nelson",
  title =        "Evolving accurate and compact classification rules
                 with gene expression programming",
  journal =      "IEEE Transactions on Evolutionary Computation",
  year =         "2003",
  volume =       "7",
  number =       "6",
  pages =        "519--531",
  month =        dec,
  keywords =     "genetic algorithms, genetic programming,
                 classification rule, data mining, gene expression
                 programming, GEP",
  ISSN =         "1089-778X",
  DOI =          "10.1109/TEVC.2003.819261",
  size =         "13 pages",
  abstract =     "Classification is one of the fundamental tasks of data
                 mining. Most rule induction and decision tree
                 algorithms perform local, greedy search to generate
                 classification rules that are often more complex than
                 necessary. Evolutionary algorithms for pattern
                 classification have recently received increased
                 attention because they can perform global searches. In
                 this paper, we propose a new approach for discovering
                 classification rules by using gene expression
                 programming (GEP), a new technique of genetic
                 programming (GP) with linear representation. The
                 antecedent of discovered rules may involve many
                 different combinations of attributes. To guide the
                 search process, we suggest a fitness function
                 considering both the rule consistency gain and
                 completeness. A multiclass classification problem is
                 formulated as multiple two-class problems by using the
                 one-against-all learning method. The covering strategy
                 is applied to learn multiple rules if applicable for
                 each class. Compact rule sets are subsequently evolved
                 using a two-phase pruning method based on the minimum
                 description length (MDL) principle and the integration
                 theory. Our approach is also noise tolerant and able to
                 deal with both numeric and nominal attributes.
                 Experiments with several benchmark data sets have shown
                 up to 20% improvement in validation accuracy, compared
                 with C4.5 algorithms. Furthermore, the proposed GEP
                 approach is more efficient and tends to generate
                 shorter solutions compared with canonical tree-based GP."

Genetic Programming entries for Chi Zhou, Weimin Xiao, Thomas M. Tirpak, Peter C. Nelson