Knowledge acquisition from many-attribute data by genetic programming with clustered terminal symbols

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Akira Hara and Haruko Tanaka and Takumi Ichimura and 
                 Tetsuyuki Takahama",
  title =        "Knowledge acquisition from many-attribute data by
                 genetic programming with clustered terminal symbols",
  journal =      "International Journal of Knowledge and Web
                 Intelligence",
  year =         "2012",
  volume =       "3",
  number =       "2",
  pages =        "180--201",
  keywords =     "genetic algorithms, genetic programming, knowledge
                 acquisition, rule extraction, molecule classification,
                 data attributes, clustering, terminal symbols, soft
                 computing, similarities, molecules, page rank learning,
                 information retrieval",
  ISSN =         "1755-8255",
  DOI =          "doi:10.1504/IJKWI.2012.050286",
  bibdate =      "2012-11-30",
  bibsource =    "DBLP,
  abstract =     "Rule extraction from database by soft computing
                 methods is important for knowledge acquisition. For
                 example, knowledge from the web pages can be useful for
                 information retrieval. When genetic programming (GP) is
                 applied to rule extraction from a database, the
                 attributes of data are often used for the terminal
                 symbols. However, the real databases have a large
                 number of attributes. Therefore, the size of the
                 terminal set increases and the search space becomes
                 vast. For improving the search performance, we propose
                 new methods for dealing with the large-scale terminal
                 set. In the methods, the terminal symbols are clustered
                 based on the similarities of the attributes. In the
                 beginning of search, by using the clusters for
                 terminals instead of original attributes, the number of
                 terminal symbols can be reduced. Therefore, the search
                 space can be reduced. In the latter stage of search, by
                 using the original attributes for terminal symbols, the
                 local search is performed. We applied our proposed
                 methods to two many-attribute datasets, the
                 classification of molecules as a benchmark problem and
                 the page rank learning for information retrieval. By
                 comparison with the conventional GP, the proposed
                 methods showed the faster evolutionary speed and
                 extracted more accurate rules",
