Schema Theory Based Data Engineering in Gene Expression Programming for Big Data Analytics

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Zhengwen Huang and Maozhen Li and 
                 Christos Chousidis and Ali Mousavi and Changjun Jiang",
  title =        "Schema Theory Based Data Engineering in Gene
                 Expression Programming for Big Data Analytics",
  journal =      "IEEE Transactions on Evolutionary Computation",
  year =         "2018",
  volume =       "22",
  number =       "5",
  pages =        "792--804",
  month =        oct,
  keywords =     "genetic algorithms, genetic programming, gene
                 expression programming, data engineering, big data
                 analytics, parallelization and segmentation",
  ISSN =         "1089-778X",
  URL =          "",
  DOI =          "10.1109/TEVC.2017.2771445",
  size =         "14 pages",
  abstract =     "Gene expression programming (GEP) is a data-driven
                 evolutionary technique that is well suited to correlation
                 mining. Parallel GEPs are proposed to speed up the
                 evolution process using a cluster of computers or a
                 computer with multiple CPU cores. However, the
                 generation structure of chromosomes and the size of
                 input data are two issues that tend to be neglected
                 when speeding up GEP in evolution. To fill the research
                 gap, this paper proposes three guiding principles to
                 elaborate the computation nature of GEP in evolution
                 based on an analysis of GEP schema theory. As a result,
                 a novel data engineered GEP is developed which follows
                 closely the generation structure of chromosomes in
                 parallelization and considers the input data size in
                 segmentation. Experimental results on two data sets
                 with complementary features show that the data
                 engineered GEP speeds up the evolution process
                 significantly without loss of accuracy in data
                 correlation mining. Based on the experimental tests, a
                 computation model of the data engineered GEP is further
                 developed to demonstrate its high scalability in
                 dealing with potential big data using a large number of
                 CPU cores.",
  notes =        "also known as \cite{8187687}",
