Exploring Interestingness in a Computational Evolution System for the Genome-Wide Genetic Analysis of Alzheimer's Disease

Created by W.Langdon from gp-bibliography.bib Revision:1.4496

  author =       "Jason H. Moore and Douglas P. Hill and 
                 Andrew Saykin and Li Shen",
  title =        "Exploring Interestingness in a Computational Evolution
                 System for the Genome-Wide Genetic Analysis of
                 Alzheimer's Disease",
  booktitle =    "Genetic Programming Theory and Practice XI",
  year =         "2013",
  series =       "Genetic and Evolutionary Computation",
  editor =       "Rick Riolo and Jason H. Moore and Mark Kotanchek",
  publisher =    "Springer",
  chapter =      "2",
  pages =        "31--45",
  address =      "Ann Arbor, USA",
  month =        "9-11 " # may,
  keywords =     "genetic algorithms, genetic programming, Computational
                 evolution; Genetic epidemiology; Epistasis; Gene-gene",
  isbn13 =       "978-1-4939-0374-0",
  DOI =          "10.1007/978-1-4939-0375-7_2",
  abstract =     "Susceptibility to Alzheimer's disease is likely due to
                 complex interaction among many genetic and
                 environmental factors. Identifying complex genetic
                 effects in large data sets will require computational
                 methods that extend beyond what parametric statistical
                 methods such as logistic regression can provide. We
                 have previously introduced a computational evolution
                 system (CES) that uses genetic programming (GP) to
                 represent genetic models of disease and to search for
                 optimal models in a rugged fitness landscape that is
                 effectively infinite in size. The CES approach differs
                 from other GP approaches in that it is able to learn
                 how to solve the problem by generating its own
                 operators. A key feature is the ability for the
                 operators to use expert knowledge to guide the
                 stochastic search. We have previously shown that CES is
                 able to discover nonlinear genetic models of disease
                 susceptibility in both simulated and real data. The
                 goal of the present study was to introduce a measure of
                 interestingness into the modelling process. Here, we
                 define interestingness as a measure of non-additive
                 gene-gene interactions. That is, we are more interested
                 in those CES models that include attributes that
                 exhibit synergistic effects on disease risk. To
                 implement this new feature we first pre-processed the
                 data to measure all pairwise gene-gene interaction
                 effects using entropy-based methods. We then provided
                 these pre-computed measures to CES as expert knowledge
                 and as one of three fitness criteria in
                 three-dimensional Pareto optimisation. We applied this
                 new CES algorithm to an Alzheimer's disease data set
                 with approximately 520,000 genetic attributes. We show
                 that this approach discovers more interesting models
                 with the added benefit of improving classification
                 accuracy. This study demonstrates the applicability of
                 CES to genome-wide genetic analysis using expert
                 knowledge derived from measures of interestingness.",
  notes =        "http://cscs.umich.edu/gptp-workshops/

                 Part of \cite{Riolo:2013:GPTP} published after the
                 workshop in 2013",
