Classification of Gene Expression Data with Genetic Programming

  author =       "Joseph A. Driscoll and Bill Worzel and 
                 Duncan MacLean",
  title =        "Classification of Gene Expression Data with Genetic
                 Programming",
  booktitle =    "Genetic Programming Theory and Practice",
  publisher =    "Kluwer",
  year =         "2003",
  editor =       "Rick L. Riolo and Bill Worzel",
  chapter =      "3",
  pages =        "25--42",
  keywords =     "genetic algorithms, genetic programming,
                 classification, molecular diagnostics",
  ISBN =         "1-4020-7581-2",
  URL =          "",
  DOI =          "10.1007/978-1-4419-8983-3_3",
  abstract =     "This paper summarises the use of a genetic programming
                 (GP) system to develop classification rules for gene
                 expression data that hold promise for the development
                 of new molecular diagnostics. This work focuses on
                 discovering simple, accurate rules that diagnose
                 diseases based on changes of gene expression profiles
                 within a diseased cell. GP is shown to be a useful
                 technique for discovering classification rules in a
                 supervised learning mode where the biological genotype
                 is paired with a biological phenotype such as a disease
                 state. In the process of developing these rules it is
                 necessary to develop new techniques for establishing
                 fitness and interpreting the results of evolutionary
                 runs because of the large number of independent
                 variables and the comparatively small number of
                 samples. These techniques are described and issues of
                 overfitting caused by small sample sizes and the
                 behaviour of the GP system when variables are missing
                 from the samples are discussed.",
  notes =        "Part of \cite{RioloWorzel:2003}",
