Feature Selection and Molecular Classification of Cancer Using Genetic Programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4192

  author =       "Jianjun Yu and Jindan Yu and Arpit A. Almal and 
                 Saravana M. Dhanasekaran and Debashis Ghosh and 
                 William P. Worzel and Arul M. Chinnaiyan",
  title =        "Feature Selection and Molecular Classification of
                 Cancer Using Genetic Programming",
  journal =      "Neoplasia",
  year =         "2007",
  volume =       "9",
  number =       "4",
  pages =        "292--303",
  month =        apr,
  keywords =     "genetic algorithms, genetic programming, Molecular
                 diagnostics, biomarkers, prostate cancer, evolutionary
                 algorithm, microarray profiling",
  DOI =          "doi:10.1593/neo.07121",
  size =         "15 pages",
  abstract =     "Despite important advances in microarray-based
                 molecular classification of tumours, its application in
                 clinical settings remains formidable. This is in part
                 due to the limitation of current analysis programs in
                 discovering robust biomarkers and developing
                 classifiers with a practical set of genes. Genetic
                 programming (GP) is a type of machine learning
                 technique that uses an evolutionary algorithm to simulate
                 natural selection as well as population dynamics, hence
                 leading to simple and comprehensible classifiers. Here
                 we applied GP to cancer expression profiling data to
                 select feature genes and build molecular classifiers by
                 mathematical integration of these genes. Analysis of
                 thousands of GP classifiers generated for a prostate
                 cancer data set revealed repetitive use of a set of
                 highly discriminative feature genes, many of which are
                 known to be disease associated. GP classifiers often
                 comprise five or fewer genes and successfully predict
                 cancer types and subtypes. More importantly, GP
                 classifiers generated in one study are able to predict
                 samples from an independent study, which may have used
                 different microarray platforms. In addition, GP yielded
                 classification accuracy better than or similar to
                 conventional classification methods. Furthermore, the
                 mathematical expression of GP classifiers provides
                 insights into relationships between classifier genes.
                 Taken together, our results demonstrate that GP may be
                 valuable for generating effective classifiers
                 containing a practical set of genes for
                 diagnostic/prognostic cancer classification.",
  notes =        "PMID: 17460773 [PubMed - indexed for MEDLINE]

                 ONCOMINE data sets. Fitness based on AUROC. Max node
                 count 8. 12 demes. z score >= 40. ROC",

Genetic Programming entries for Jianjun Yu, Jindan Yu, Arpit A Almal, Saravana M Dhanasekaran, Debashis Ghosh, William P Worzel, Arul M Chinnaiyan