Evolutionary identification of cancer predictors using clustered data: a case study for breast cancer, melanoma, and cancer in the respiratory system

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Stephan M. Winkler and Michael Affenzeller and 
                 Herbert Stekel",
  title =        "Evolutionary identification of cancer predictors using
                 clustered data: a case study for breast cancer,
                 melanoma, and cancer in the respiratory system",
  booktitle =    "GECCO '13 Companion: Proceeding of the fifteenth
                 annual conference companion on Genetic and evolutionary
                 computation conference companion",
  year =         "2013",
  editor =       "Christian Blum and Enrique Alba and 
                 Thomas Bartz-Beielstein and Daniele Loiacono and 
                 Francisco Luna and Joern Mehnen and Gabriela Ochoa and 
                 Mike Preuss and Emilia Tantar and Leonardo Vanneschi and 
                 Kent McClymont and Ed Keedwell and Emma Hart and 
                 Kevin Sim and Steven Gustafson and 
                 Ekaterina Vladislavleva and Anne Auger and Bernd Bischl and Dimo Brockhoff and 
                 Nikolaus Hansen and Olaf Mersmann and Petr Posik and 
                 Heike Trautmann and Muhammad Iqbal and Kamran Shafi and 
                 Ryan Urbanowicz and Stefan Wagner and 
                 Michael Affenzeller and David Walker and Richard Everson and 
                 Jonathan Fieldsend and Forrest Stonedahl and 
                 William Rand and Stephen L. Smith and Stefano Cagnoni and 
                 Robert M. Patton and Gisele L. Pappa and 
                 John Woodward and Jerry Swan and Krzysztof Krawiec and 
                 Alexandru-Adrian Tantar and Peter A. N. Bosman and 
                 Miguel Vega-Rodriguez and Jose M. Chaves-Gonzalez and 
                 David L. Gonzalez-Alvarez and 
                 Sergio Santander-Jimenez and Lee Spector and Maarten Keijzer and 
                 Kenneth Holladay and Tea Tusar and Boris Naujoks",
  isbn13 =       "978-1-4503-1964-5",
  keywords =     "genetic algorithms, genetic programming",
  pages =        "1463--1470",
  month =        "6-10 " # jul,
  organisation = "SIGEVO",
  address =      "Amsterdam, The Netherlands",
  DOI =          "doi:10.1145/2464576.2466809",
  publisher =    "ACM",
  publisher_address = "New York, NY, USA",
  abstract =     "In this paper we discuss the effects of using
                 pre-clustered data on the identification of estimation
                 models for cancer diagnoses. Based on patients' data
                 records including standard blood parameters, tumour
                 markers, and information about the diagnosis of tumours,
                 the goal is to identify mathematical models for
                 estimating cancer diagnoses. We have applied a hybrid
                 clustering and classification approach that first
                 identifies data clusters (using standard patient data
                 and tumour markers) and then learns prediction models on
                 the basis of these data clusters. In the empirical
                 section we analyse the clusters of patient data samples
                 formed using k-means clustering: The optimal number of
                 clusters is identified, and we investigate the
                 homogeneity of these clusters.

                 Several evolutionary modelling approaches implemented
                 in HeuristicLab have been applied for subsequently
                 identifying estimators for selected cancer diagnoses:
                 Linear regression, k-nearest neighbour learning,
                 artificial neural networks, and support vector machines
                 (all optimised using evolutionary algorithms) as well
                 as genetic programming. As we show in the results
                 section, the investigated diagnoses of breast cancer,
                 melanoma, and respiratory system cancer can be
                 estimated correctly in up to 84.2 percent, 80.3 percent,
                 and 94.1 percent of the analysed test cases,
                 respectively; without tumour markers, up to 78.2 percent,
                 78 percent, and 93.3 percent of the test samples are
                 correctly estimated, respectively.",
  notes =        "Also known as \cite{2466809} Distributed at

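
The hybrid approach summarised in the abstract (k-means clustering first, then one prediction model per cluster) can be illustrated with a minimal Python sketch. This is not the authors' HeuristicLab implementation. The function names, the farthest-first centroid initialisation, the plain k-nearest-neighbour vote used as the per-cluster model, and the toy data are all assumptions made for illustration; the paper's evolutionary optimisation of the models is not reproduced here.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iters=100):
    """Lloyd's k-means with deterministic farthest-first initialisation (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the previous centroid if a cluster ends up empty.
            updated.append(
                [sum(col) / len(members) for col in zip(*members)] if members else centroids[c]
            )
        if updated == centroids:  # assignments are stable, so stop
            break
        centroids = updated
    return centroids


def fit(samples, targets, k):
    """Step 1: cluster the samples. Step 2: keep a separate training set per cluster."""
    centroids = kmeans(samples, k)
    per_cluster = {c: [] for c in range(k)}
    for x, y in zip(samples, targets):
        per_cluster[nearest(x, centroids)].append((x, y))
    return centroids, per_cluster


def predict(model, x, n_neighbours=3):
    """Route x to its cluster, then take a k-NN majority vote within that cluster only."""
    centroids, per_cluster = model
    members = per_cluster[nearest(x, centroids)]  # assumed non-empty in this sketch
    closest = sorted(members, key=lambda m: sq_dist(x, m[0]))[:n_neighbours]
    return Counter(y for _, y in closest).most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups, each with its own decision rule.
samples = [(-1.0, 0.0), (-0.5, 0.5), (0.5, -0.5), (1.0, 0.0),
           (10.0, 9.0), (10.5, 9.5), (9.5, 10.5), (10.0, 11.0)]
targets = [0, 0, 1, 1, 1, 1, 0, 0]

model = fit(samples, targets, k=2)
queries = [(0.8, 0.2), (-0.8, -0.2), (10.2, 9.2), (9.8, 10.8)]
predictions = [predict(model, q) for q in queries]
print(predictions)  # [1, 0, 1, 0]
```

Each query is matched only against training samples from its own cluster. That is the point of the pre-clustering step: a different model can be learned for each homogeneous group of records. Any of the other model types named in the abstract could replace the k-NN vote inside `predict`.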