Data based prediction of cancer diagnoses using heterogeneous model ensembles: a case study for breast cancer, melanoma, and cancer in the respiratory system

Created by W.Langdon from gp-bibliography.bib Revision:1.3872

@InProceedings{Winkler:2014:GECCOcomp,
  author =       "Stephan M. Winkler and Michael Affenzeller and 
                 Susanne Schaller and Herbert Stekel",
  title =        "Data based prediction of cancer diagnoses using
                 heterogeneous model ensembles: a case study for breast
                 cancer, melanoma, and cancer in the respiratory
                 system",
  booktitle =    "GECCO 2014 Workshop on Medical Applications of Genetic
                 and Evolutionary Computation (MedGEC)",
  year =         "2014",
  editor =       "Stephen L. Smith and Stefano Cagnoni and 
                 Robert M. Patton",
  isbn13 =       "978-1-4503-2881-4",
  keywords =     "genetic algorithms, genetic programming",
  pages =        "1337--1344",
  month =        "12-16 " # jul,
  organisation = "SIGEVO",
  address =      "Vancouver, BC, Canada",
  URL =          "http://doi.acm.org/10.1145/2598394.2609853",
  DOI =          "10.1145/2598394.2609853",
  publisher =    "ACM",
  publisher_address = "New York, NY, USA",
  abstract =     "In this paper we discuss heterogeneous estimation
                 model ensembles for cancer diagnoses produced using
                 various machine learning algorithms. Based on patients'
                 data records including standard blood parameters,
                 tumour markers, and information about the diagnosis of
                 tumours, the goal is to identify mathematical models for
                 estimating cancer diagnoses. Several machine learning
                 approaches implemented in HeuristicLab and WEKA have
                 been applied for identifying estimators for selected
                 cancer diagnoses: k-nearest neighbour learning,
                 decision trees, artificial neural networks, support
                 vector machines, random forests, and genetic
                 programming. The models produced using these methods
                 have been combined into heterogeneous model ensembles.
                 All models trained during the learning phase are
                 applied during the test phase; the final classification
                 is annotated with a confidence value that specifies how
                 reliable the models are regarding the presented
                 decision: We calculate the final estimation for each
                 sample via majority voting, and the relative ratio of a
                 sample's majority vote is used for calculating the
                 confidence in the final estimation. We use a confidence
                 threshold that specifies the minimum confidence level
                 that has to be reached; if this threshold is not
                 reached for a sample, then there is no prediction for
                 that specific sample.

                 As we show in the results section, the accuracies of
                 diagnoses of breast cancer, melanoma, and respiratory
                 system cancer can thus be increased significantly. We see
                 that increasing the confidence threshold leads to
                 higher classification accuracies, bearing in mind that
                 the ratio of samples for which there is a
                 classification statement is significantly decreased.",
  notes =        "Also known as \cite{2609853}. Distributed at
                 GECCO-2014.",
}
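
The voting scheme described in the abstract (majority vote over all ensemble members, confidence taken as the relative share of the majority vote, no prediction below a confidence threshold) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name ensemble_vote, its parameters, and the tie handling (the label seen first among tied labels wins) are assumptions made for this sketch and are not taken from the paper, HeuristicLab, or WEKA.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple


def ensemble_vote(
    votes: Sequence[Hashable], threshold: float
) -> Tuple[Optional[Hashable], float]:
    """Majority vote with a confidence threshold (hypothetical helper).

    votes:     one predicted class label per ensemble member
    threshold: minimum confidence required to emit a prediction

    Returns (label, confidence). The label is None when the confidence
    is below the threshold, i.e. there is no prediction for the sample.
    """
    if not votes:
        raise ValueError("at least one model vote is required")

    # Most frequent label and how many models voted for it.
    # Assumption: on a tie, the label encountered first wins.
    label, count = Counter(votes).most_common(1)[0]

    # Confidence = relative share of the majority vote.
    confidence = count / len(votes)

    if confidence < threshold:
        return None, confidence
    return label, confidence
```

For example, with six models voting [1, 1, 1, 0, 0, 1] and a threshold of 0.6, the sample is classified as 1 with confidence 4/6. Raising the threshold makes the ensemble abstain on more samples, which matches the trade-off the abstract reports between accuracy and the share of classified samples.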
