Applying genetic programming to the prediction of alternative mRNA splice variants

  author =       "Ivana Vukusic and Sushma Nagaraja Grellscheid and 
                 Thomas Wiehe",
  title =        "Applying genetic programming to the prediction of
                 alternative mRNA splice variants",
  journal =      "Genomics",
  year =         "2007",
  volume =       "89",
  number =       "4",
  pages =        "471--479",
  month =        apr,
  keywords =     "genetic algorithms, genetic programming, Alternative
                 splicing, Cassette exon, Intron retention, Feature
                 matrix, Splice signals",
  DOI =          "10.1016/j.ygeno.2007.01.001",
  abstract =     "Genetic programming (GP) can be used to classify a
                 given gene sequence as either constitutively or
                 alternatively spliced. We describe the principles of GP
                 and apply it to a well-defined data set of
                 alternatively spliced genes. A feature matrix of
                 sequence properties, such as nucleotide composition or
                 exon length, was passed to the GP system Discipulus. To
                 test its performance we concentrated on cassette exons
                 (SCE) and retained introns (SIR). We analysed 27,519
                 constitutively spliced and 9,641 cassette exons
                 including their neighbouring introns; in addition, we
                 analysed 33,316 constitutively spliced introns compared
                 to 2,712 retained introns. We find that the classifier
                 yields highly accurate predictions on the SIR data with
                 a sensitivity of 92.1\% and a specificity of
                 79.2\%. Prediction accuracies on the SCE data are
                 lower, 47.3\% (sensitivity) and 70.9\%
                 (specificity), indicating that alternative splicing of
                 introns can be better captured by sequence properties
                 than that of exons.",
  notes =        "PMID: 17276654 [PubMed - indexed for MEDLINE]",

