Evolving Regular Expression-based Sequence Classifiers for Protein Nuclear Localisation

Created by W.Langdon from gp-bibliography.bib Revision:1.3872

  author =       "Amine Heddad and Markus Brameier and 
                 Robert M. MacCallum",
  title =        "Evolving Regular Expression-based Sequence Classifiers
                 for Protein Nuclear Localisation",
  booktitle =    "Applications of Evolutionary Computing,
                 EvoWorkshops2004: {EvoBIO}, {EvoCOMNET}, {EvoHOT},
                 {EvoIASP}, {EvoMUSART}, {EvoSTOC}",
  year =         "2004",
  month =        "5-7 " # apr,
  editor =       "Guenther R. Raidl and Stefano Cagnoni and 
                 Jurgen Branke and David W. Corne and Rolf Drechsler and 
                 Yaochu Jin and Colin R. Johnson and Penousal Machado and 
                 Elena Marchiori and Franz Rothlauf and George D. Smith and 
                 Giovanni Squillero",
  series =       "LNCS",
  volume =       "3005",
  address =      "Coimbra, Portugal",
  publisher =    "Springer-Verlag",
  publisher_address = "Berlin",
  pages =        "31--40",
  keywords =     "genetic algorithms, genetic programming, evolutionary
                 computation, perl, grammar, BNF, linear GP, LGP, RE,
                 regular expressions",
  ISBN =         "3-540-21378-3",
  URL =          "http://www.sbc.su.se/~maccallr/publications/heddad-evobio2004.pdf",
  DOI =          "10.1007/978-3-540-24653-4_4",
  abstract =     "A number of bioinformatics tools use regular
                 expression (RE) matching to locate protein or DNA
                 sequence motifs that have been discovered by
                 researchers in the laboratory. For example, patterns
                 representing nuclear localisation signals (NLSs) are
                 used to predict nuclear localisation. NLSs are not yet
                 well understood, and so the set of currently known NLSs
                 may be incomplete. Here we use genetic programming (GP)
                 to generate RE-based classifiers for nuclear
                 localisation. While the approach is a supervised one
                 (with respect to protein location), it is unsupervised
                 with respect to already known NLSs. It therefore has
                 the potential to discover new NLS motifs. We apply both
                 tree based and linear GP to the problem. The inclusion
                 of predicted secondary structure in the input does not
                 improve performance. Benchmarking shows that our
                 majority classifiers are competitive with existing
                 tools. The evolved REs are usually {"}NLS like{"} and
                 work is underway to analyse these for novelty.",
  notes =        "EvoWorkshops2004, perlGP, grammar (not needed, cf
                 p39?). http://www.sbc.su.se/~maccallr/nucpred/ perl
                 eval(), grammar, stgp, matches(), pdiv, plog, multiple
                 classifier combination majority vote. 'No crossover is
                 allowed between REs' p38. Removing ineffective code.
                 'LGP very close to PerlGP' p38. RE matching done in C.
                 cf. \cite{brameier:nucpred}",
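
The abstract and notes mention RE-based classifiers combined by majority vote. As a rough illustration of that idea only, the Python sketch below lets several regular expressions vote on a protein sequence. The three patterns are hand-written, hypothetical stand-ins for evolved REs, and the names PATTERNS and predict_nuclear are invented here. The paper's classifiers were evolved with PerlGP and linear GP, so this is not the authors' implementation.

```python
import re

# Hypothetical, hand-written patterns standing in for evolved REs.
# They loosely resemble basic (K/R-rich) nuclear localisation signals
# and are NOT taken from the paper.
PATTERNS = [
    r"K[KR].[KR]",         # K, then K or R, any residue, then K or R
    r"[KR]{4}",            # run of four basic residues
    r"P.{0,2}[KR]{3}",     # proline followed closely by three basic residues
]


def predict_nuclear(sequence, patterns=PATTERNS):
    """Return True if a strict majority of the patterns match the sequence."""
    votes = sum(1 for pattern in patterns if re.search(pattern, sequence))
    return votes * 2 > len(patterns)


if __name__ == "__main__":
    # Toy sequences: the first embeds the well-known SV40 motif PKKKRKV.
    for seq in ("MDKAELIPEPPKKKRKVELGT", "MAGSLTVDEQ"):
        print(seq, predict_nuclear(seq))
```

Using an odd number of voters avoids ties; with an even number, the strict-majority test above treats a tie as a negative prediction.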
