Derivation of context-free stochastic L-Grammar rules for promoter sequence modeling using Support Vector Machine

Created by W.Langdon from gp-bibliography.bib Revision:1.4549

  author =       "Robertas Damasevicius",
  title =        "Derivation of context-free stochastic {L}-Grammar
                 rules for promoter sequence modeling using Support
                 Vector Machine",
  booktitle =    "XI-th Joint International Scientific Events on
                 Informatics, Book 2, Advanced Research in Artificial
  year =         "2008",
  editor =       "K. Markov and {K. Ivanova} and I. Mitov",
  series =       "Information Science and Computing",
  pages =        "98--104",
  address =      "Varna, Bulgaria",
  publisher_address = "Sofia, Bulgaria",
  month =        "23 " # jun # " - 03 " # jul,
  publisher =    "Ithea",
  keywords =     "genetic algorithms, genetic programming, pattern
                 recognition, J, 3 life and medical sciences",
  annote =       "The Pennsylvania State University CiteSeerX Archives",
  bibsource =    "OAI-PMH server at",
  language =     "en",
  oai =          "oai:CiteSeerX.psu:",
  rights =       "Metadata may be used without restrictions as long as
                 the oai identifier remains attached to it.",
  URL =          "",
  size =         "7 pages",
  abstract =     "Formal grammars can be used for describing complex
                 repeatable structures such as DNA sequences. In this
                 paper, we describe the structural composition of DNA
                 sequences using a context-free stochastic L-grammar.
                 L-grammars are a special class of parallel grammars
                 that can model the growth of living organisms, e.g.
                 plant development, and model the morphology of a
                 variety of organisms. We believe that parallel grammars
                 can also be used for modelling genetic mechanisms and
                 sequences such as promoters. Promoters are short
                 regulatory DNA sequences located upstream of a gene.
                 Detection of promoters in DNA sequences is important
                 for successful gene prediction. Promoters can be
                 recognised by certain patterns that are conserved
                 within a species, but there are many exceptions, which
                 makes promoter recognition a complex problem. We
                 replace the problem of promoter recognition by
                 induction of context-free stochastic L-grammar rules,
                 which are later used for the structural analysis of
                 promoter sequences. L-grammar rules are derived
                 automatically from the Drosophila and vertebrate
                 promoter datasets using a genetic programming technique
                 and their fitness is evaluated using a Support Vector
                 Machine (SVM) classifier. The artificial promoter
                 sequences generated using the derived L-grammar rules
                 are analysed and compared with natural promoter
  notes =        "",

Genetic Programming entries for Robertas Damasevicius