Searching for discrimination rules in protease proteolytic cleavage activity using genetic programming with a min-max scoring function

Created by W.Langdon from gp-bibliography.bib Revision:1.3872

@Article{ZhengRongYang:2003:BS,
  author =       "Zheng Rong Yang and Rebecca Thomson and 
                 T. Charles Hodgman and Jonathan Dry and Austin K. Doyle and 
                 Ajit Narayanan and XiKun Wu",
  title =        "Searching for discrimination rules in protease
                 proteolytic cleavage activity using genetic programming
                 with a min-max scoring function",
  journal =      "Biosystems",
  year =         "2003",
  volume =       "72",
  number =       "1-2",
  pages =        "159--176",
  month =        nov,
  keywords =     "genetic algorithms, genetic programming, Amino acid
                 similarity matrix, The reverse Polish notation,
                 Proteolytic cleavage analysis",
  URL =          "http://www.sciencedirect.com/science/article/B6T2K-49N9DN6-2/2/0d63ebb7904ac33ae0d20ce4f6477a57",
  DOI =          "10.1016/S0303-2647(03)00141-2",
  abstract =     "We present an algorithm which is able to extract
                 discriminant rules from oligopeptides for protease
                 proteolytic cleavage activity prediction. The algorithm
                 is developed using genetic programming. Three
                 important components in the algorithm are a min-max
                 scoring function, the reverse Polish notation (RPN) and
                 the use of minimum description length. The min-max
                 scoring function is developed using amino acid
                 similarity matrices for measuring the similarity
                 between an oligopeptide and a rule, which is a complex
                 algebraic equation of amino acids rather than a simple
                 pattern sequence. The Fisher ratio is then calculated
                 on the scoring values using the class label associated
                 with the oligopeptides. The discriminant ability of
                 each rule can therefore be evaluated. The use of RPN
                 makes the evolutionary operations simpler and therefore
                 reduces the computational cost. To prevent overfitting,
                 the concept of minimum description length is used to
                 penalize over-complicated rules. A fitness function is
                 therefore composed of the Fisher ratio and the use of
                 minimum description length for an efficient
                 evolutionary process. In the application to four
                 protease datasets (Trypsin, Factor Xa, Hepatitis C
                 Virus and HIV protease cleavage site prediction), our
                 algorithm is superior to C5, a conventional method for
                 deriving decision trees.",
}

