A multi-stage protein secondary structure prediction system using machine learning and information theory

Created by W.Langdon from gp-bibliography.bib Revision:1.4208

  author =       "Masood Zamani and Stefan C. Kremer",
  booktitle =    "2015 IEEE International Conference on Bioinformatics
                 and Biomedicine (BIBM)",
  title =        "A multi-stage protein secondary structure prediction
                 system using machine learning and information theory",
  year =         "2015",
  pages =        "1304--1309",
  abstract =     "In this paper, we evaluated the performance of a
                 multi-stage protein secondary structure (PSS)
                 prediction model. The proposed classifier uses
                 statistical information and protein profiles. The
                 statistical information is derived from protein
                 sequences and structures by using a k-means clustering
                 technique and information theory. In the first stage, a
                 feed-forward artificial neural network maps a sequence
                 fragment to a region in the Ramachandran plot
                 (2D-plot). A score vector is constructed with the
                 mapped region using clustering and statistical
                 information. The score vector represents the tendency
                 of pairing an identified region in the 2D-plot and
                 secondary structures for a residue. The score vectors,
                 which are used in the second stage, have fewer
                 dimensions than the input vectors that are commonly
                 derived from protein sequences or profile information.
                 In the second stage, a two-tier classifier is employed
                 based on an artificial neural network and a genetic
                 programming (GP) method. The GP method uses IF rules
                 for a three-state classification. The two-tier
                 classifier's performance is compared to those of
                 two-tier artificial neural networks (ANNs) and support
                 vector machines (SVMs). The prediction method is
                 examined with a common protein dataset, RS126. The
                 performance of the proposed classification model is
                 measured based on Q3 and segment overlap (SOV) scores.
                 The proposed PSS prediction model improves the Q3 score
                 by over 3 percent and the SOV score by 2 percent in
                 comparison to those of two-tier ANNs and SVMs.",
  keywords =     "genetic algorithms, genetic programming",
  DOI =          "10.1109/BIBM.2015.7359867",
  month =        nov,
  notes =        "Sch. of Comput. Sci., Univ. of Guelph, Guelph, ON,

                 Also known as \cite{7359867}",
