Protein secondary structure prediction through a novel framework of secondary structure transition sites and new encoding schemes

Created by W.Langdon from gp-bibliography.bib Revision:1.3872

@InProceedings{Zamani:2016:CIBCB,
  author =       "Masood Zamani and Stefan C. Kremer",
  booktitle =    "2016 IEEE Conference on Computational Intelligence in
                 Bioinformatics and Computational Biology (CIBCB)",
  title =        "Protein secondary structure prediction through a novel
                 framework of secondary structure transition sites and
                 new encoding schemes",
  year =         "2016",
  abstract =     "In this paper, we propose an ab initio two-stage
                 protein secondary structure (PSS) prediction model
                 through a novel framework of PSS transition site
                 prediction by using Artificial Neural Networks (ANNs)
                 and Genetic Programming (GP). In the proposed
                 classifier, protein sequences are encoded by new amino
                 acid encoding schemes derived from genetic Codon
                 mappings, Clustering and Information theory. In the
                 first stage, sequence segments are mapped to regions in
                 the Ramachandran map (2D-plot), and weight scores are
                 computed by using statistical information derived from
                 clusters. In addition, score vectors are constructed
                 for the mapped regions using the weight scores and PSS
                 transition sites. The score vectors have fewer
                 dimensions compared to those of commonly used encoding
                 schemes and protein profile. In the second stage, a
                 two-tier classifier is employed based on an ANN and a
                 GP method. The performance of the two-stage classifier
                 is compared to the state-of-the-art cascaded Machine
                 Learning methods which commonly employ ANNs. The
                 prediction method is examined with the latest dataset
                 of non-homologous protein sequences, PISCES [1]. The
                 experimental results and statistical analyses indicate
                 a significantly higher distribution of Q3 scores,
                 approximately 7\% with p-value < 0.001, in
                 comparison to that of cascaded ANN architectures. PSS
                 transition sites are valuable information about the
                 topological property of protein sequences and
                 incorporating the information improves the overall
                 performance of the PSS prediction model.",
  keywords =     "genetic algorithms, genetic programming, ANN, machine
                 learning, amino acids, protein secondary structure,
                 information theory",
  DOI =          "10.1109/CIBCB.2016.7758118",
  month =        oct,
  notes =        "Also known as \cite{7758118}",
}
