Hardware-accelerated analysis of non-protein-coding RNAs

Created by W.Langdon from gp-bibliography.bib Revision:1.4340

  author =       "Ola {Snove, Jr.}",
  title =        "Hardware-accelerated analysis of non-protein-coding
                 RNAs",
  school =       "Faculty of Information Technology, Mathematics and
                 Electrical Engineering, Department of Computer and
                 Information Science, Norwegian University of Science
                 and Technology",
  year =         "2005",
  address =      "Norway",
  keywords =     "genetic algorithms, genetic programming, MIMD, MISD,
  URL =          "http://www.idi.ntnu.no/grupper/su/publ/phd/drphil-snove-rev-10jun06.pdf",
  URL =          "http://hdl.handle.net/11250/249826",
  size =         "207 pages",
  abstract =     "A tremendous amount of genomic sequence data of
                 relatively high quality has become publicly available
                 due to the human genome sequencing projects that were
                 completed a few years ago. Despite considerable
                 efforts, we do not yet know everything there is to know
                 about the various parts of the genome, what all the
                 regions code for, and how their gene products
                 contribute to the myriad biological processes that
                 are performed within the cells. New high-performance
                 methods are needed to extract knowledge from this vast
                 amount of information.

                 Furthermore, the traditional view that DNA codes for
                 RNA that codes for protein, which is known as the
                 central dogma of molecular biology, seems to be only
                 part of the story. The discovery of many
                 non-protein-coding gene families with housekeeping and
                 regulatory functions brings an entirely new perspective
                 to molecular biology. Also, sequence analysis of the
                 new gene families requires new methods, as there are
                 significant differences between protein-coding and
                 non-protein-coding genes.

                 This work describes a new search processor that can
                 search for complex patterns in sequence data for which
                 no efficient lookup-index is known. When several chips
                 are mounted on search cards that are fitted into PCs in
                 a small cluster configuration, the system's performance
                 is orders of magnitude higher than that of comparable
                 solutions for selected applications. The applications
                 treated in this work fall into two main categories,
                 namely pattern screening and data mining, and both take
                 advantage of the search capacity of the cluster to
                 achieve adequate performance. Specifically, the thesis
                 describes an interactive system for exploration of all
                 types of genomic sequence data. Moreover, a genetic
                 programming-based data mining system finds classifiers
                 that consist of potentially complex patterns that are
                 characteristic of groups of sequences. The screening
                 and mining capacity has been used to develop an
                 algorithm for identification of new non-protein-coding
                 genes in bacteria; a system for rational design of
                 effective and specific short interfering RNA for
                 sequence-specific silencing of protein-coding genes;
                 and an improved algorithmic step for identification of
                 new regulatory targets for the microRNA family of
                 non-protein-coding genes.",
  notes =        "http://www.interagon.com/demo/",
