Discovering biological motifs with genetic programming

  abstract =     "Choosing the right representation for a problem is
                 important. In this article we introduce a linear
                 genetic programming approach for motif discovery in
                 protein families, and we also present a thorough
                 comparison between our approach and Koza-style genetic
                 programming using ADFs.

                 In a study of 45 protein families, we demonstrate that
                 our algorithm, given equal processing resources and no
                 prior knowledge used to shape the datasets, consistently
                 generates motifs of significantly better quality than
                 those we found using a tree representation. For
                 several of the studied protein
                 families we evolve motifs comparable to those found in
                 Prosite, a manually curated database of protein

                 Our linear genome gave better results than Koza-style
                 genetic programming for 37 of 45 families. The
                 difference is statistically significant for 24 of the
                 families at the 99 percent confidence level.",
