Performance of Genetic Programming Optimised Bowtie2 on Genome Comparison and Analytic Testing (GCAT) Benchmarks

Created by W.Langdon from gp-bibliography.bib Revision:1.3872

@Article{Langdon:2014:BDM,
  author =       "W. B. Langdon",
  title =        "Performance of Genetic Programming Optimised {Bowtie2}
                 on Genome Comparison and Analytic Testing (GCAT)
                 Benchmarks",
  journal =      "BioData Mining",
  year =         "2015",
  volume =       "8",
  number =       "1",
  month =        "8 " # jan,
  keywords =     "genetic algorithms, genetic programming, genetic
                 improvement, GI, double-ended DNA sequence, high
                 throughput Solexa 454 nextgen NGS sequence query,
                 Bowtie2GP, rapid fuzzy string matching, Homo sapiens
                 genome reference consortium HG19, Next-generation DNA
                 sequencing",
  URL =          "http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/Langdon_2014_BDM.pdf",
  DOI =          "10.1186/s13040-014-0034-0",
  size =         "8 pages",
  abstract =     "Background:

                 Genetic studies are increasingly based on short, noisy
                 sequences from next generation scanners. Typically complete DNA
                 sequences are assembled by matching short NextGen
                 sequences against reference genomes. Despite
                 considerable algorithmic gains since the turn of the
                 millennium, matching both single-ended and paired-end
                 strings to a reference remains computationally
                 demanding. Further, tailoring bioinformatics tools to
                 each new task or scanner remains highly skilled and
                 labour intensive. With this in mind, we recently
                 demonstrated a genetic programming based automated
                 technique which generated a version of the
                 state-of-the-art alignment tool Bowtie2 which was
                 considerably faster on short sequences produced by a
                 scanner at the Broad Institute and released as part of
                 The Thousand Genome Project.

                 Results:

                 Bowtie2GP and the original Bowtie2 release were
                 compared on bioplanet's GCAT synthetic benchmarks.
                 Bowtie2GP enhancements were also applied to the latest
                 Bowtie2 release (2.2.3, 29 May 2014) and retained both
                 the GP and the manually introduced
                 improvements.

                 Conclusions:

                 On both single-ended and paired-end synthetic next
                 generation DNA sequence GCAT benchmarks Bowtie2GP runs
                 up to 45\% faster than Bowtie2. The loss in
                 accuracy can be as little as 0.2--0.5\% but up to
                 2.5\% for longer sequences.",
  notes =        "GISMO

                 PMID: 25621011 [PubMed]",
}
