% Efficient interleaved sampling of training data in genetic programming
% Created by W.Langdon from gp-bibliography.bib Revision:1.4221

@InProceedings{2598480,
  author =       "R. Muhammad Atif Azad and David Medernach and 
                 Conor Ryan",
  title =        "Efficient interleaved sampling of training data in
                 genetic programming",
  booktitle =    "GECCO Comp '14: Proceedings of the 2014 conference
                 companion on Genetic and evolutionary computation",
  year =         "2014",
  editor =       "Christian Igel and Dirk V. Arnold and 
                 Christian Gagne and Elena Popovici and Anne Auger and 
                 Jaume Bacardit and Dimo Brockhoff and Stefano Cagnoni and 
                 Kalyanmoy Deb and Benjamin Doerr and James Foster and 
                 Tobias Glasmachers and Emma Hart and Malcolm I. Heywood and 
                 Hitoshi Iba and Christian Jacob and Thomas Jansen and 
                 Yaochu Jin and Marouane Kessentini and 
                 Joshua D. Knowles and William B. Langdon and Pedro Larranaga and 
                 Sean Luke and Gabriel Luque and John A. W. McCall and 
                 Marco A. {Montes de Oca} and Alison Motsinger-Reif and 
                 Yew Soon Ong and Michael Palmer and 
                 Konstantinos E. Parsopoulos and Guenther Raidl and Sebastian Risi and 
                 Guenther Ruhe and Tom Schaul and Thomas Schmickl and 
                 Bernhard Sendhoff and Kenneth O. Stanley and 
                 Thomas Stuetzle and Dirk Thierens and Julian Togelius and 
                 Carsten Witt and Christine Zarges",
  isbn13 =       "978-1-4503-2881-4",
  keywords =     "genetic algorithms, genetic programming: Poster",
  pages =        "127--128",
  month =        "12-16 " # jul,
  organisation = "SIGEVO",
  address =      "Vancouver, BC, Canada",
  URL =          "http://doi.acm.org/10.1145/2598394.2598480",
  DOI =          "10.1145/2598394.2598480",
  publisher =    "ACM",
  publisher_address = "New York, NY, USA",
  abstract =     "The ability to generalise beyond the training set is
                 important for Genetic Programming (GP). Interleaved
                 Sampling is a recently proposed approach to improve
                 generalisation in GP. In this technique, GP alternates
                 between using the entire data set and only a single
                 data point. Initial results showed that the technique
                 not only produces solutions that generalise well, but
                 also does so at a reduced computational cost, since
                 half of the generations evaluate only a single data
                 point.

                 This paper further investigates the merit of
                 interleaving the use of the training set with two
                 alternative approaches. These are: using random
                 search instead of a single data point, and simply
                 minimising the tree size. Both of these alternatives
                 are computationally even cheaper than the original
                 setup as they simply do not invoke the fitness function
                 half the time. We test the utility of these new methods
                 on four well-cited, high-dimensional problems from
                 the symbolic regression domain.

                 The results show that the new approaches continue to
                 produce general solutions despite taking only half the
                 fitness evaluations. Size minimisation also prevents
                 bloat while producing competitive results on both
                 training and test data sets. The tree sizes with size
                 minimisation are substantially smaller than those of
                 the other setups, which further brings down the
                 training cost.",
  notes =        "Also known as \cite{2598480} Distributed at
                 GECCO-2014",
}