The Effects of Randomly Sampled Training Data on Program Evolution

Created by W.Langdon from gp-bibliography.bib Revision:1.4208

  author =       "Brian J. Ross",
  title =        "The Effects of Randomly Sampled Training Data on
                 Program Evolution",
  pages =        "443--450",
  year =         "2000",
  publisher =    "Morgan Kaufmann",
  booktitle =    "Proceedings of the Genetic and Evolutionary
                 Computation Conference (GECCO-2000)",
  editor =       "Darrell Whitley and David Goldberg and
                 Erick Cantu-Paz and Lee Spector and Ian Parmee and
                 Hans-Georg Beyer",
  address =      "Las Vegas, Nevada, USA",
  publisher_address = "San Francisco, CA 94104, USA",
  month =        "10-12 " # jul,
  keywords =     "genetic algorithms, genetic programming, grammar,
                 stochastic regular expressions",
  ISBN =         "1-55860-708-0",
  URL =          "",
  size =         "8 pages",
  abstract =     "The effects of randomly sampled training data on
                 genetic programming performance are empirically
                 investigated. Often the most natural, if not only,
                 means of characterising the target behaviour for a
                 problem is to randomly sample training cases inherent
                 to that problem. A natural question to raise about this
                 strategy is, how deleterious is the random sampling
                 of training data to evolution performance? Will
                 sampling reduce the evolutionary search to hill
                 climbing? Can re-sampling during the run be
                 advantageous? We address these questions by undertaking
                 a suite of different GP experiments. Parameters include
                 various sampling strategies (single, re-sampling, ideal
                 samples), generational and steady-state evolution, and
                 non-evolutionary strategies such as hill climbing and
                 random search. The experiments confirm that random
                 sampling effectively characterizes stochastic domains
                 during genetic programming, provided that a
                 sufficiently representative sample is used. An
                 unexpected result is that genetic programming may
                 perform worse than random search when the sampled
                 training sets are exceptionally poor. We conjecture
                 that poor training sets cause evolution to prematurely
                 converge to undesirable optima, which irrevocably
                 handicaps the population's diversity and viability.",
  notes =        "pop=750/500 (culled), gens=50. p448 'higher quality
                 training evolved better quality solutions'. p448 L1 GP
                 better than hill climbing and random search (not true
                 on L2).

                 A joint meeting of the ninth International Conference
                 on Genetic Algorithms (ICGA-2000) and the fifth Annual
                 Genetic Programming Conference (GP-2000). Part of

                 See also \cite{oai:CiteSeerPSU:250158}",
