General Program Synthesis from Examples Using Genetic Programming with Parent Selection Based on Random Lexicographic Orderings of Test Cases

Created by W.Langdon from gp-bibliography.bib Revision:1.4340

  author =       "Thomas M. Helmuth",
  title =        "General Program Synthesis from Examples Using Genetic
                 Programming with Parent Selection Based on Random
                 Lexicographic Orderings of Test Cases",
  school =       "College of Information and Computer Sciences,
                 University of Massachusetts Amherst",
  year =         "2015",
  address =      "USA",
  month =        sep,
  keywords =     "genetic algorithms, genetic programming, lexicase",
  URL =          "",
  URL =          "",
  size =         "159 pages",
  abstract =     "Software developers routinely create tests before
                 writing code, to ensure that their programs fulfill
                 their requirements. Instead of having human programmers
                 write the code to meet these tests, automatic program
                 synthesis systems can create programs to meet
                 specifications without human intervention, only
                 requiring examples of desired behavior. In the
                 long-term, we envision using genetic programming to
                 synthesize large pieces of software. This dissertation
                 takes steps toward this goal by investigating the
                 ability of genetic programming to solve introductory
                 computer science programming problems. We present a
                 suite of 29 benchmark problems intended to test general
                 program synthesis systems, which we systematically
                 selected from sources of introductory computer science
                 programming problems. This suite is suitable for
                 experiments with any program synthesis system driven by
                 input/output examples. Unlike existing benchmarks that
                 concentrate on constrained problem domains such as list
                 manipulation, symbolic regression, or Boolean
                 functions, this suite contains general programming
                 problems that require a range of programming
                 constructs, such as multiple data types and data
                 structures, control flow statements, and I/O. The
                 problems encompass a range of difficulties and
                 requirements as necessary to thoroughly assess the
                 capabilities of a program synthesis system. Besides
                 describing the specifications for each problem, we make
                 recommendations for experimental protocols and
                 statistical methods to use with the problems. This
                 dissertation's second contribution is an investigation
                 of behaviour-based parent selection in genetic
                 programming, concentrating on a new method called
                 lexicase selection. Most parent selection techniques
                 aggregate errors from test cases to compute a single
                 scalar fitness value; lexicase selection instead treats
                 test cases separately, never comparing error values of
                 different test cases. This property allows it to select
                 parents that specialise on some test cases even if they
                 perform poorly on others. We compare lexicase selection
                 to other parent selection techniques on our benchmark
                 suite, showing better performance for lexicase
                 selection. After observing that lexicase selection
                 increases exploration of the search space while also
                 increasing exploitation of promising programs, we
                 conduct a range of experiments to identify which
                 characteristics of lexicase selection influence its
  notes =        "UM-CS-Phd-2015-005

                 Supervised by Lee Spector

                 GP discussion list 27 Sep 2015:

Genetic Programming entries for Thomas Helmuth