Evaluation of a Tree-based Pipeline Optimization Tool for Automating Data Science

Created by W. Langdon from gp-bibliography.bib Revision: 1.4524

  author =       "Randal S. Olson and Nathan Bartley and 
                 Ryan J. Urbanowicz and Jason H. Moore",
  title =        "Evaluation of a Tree-based Pipeline Optimization Tool
                 for Automating Data Science",
  booktitle =    "GECCO '16: Proceedings of the 2016 Annual Conference
                 on Genetic and Evolutionary Computation",
  year =         "2016",
  editor =       "Tobias Friedrich and Frank Neumann and 
                 Andrew M. Sutton and Martin Middendorf and Xiaodong Li and 
                 Emma Hart and Mengjie Zhang and Youhei Akimoto and 
                 Peter A. N. Bosman and Terry Soule and Risto Miikkulainen and 
                 Daniele Loiacono and Julian Togelius and 
                 Manuel Lopez-Ibanez and Holger Hoos and Julia Handl and 
                 Faustino Gomez and Carlos M. Fonseca and 
                 Heike Trautmann and Alberto Moraglio and William F. Punch and 
                 Krzysztof Krawiec and Zdenek Vasicek and 
                 Thomas Jansen and Jim Smith and Simone Ludwig and JJ Merelo and 
                 Boris Naujoks and Enrique Alba and Gabriela Ochoa and 
                 Simon Poulding and Dirk Sudholt and Timo Koetzing",
  pages =        "485--492",
  keywords =     "genetic algorithms, genetic programming",
  month =        "20-24 " # jul,
  organisation = "SIGEVO",
  address =      "Denver, USA",
  publisher =    "ACM",
  publisher_address = "New York, NY, USA",
  isbn13 =       "978-1-4503-4206-3",
  DOI =          "10.1145/2908812.2908918",
  abstract =     "As the field of data science continues to grow, there
                 will be an ever-increasing demand for tools that make
                 machine learning accessible to non-experts. In this
                 paper, we introduce the concept of tree-based pipeline
                 optimization for automating one of the most tedious
                 parts of machine learning--pipeline design. We
                 implement an open source Tree-based Pipeline
                 Optimization Tool (TPOT) in Python and demonstrate its
                 effectiveness on a series of simulated and real-world
                 benchmark data sets. In particular, we show that TPOT
                 can design machine learning pipelines that provide a
                 significant improvement over a basic machine learning
                 analysis while requiring little to no input nor prior
                 knowledge from the user. We also address the tendency
                 for TPOT to design overly complex pipelines by
                 integrating Pareto optimization, which produces compact
                 pipelines without sacrificing classification accuracy.
                 As such, this work represents an important step toward
                 fully automating machine learning pipeline design.",
  notes =        "teapot

                 GECCO-2016 A Recombination of the 25th International
                 Conference on Genetic Algorithms (ICGA-2016) and the
                 21st Annual Genetic Programming Conference (GP-2016)",
