Emergent Solutions to High-Dimensional Multi-Task Reinforcement Learning

Created by W.Langdon from gp-bibliography.bib Revision:1.4420

@Article{Kelly:2018:EC,
  author =       "Stephen Kelly and Malcolm I. Heywood",
  title =        "Emergent Solutions to High-Dimensional Multi-Task
                 Reinforcement Learning",
  journal =      "Evolutionary Computation",
  year =         "2018",
  volume =       "26",
  number =       "3",
  pages =        "347--380",
  month =        "Fall",
  keywords =     "genetic algorithms, genetic programming, Emergent
                 modularity, cooperative coevolution, reinforcement
                 learning, multi-task learning",
  ISSN =         "1063-6560",
  URL =          "http://www.human-competitive.org/sites/default/files/kelly-paper.pdf",
  URL =          "https://www.mitpressjournals.org/doi/pdf/10.1162/evco_a_00232",
  size =         "33 pages",
  abstract =     "Algorithms that learn through environmental
                 interaction and delayed rewards, or reinforcement
                 learning, increasingly face the challenge of scaling to
                 dynamic, high-dimensional, and partially observable
                 environments. Significant attention is being paid to
                 frameworks from deep learning, which scale to
                 high-dimensional data by decomposing the task through
                 multi-layered neural networks. While effective, the
                 representation is complex and computationally
                 demanding. In this work we propose a framework based on
                 Genetic Programming which adaptively complexifies
                 policies through interaction with the task. We make a
                 direct comparison with several deep reinforcement
                 learning frameworks in the challenging Atari video game
                 environment as well as more traditional reinforcement
                 learning frameworks based on a priori engineered
                 features. Results indicate that the proposed approach
                 matches the quality of deep learning while being a
                 minimum of three orders of magnitude simpler with
                 respect to model complexity. This results in real-time
                 operation of the champion RL agent without recourse to
                 specialized hardware support. Moreover, the approach is
                 capable of evolving solutions to multiple game titles
                 simultaneously with no additional computational cost.
                 In this case, agent behaviours for an individual game
                 as well as single agents capable of playing all games
                 emerge from the same evolutionary run.",
  notes =        "Silver Winner 2018 HUMIES

                 Extended Paper Invited from GECCO 2017",
}
