Evolutionary approach of reward function for reinforcement learning using genetic programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4340

  author =       "Shota Sumino and Atsuko Mutoh and Shohei Kato",
  title =        "Evolutionary approach of reward function for
                 reinforcement learning using genetic programming",
  booktitle =    "International Symposium on Micro-NanoMechatronics and
                 Human Science (MHS 2011)",
  year =         "2011",
  month =        "6-9 " # nov,
  pages =        "385--390",
  size =         "6 pages",
  abstract =     "In recent years, reinforcement learning, which acquires
                 the behaviour of robots, has been drawing attention. A
                 suitable behaviour is autonomously acquired by using
                 this system. Robots learn the suitable behaviour by
                 iterating actions and receiving the evaluated value of
                 each action. The evaluated value is calculated by a
                 reward function. In general reinforcement learning, we
                 acquire a suitable behaviour by setting the suitable
                 reward function for each problem. However, in previous
                 research on reinforcement learning, most reward
                 functions are constructed based on human heuristics.
                 Constructing reward functions requires trial and error,
                 which imposes an enormous burden on humans.
                 Therefore, we propose an approach that automatically
                 generates reward functions using Genetic Programming.
                 In this approach, we create a method for evaluating
                 reward functions. Reward functions are generated by
                 Genetic Programming and are evaluated by this method. A
                 suitable reward function is generated by evolution of
                 these reward functions. In this paper, we conducted an
                 experiment to confirm the effectiveness of the proposed
                 method. In the experiment, we generate a suitable
                 reward function for a route searching problem in a
                 tile-world. Through the experiment, we confirm that the
                 proposed approach can generate a suitable reward
                 function, and that the generated reward function can
                 acquire a more suitable behaviour than a reward
                 function constructed based on human heuristics.",
  keywords =     "genetic algorithms, genetic programming, evolutionary
                 approach, general reinforcement learning, human
                 heuristics, reward function, robot behaviour, route
                 searching problem, trial-and-error, learning systems,
                 mobile robots, search problems",
  DOI =          "doi:10.1109/MHS.2011.6102214",
  ISSN =         "Pending",
  notes =        "Also known as \cite{6102214}",
