Automatic repair of real bugs in Java: a large-scale experiment on the Defects4J dataset

Created by W.Langdon from gp-bibliography.bib Revision:1.4524

  author =       "Matias Martinez and Thomas Durieux and 
                 Romain Sommerard and Jifeng Xuan and Martin Monperrus",
  title =        "Automatic repair of real bugs in Java: a large-scale
                 experiment on the Defects4J dataset",
  journal =      "Empirical Software Engineering",
  year =         "2017",
  volume =       "22",
  number =       "4",
  pages =        "1936--1964",
  month =        aug,
  keywords =     "genetic algorithms, genetic programming, Automatic
                 bugfixing, Software repair, Bugs, Defects, Patches,
                 Fixes, GenProg",
  ISSN =         "1573-7616",
  DOI =          "doi:10.1007/s10664-016-9470-4",
  size =         "29 pages",
  abstract =     "Defects4J is a large, peer-reviewed, structured
                 dataset of real-world Java bugs. Each bug in Defects4J
                 comes with a test suite and at least one failing test
                 case that triggers the bug. In this paper, we report on
                 an experiment to explore the effectiveness of automatic
                 test-suite based repair on Defects4J. The result of our
                 experiment shows that the considered state-of-the-art
                 repair methods can generate patches for 47 out of 224
                 bugs. However, those patches are only test-suite
                 adequate, which means that they pass the test suite and
                 may potentially be incorrect beyond the test-suite
                 satisfaction correctness criterion. We have manually
                 analysed 84 different patches to assess their real
                 correctness. In total, 9 real Java bugs can be
                 correctly repaired with test-suite based repair. This
                 analysis shows that test-suite based repair suffers
                 from under-specified bugs, for which trivial or
                 incorrect patches still pass the test suite. With
                 respect to practical applicability, it takes on average
                 14.8 minutes to find a patch. The experiment was done
                 on a scientific grid, totalling 17.6 days of
                 computation time. All the repair systems and
                 experimental results are publicly available on GitHub
                 in order to facilitate future research on automatic
                 repair.",
  notes =        "jGenProg and jKali"

