Evolving Decision Trees for the Categorization of Software

Created by W.Langdon from gp-bibliography.bib Revision:1.4524

  author =       "Jasenko Hosic and Daniel R. Tauritz and 
                 Samuel A. Mulder",
  title =        "Evolving Decision Trees for the Categorization of
                 Software",
  booktitle =    "Proceedings of the 38th IEEE Annual Computers,
                 Software and Applications Conference Workshops
                 (COMPSACW '14)",
  year =         "2014",
  pages =        "337--342",
  address =      "V{\"a}ster{\aa}s, Sweden",
  month =        "21-25 " # jul,
  publisher =    "IEEE",
  keywords =     "genetic programming, program understanding, SBSE,
                 software categorisation, decision trees",
  DOI =          "10.1109/COMPSACW.2014.59",
  size =         "6 pages",
  abstract =     "Current manual techniques of static reverse
                 engineering are inefficient at providing semantic
                 program understanding. We have developed an automated
                 method to categorise applications in order to quickly
                 determine pertinent characteristics. Prior work in this
                 area has had some success, but a major strength of our
                 approach is that it produces heuristics that can be
                 reused for quick analysis of new data. Our method
                 relies on a genetic programming algorithm to evolve
                 decision trees which can be used to categorise
                 software. The terminals, or leaf nodes, within the
                 trees each contain values based on selected features
                 from one of several attributes: system calls, byte
                 n-grams, opcode n-grams, cyclomatic complexity, and
                 bonding. The evolved decision trees are reusable and
                 achieve average accuracies above 95 percent when
                 categorising programs based on compiler origin and
                 versions. Developing new decision trees simply requires
                 more labelled datasets and potentially different
                 feature selection algorithms for other attributes,
                 depending on the data being classified.",
  notes =        "Dept. of Comput. Sci., Missouri Univ. of Sci. \&
                 Technol., Rolla, MO, USA. Also known as \cite{6903152}",
