Normalized Compression Distance of Multisets with Applications

Created by W.Langdon from gp-bibliography.bib Revision:1.4504

  author =       "Andrew R. Cohen and Paul M. B. Vitanyi",
  title =        "Normalized Compression Distance of Multisets with
                 Applications",
  journal =      "IEEE Transactions on Pattern Analysis and Machine
                 Intelligence",
  year =         "2015",
  volume =       "37",
  number =       "8",
  pages =        "1602--1614",
  month =        aug,
  keywords =     "genetic algorithms, genetic programming",
  ISSN =         "0162-8828",
  DOI =          "doi:10.1109/TPAMI.2014.2375175",
  abstract =     "Pairwise normalized compression distance (NCD) is a
                 parameter-free, feature-free, alignment-free,
                 similarity metric based on compression. We propose an
                 NCD of multisets that is also metric. Previously,
                 attempts to obtain such an NCD failed. For
                 classification purposes it is superior to the pairwise
                 NCD in accuracy and implementation complexity. We cover
                 the entire trajectory from theoretical underpinning to
                 feasible practice. It is applied to biological (stem
                 cell, organelle transport) and OCR classification
                 questions that were earlier treated with the pairwise
                 NCD. With the new method we achieved significantly
                 better results. The theoretic foundation is Kolmogorov
                 complexity.",
  notes =        "Also known as \cite{6967789}",
