Feature Extraction for the k-Nearest Neighbour Classifier with Genetic Programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4192

  author =       "Martijn C. J. Bot",
  title =        "Feature Extraction for the k-Nearest Neighbour
                 Classifier with Genetic Programming",
  booktitle =    "Genetic Programming, Proceedings of EuroGP'2001",
  year =         "2001",
  editor =       "Julian F. Miller and Marco Tomassini and 
                 Pier Luca Lanzi and Conor Ryan and Andrea G. B. Tettamanzi and 
                 William B. Langdon",
  volume =       "2038",
  series =       "LNCS",
  pages =        "256--267",
  address =      "Lake Como, Italy",
  publisher_address = "Berlin",
  month =        "18-20 " # apr,
  organisation = "EvoNET",
  publisher =    "Springer-Verlag",
  keywords =     "genetic algorithms, genetic programming, Feature
                 Extraction, Machine Learning: Poster",
  ISBN =         "3-540-41899-7",
  URL =          "http://www.springerlink.com/openurl.asp?genre=article&issn=0302-9743&volume=2038&spage=256",
  DOI =          "doi:10.1007/3-540-45355-5_20",
  size =         "12 pages",
  abstract =     "In pattern recognition the curse of dimensionality can
                 be handled either by reducing the number of features,
                 e.g. with decision trees or by extraction of new

                 We propose a genetic programming (GP) framework for
                 automatic extraction of features with the express aim
                 of dimension reduction and the additional aim of
                 improving accuracy of the k-nearest neighbour (k-NN)
                 classifier. We will show that our system is capable of
                 reducing most datasets to one or two features while
                 k-NN accuracy improves or stays the same. Such a small
                 number of features has the great advantage of allowing
                 visual inspection of the dataset in a two-dimensional

                 Since k-NN is a non-linear classification algorithm, we
                 compare several linear fitness measures. We will show
                 the a very simple one, the accuracy of the minimal
                 distance to means (mdm) classifier outperforms all
                 other fitness measures.

                 We introduce a stopping criterion gleaned from numeric
                 mathematics. New features are only added if the
                 relative increase in training accuracy is more than a
                 constant d, for the mdm classifier estimated to be
  notes =        "EuroGP'2001, part of \cite{miller:2001:gp}",

Genetic Programming entries for Martijn C J Bot