Optimising Confidence of Text Classification by Evolution of Symbolic Expressions

Created by W.Langdon from gp-bibliography.bib Revision:1.4192

  author =       "Brij Masand",
  institution =  "Thinking Machines Corporation",
  title =        "Optimising Confidence of Text Classification by
                 Evolution of Symbolic Expressions",
  booktitle =    "Advances in Genetic Programming",
  publisher =    "MIT Press",
  editor =       "Kenneth E. {Kinnear, Jr.}",
  year =         "1994",
  pages =        "445--458",
  chapter =      "21",
  URL =          "http://www.amazon.co.uk/Advances-Genetic-Programming-Complex-Adaptive/dp/0262111888",
  URL =          "http://cognet.mit.edu/sites/default/files/books/9780262277181/pdfs/9780262277181_chap21.pdf",
  keywords =     "genetic algorithms, genetic programming, k-nn",
  size =         "13 pages",
  abstract =     "This paper reports some experiments in applying
                 genetic algorithms for assessing the confidence of
                 automatically assigned multiple keywords for news
                 stories. Using Memory Based Reasoning (MBR) (a
                 k-nearest neighbour method) to classify the stories, we
                 would like to assign a confidence score per news story,
                 that allows one to refer stories with low
                 classification confidence to a human coder. Using
                 Genetic Programming (GP) as used for program evolution
                 by [Koza 1992], we discover and evolve symbolic
                 expressions to compute confidence scores for news
                 stories that allow a higher performance on subsets of
                 the database while referring some stories to human
                 editors. We have earlier reported recall and precision
                 of 81 percent and 72 percent, if 100 percent of the
                 stories are coded automatically [Masand, Linoff and
                 Waltz 1992]. Using the evolved confidence measures to
                 refer some stories for manual coding, we can achieve
                 about 80 percent recall and 80 percent precision for
                 92 percent of the stories. This compares favourably with
                 manually specified confidence functions that could
                 classify 76 percent of the database with an 80-80 percent
                 recall-precision requirement.",
  notes =        "Presented at Genetic Programming Workshop of ICGA-93",
  notes =        "Classification of news stories. Very simple formulae
                 evolved which do better than existing human attempts at
                 automatic coding. Automatic results comparable to human
                 success rates",
