A combined component approach for finding collection-adapted ranking functions based on genetic programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4221

  author =       "Humberto Mossri {de Almeida} and 
                 Marcos Andre Goncalves and Marco Cristo and Pavel Calado",
  title =        "A combined component approach for finding
                 collection-adapted ranking functions based on genetic
                 programming",
  booktitle =    "Proceedings of the 30th Annual International ACM
                 Conference on Research and Development in Information
                 Retrieval, SIGIR 2007",
  year =         "2007",
  editor =       "Wessel Kraaij and Arjen P. {de Vries} and 
                 Charles L. A. Clarke and Norbert Fuhr and Noriko Kando",
  pages =        "399--406",
  address =      "Amsterdam, The Netherlands",
  month =        jul # " 23-27",
  publisher =    "ACM",
  keywords =     "genetic algorithms, genetic programming, Information
                 Retrieval, Ranking Functions, Term-weighting, Machine
  isbn13 =       "978-1-59593-597-7",
  DOI =          "doi:10.1145/1277741.1277810",
  size =         "8 pages",
  abstract =     "In this paper, we propose a new method to discover
                 collection-adapted ranking functions based on Genetic
                 Programming (GP). Our Combined Component Approach
                 (CCA) is based on the combination of several
                 term-weighting components (i.e., term frequency,
                 collection frequency, normalization) extracted from
                 well-known ranking functions. In contrast to related
                 work, the GP terminals in our CCA are not based on
                 simple statistical information of a document
                 collection, but on meaningful, effective, and proven
                 components. Experimental results show that our approach
                 was able to outperform standard TF-IDF, BM25 and
                 another GP-based approach in two different collections.
                 CCA obtained improvements in mean average precision up
                 to 40.87 percent for the TREC-8 collection, and
                 24.85 percent for the WBR99 collection (a large
                 Brazilian Web collection), over the baseline functions.
                 The CCA evolution process was also able to reduce the
                 overtraining, commonly found in machine learning
                 methods, especially genetic programming, and to
                 converge faster than the other GP-based approach used
                 for comparison.",
  bibdate =      "2007-08-24",
  bibsource =    "DBLP,
