Term-weighting learning via genetic programming for text classification

Created by W.Langdon from gp-bibliography.bib Revision:1.4420

  author =       "Hugo Jair Escalante and Mauricio A. Garcia-Limon and 
                 Alicia Morales-Reyes and Mario Graff and 
                 Manuel Montes-y-Gomez and Eduardo F. Morales and 
                 Jose Martinez-Carranza",
  title =        "Term-weighting learning via genetic programming for
                 text classification",
  journal =      "Knowledge-Based Systems",
  year =         "2015",
  volume =       "83",
  pages =        "176--189",
  keywords =     "genetic algorithms, genetic programming,
                 term-weighting learning, text mining, representation
                 learning, bag of words",
  ISSN =         "0950-7051",
  bibdate =      "2015-05-11",
  bibsource =    "DBLP",
  URL =          "http://dx.doi.org/10.1016/j.knosys.2015.03.025",
  DOI =          "10.1016/j.knosys.2015.03.025",
  URL =          "http://www.sciencedirect.com/science/article/pii/S0950705115001197",
  abstract =     "This paper describes a novel approach to learning
                 term-weighting schemes (TWSs) in the context of text
                 classification. In text mining a TWS determines the way
                 in which documents will be represented in a vector
                 space model, before applying a classifier. Whereas
                 acceptable performance has been obtained with standard
                 TWSs (e.g., Boolean and term-frequency schemes), the
                 definition of TWSs has traditionally been an art.
                 Further, it is still difficult to determine the best
                 TWS for a particular problem, and it is not yet clear
                 whether schemes better than those currently available
                 can be generated by combining known TWSs. We
                 propose in this article a genetic program that aims at
                 learning effective TWSs that can improve the
                 performance of current schemes in text classification.
                 The genetic program learns how to combine a set of
                 basic units to give rise to discriminative TWSs. We
                 report an extensive experimental study comprising data
                 sets from thematic and non-thematic text classification
                 as well as from image classification. Our study shows
                 the validity of the proposed method; in fact, we show
                 that TWSs learnt with the genetic program outperform
                 traditional schemes and other TWSs proposed in recent
                 works. Further, we show that TWSs learnt from a
                 specific domain can be effectively used for other
  notes =        "See http://arxiv.org/abs/1410.0640",
