Learning to Schedule Webpage Updates Using Genetic Programming

Created by W.Langdon from gp-bibliography.bib Revision:1.4420

  author =       "Aecio S. R. Santos and Nivio Ziviani and 
                 Jussara M. Almeida and Cristiano Carvalho and 
                 Edleno {Silva de Moura} and Altigran {Soares da Silva}",
  title =        "Learning to Schedule Webpage Updates Using Genetic
                 Programming",
  booktitle =    "Proceedings of the 20th International Symposium on
                 String Processing and Information Retrieval (SPIRE
                 2013)",
  year =         "2013",
  editor =       "Oren Kurland and Moshe Lewenstein and Ely Porat",
  volume =       "8214",
  series =       "Lecture Notes in Computer Science",
  pages =        "271--278",
  address =      "Jerusalem, Israel",
  month =        oct # " 7-9",
  publisher =    "Springer",
  keywords =     "genetic algorithms, genetic programming, web spider,
                 crawler, webbot",
  bibdate =      "2013-09-30",
  bibsource =    "DBLP",
  isbn13 =       "978-3-319-02431-8",
  URL =          "http://dx.doi.org/10.1007/978-3-319-02432-5_30",
  DOI =          "doi:10.1007/978-3-319-02432-5_30",
  abstract =     "A key challenge endured when designing a scheduling
                 policy regarding freshness is to estimate the
                 likelihood of a previously crawled webpage being
                 modified on the web. This estimate is used to define
                 the order in which those pages should be visited, and
                 can be explored to reduce the cost of monitoring
                 crawled web pages for keeping updated versions. We here
                 present a novel approach to generate score functions
                 that produce accurate rankings of pages regarding their
                 probability of being modified when compared to their
                 previously crawled versions. We propose a flexible
                 framework that uses genetic programming to evolve score
                 functions to estimate the likelihood that a webpage has
                 been modified. We present a thorough experimental
                 evaluation of the benefits of our framework over five
                 state-of-the-art baselines.",
