An Optimized Approach of Modified BAT Algorithm to Record Deduplication

Created by W.Langdon from gp-bibliography.bib Revision:1.4420

  author =       "A {Faritha Banu} and C. Chandrasekar",
  title =        "An Optimized Approach of Modified BAT Algorithm to
                 Record Deduplication",
  year =         "2013",
  journal =      "IJCA",
  month =        jul # "~24",
  number =       "1/",
  keywords =     "genetic algorithms, genetic programming, deduplication
                 function, modified bat algorithm, data mining",
  annote =       "The Pennsylvania State University CiteSeerX Archives",
  bibsource =    "OAI-PMH server at",
  language =     "en",
  oai =          "oai:CiteSeerX.psu:",
  rights =       "Metadata may be used without restrictions as long as
                 the oai identifier remains attached to it.",
  URL =          "",
  abstract =     "The task of recognising, in a data warehouse, records
                 that refer to the same real-world entity despite
                 misspelled words, typos, different writing styles or
                 even different schema representations or data types is
                 called record deduplication. Existing research has
                 offered a genetic programming (GP) approach to record
                 deduplication. That approach combines several
                 different pieces of evidence extracted from the data
                 content to generate a deduplication function that is
                 able to recognise whether two or more entries in a
                 repository are duplicates or not. Because record
                 deduplication is a time-consuming task even for small
                 repositories, its aim is to promote a method that
                 finds a proper combination of the best pieces of
                 evidence, thus yielding a deduplication function that
                 maximises performance using a small representative
                 portion of the corresponding data for training
                 purposes; however, the process is less well optimised.
                 Our research addresses these issues with a novel
                 technique called the modified bat algorithm for record
                 deduplication. The motivation is to create a flexible
                 and effective method that employs data mining
                 algorithms. The method shares many similarities with
                 evolutionary computation techniques such as the
                 genetic programming approach. It is initialised with a
                 population of random solutions and searches for
                 optima by updating bat generations. However, unlike
                 GP, the modified bat algorithm has no evolution
                 operators such as crossover and mutation. We also
                 compare the proposed algorithm with other existing
                 algorithms, including GP from the experimental
