An evolutionary approach to complex schema matching

  author =       "Moises Gomes {de Carvalho} and Alberto H. F. Laender and
                 Marcos Andre Goncalves and Altigran S. {da Silva}",
  title =        "An evolutionary approach to complex schema matching",
  journal =      "Information Systems",
  volume =       "38",
  number =       "3",
  pages =        "302--316",
  year =         "2013",
  ISSN =         "0306-4379",
  DOI =          "doi:10.1016/",
  URL =          "",
  abstract =     "The schema matching problem can be defined as the task
                 of finding semantic relationships between schema
                 elements existing in different data repositories.
                 Despite the existence of elaborated graphic tools for
                 helping to find such matches, this task is usually
                 manually done. In this paper, we propose a novel
                 evolutionary approach to addressing the problem of
                 automatically finding complex matches between schemas
                 of semantically related data repositories. To the best
                 of our knowledge, this is the first approach that is
                 capable of discovering complex schema matches using
                 only the data instances. Since we only exploit the data
                 stored in the repositories for this task, we rely on
                 matching strategies that are based on record
                 deduplication (aka, entity-oriented strategy) and
                 information retrieval (aka, value-oriented strategy)
                 techniques to find complex schema matches during the
                 evolutionary process. To demonstrate the effectiveness
                 of our approach, we conducted an experimental
                 evaluation using real-world and synthetic datasets. The
                 results show that our approach is able to find complex
                 matches with high accuracy, similar to that obtained by
                 more elaborated (hybrid) approaches, despite using only
                 evidence based on the data instances.",
  keywords =     "genetic algorithms, genetic programming, Complex
                 schema matchings, Entity-oriented strategy,
                 Value-oriented strategy",

