2011 | OriginalPaper | Book chapter
Discovering Implicit Categorical Semantics for Schema Matching
Authors: Guohui Ding, Guoren Wang
Published in: Database Systems for Advanced Applications
Publisher: Springer Berlin Heidelberg
Attribute-level schema matching is a critical step in numerous database applications, such as DataSpaces, Ontology Merging and Schema Integration. Although this topic has been studied extensively, existing work ignores the implicit categorical information that is crucial for finding high-quality matches between schema attributes. In this paper, we discover the categorical semantics implicit in source instances and associate them with the matches in order to improve the overall quality of schema matching. Our method works in three phases. The first phase is a pre-detection step that uses clustering techniques to detect the possible categories of source instances. In the second phase, we employ information entropy to find the attributes whose instances imply the categorical semantics. In the third phase, we introduce a new concept, the c-mapping, to represent the associations between the matches and the categorical semantics, and we employ an adaptive scoring function to evaluate the c-mappings and thereby associate the matches with the semantics. Moreover, we show how to translate the matches with semantics into schema mapping expressions, and we use the chase procedure to transform source data into target schemas. An experimental study shows that our approach is effective and has good performance.
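To illustrate the idea behind the second phase, the following minimal sketch computes the Shannon entropy of each attribute's instance values and flags attributes with low but non-zero entropy as candidates for carrying categorical semantics. The abstract does not state the paper's actual criterion, so the function names, the normalization by log2 of the number of instances, and the `threshold` value are all assumptions made for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_entropy(values):
    """Entropy divided by log2(n), where n is the number of instances.

    The result is 1.0 when every value is distinct (key-like attribute)
    and 0 when all values are identical. This normalization is an
    assumption of this sketch, not taken from the paper.
    """
    n = len(values)
    if n <= 1:
        return 0.0
    return shannon_entropy(values) / math.log2(n)


def candidate_categorical_attributes(table, threshold=0.5):
    """Return attribute names whose normalized entropy lies in (0, threshold].

    `table` maps attribute names to lists of instance values.
    `threshold` is a hypothetical cut-off chosen for illustration.
    """
    result = []
    for name, values in table.items():
        h = normalized_entropy(values)
        if 0.0 < h <= threshold:
            result.append(name)
    return result


# Toy source instances: "id" is key-like, "country" is constant,
# and "type" repeats a small set of values.
table = {
    "id": [1, 2, 3, 4, 5, 6, 7, 8],
    "type": ["book", "book", "cd", "cd", "book", "cd", "book", "cd"],
    "country": ["US"] * 8,
}

print(candidate_categorical_attributes(table))  # → ['type']
```

In this toy table, `id` has maximal entropy and `country` has none, so only `type` is flagged; a real system would likely combine such a signal with the clustering output of the first phase.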