2016 | OriginalPaper | Chapter
SemSynX
: Flexible Similarity Analysis of XML Data via Semantic and Syntactic Heterogeneity/Homogeneity Detection
Authors : Jesús M. Almendros-Jiménez, Alfredo Cuzzocrea
Published in: Hybrid Artificial Intelligent Systems
Publisher: Springer International Publishing
Activate our intelligent search to find suitable subject content or patents.
Select sections of text to find matching patents with Artificial Intelligence. powered by
Select sections of text to find additional relevant content using AI-assisted search. powered by
Abstract
SemSynX
, a novel technique for supporting similarity analysis of XML data via semantic and syntactic heterogeneity/homogeneity detection. Given two XML trees, SemSynX
retrieves a list of semantic and syntactic heterogeneity/homogeneity matches of objects (i.e., elements, values, tags, attributes) occurring in certain paths of the trees. A local score that takes into account the path and value similarity is given for each heterogeneity/homogeneity found. A global score that summarizes the number of equal matches as well as the local scores globally is also provided. The proposed technique is highly customizable, and it permits the specification of thresholds for the requested degree of similarity for paths and values as well as for the degree of relevance for path and value matching. It thus makes possible to “adjust” the similarity analysis depending on the nature of the input XML trees. SemSynX
has been implemented in terms of a XQuery library, as to enhance interoperability with other XML processing tools. To complete our analytical contributions, a comprehensive experimental assessment and evaluation of SemSynX
over several classes of XML documents is provided.