2011 | OriginalPaper | Buchkapitel
XML Information Retrieval through Tree Edit Distance and Structural Summaries
verfasst von : Cyril Laitang, Mohand Boughanem, Karen Pinel-Sauvagnat
Erschienen in: Information Retrieval Technology
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
Semi-structured Information Retrieval (SIR) allows the user to narrow his search down to the element level. As queries and XML documents can be seen as hierarchically nested elements, we consider that their structural proximity can be evaluated through their trees similarity. Our approach combines both content and structure scores, the latter being based on tree edit distance (minimal cost of operations to turn one tree to another). We use the tree structure to propagate and combine both measures. Moreover, to overcome time and space complexity, we summarize the document tree structure. We experimented various tree summary techniques as well as our original model using the SSCAS task of the INEX 2005 campaign. Results showed that our approach outperforms state of the art ones.