2013 | OriginalPaper | Book chapter
Learning Effective XML Classifiers Based on Discriminatory Structures and Nested Content
Authors: Gianni Costa, Riccardo Ortale, Ettore Ritacco
Published in: Knowledge Discovery, Knowledge Engineering and Knowledge Management
Publisher: Springer Berlin Heidelberg
Supervised classification of XML documents aims to learn a model (a classifier) from a collection of XML documents, each marked with one of a predefined set of class labels. The learnt classifier characterizes each class by the content and structural regularities observed in its labeled XML documents and can therefore predict the unknown class of an unlabeled XML document from its content and structural features. Classifying unlabeled XML documents into the predefined classes supports more effective and efficient XML search, retrieval and filtering.
We discuss an approach for learning intelligible XML classifiers. XML documents are represented as transactions in a space of Boolean features that are informative of their content and structure. Learning algorithms induce compact, highly effective associative classifiers from this transactional XML representation. A preprocessing step helps the approach scale with the size of XML corpora.
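To make the transactional representation more concrete, the following is a minimal sketch of one possible encoding. It is an assumption for illustration only, not the chapter's actual feature definition: each structural feature is taken to be a root-to-node tag path, and each content feature is a path paired with a lowercased term found directly under that path. The function names `xml_to_transaction` and `to_boolean_vector` and the `#` separator are hypothetical.

```python
import xml.etree.ElementTree as ET


def xml_to_transaction(xml_text):
    """Encode an XML document as a set of features (a transaction).

    Assumed encoding (hypothetical, for illustration):
      - structural feature: the tag path from the root, e.g. "/article/title"
      - content feature: the path plus a term, e.g. "/article/title#xml"
    Text that follows a closing tag (element tails) is ignored here.
    """
    root = ET.fromstring(xml_text)
    features = set()

    def visit(node, prefix):
        path = f"{prefix}/{node.tag}"
        features.add(path)  # structural feature
        for term in (node.text or "").lower().split():
            features.add(f"{path}#{term}")  # content feature nested under the path
        for child in node:
            visit(child, path)

    visit(root, "")
    return features


def to_boolean_vector(transaction, vocabulary):
    """Map a transaction onto a fixed, ordered feature vocabulary."""
    return [feature in transaction for feature in vocabulary]


# Usage: two toy documents sharing one feature space.
doc_a = "<article><title>XML mining</title><body><sec>Tree mining</sec></body></article>"
doc_b = "<book><title>XML basics</title></book>"

transactions = [xml_to_transaction(d) for d in (doc_a, doc_b)]
vocabulary = sorted(set().union(*transactions))
vectors = [to_boolean_vector(t, vocabulary) for t in transactions]
```

Under this encoding, an associative classifier would consist of rules whose antecedents are sets of such features and whose consequents are class labels; how the chapter mines and prunes those rules is not specified in this abstract.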