2008 | Original Paper | Book Chapter
Integrating Structure and Meaning: A New Method for Encoding Structure for Text Classification
Authors: Jonathan M. Fishbein, Chris Eliasmith
Published in: Advances in Information Retrieval
Publisher: Springer Berlin Heidelberg
Current representation schemes for automatic text classification treat documents as syntactically unstructured collections of words or ‘concepts’. Past attempts to encode syntactic structure have treated part-of-speech information as another word-like feature, but these attempts have been shown to be less effective than non-structural approaches. We propose a new representation scheme that uses Holographic Reduced Representations (HRRs) to encode both semantic and syntactic structure. This method improves on previous attempts in the literature by encoding the structure across all features of the document vector while preserving text semantics. Our method does not increase the dimensionality of the document vectors, allowing for efficient computation and storage. We compare the classification results of our HRR text representations with those of Bag-of-Concepts representations and show that including structure in this way improves text classification.
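The abstract does not spell out the encoding itself, so the following is only a minimal sketch of how an HRR can fold structure into a fixed-size document vector. It assumes that each word vector is bound to a part-of-speech vector by circular convolution (the standard HRR binding operation) and that the bound pairs are summed. The names `circular_convolution`, `random_vector`, and `encode_document`, the tag labels, and the random word vectors are all illustrative assumptions; a real system would use semantic (concept) vectors for the words.

```python
import numpy as np

def circular_convolution(a, b):
    """Bind two equal-length vectors with circular convolution, via the FFT."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def random_vector(dim, rng):
    """Random vector with elements drawn from N(0, 1/dim), a common HRR choice."""
    return rng.normal(0.0, 1.0 / np.sqrt(dim), dim)

def encode_document(tagged_words, word_vectors, tag_vectors):
    """Sum the word (*) tag bindings; the result keeps the input dimensionality.

    Assumption: this binding-and-summing scheme is one plausible reading of
    the abstract, not a confirmed description of the authors' method.
    """
    dim = len(next(iter(tag_vectors.values())))
    doc = np.zeros(dim)
    for word, tag in tagged_words:
        doc += circular_convolution(word_vectors[word], tag_vectors[tag])
    return doc

rng = np.random.default_rng(0)
dim = 512  # hypothetical dimensionality, chosen only for the demo

# Placeholder vectors; real word vectors would carry semantic content.
word_vectors = {w: random_vector(dim, rng) for w in ["dog", "bites", "man"]}
tag_vectors = {t: random_vector(dim, rng) for t in ["NOUN", "VERB"]}

doc = encode_document(
    [("dog", "NOUN"), ("bites", "VERB"), ("man", "NOUN")],
    word_vectors,
    tag_vectors,
)
print(doc.shape)
```

Because circular convolution maps two vectors of length `dim` to one vector of the same length, the structural information is spread across every component of the document vector rather than appended as extra features, which matches the claim that dimensionality does not grow.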