2006 | OriginalPaper | Buchkapitel
Understanding and Enhancing the Folding-In Method in Latent Semantic Indexing
verfasst von : Xiang Wang, Xiaoming Jin
Erschienen in: Database and Expert Systems Applications
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
Latent Semantic Indexing(LSI) has been proved to be effective to capture the semantic structure of document collections. It is widely used in content-based text retrieval. However, in many real-world applications dealing with very large document collections, LSI suffers from its high computational complexity, which comes from the process of Singular Value Decomposition(SVD). As a result, in practice, the folding-in method is widely used as an approximation to the LSI method. However, in practice, the folding-in method is generally implemented ”as is” and detailed analysis on its effectiveness and performance is left out. Consequentially, the performance of the folding-in method cannot be guaranteed. In this paper, we firstly illustrated the underlying principle of the folding-in method from a linear algebra point of view and analyzed some existing commonly used techniques. Based on the theoretical analysis, we proposed a novel algorithm to guide the implementation of the folding-in method. Our method was justified and evaluated by a series of experiments on various classical IR data sets. The results indicated that our method was effective and had consistent performance over different document collections.