2004 | OriginalPaper | Buchkapitel
Contributions of Textual Data Analysis to Text Retrieval
verfasst von : Simona Balbi, Emilio Di Meglio
Erschienen in: Classification, Clustering, and Data Mining Applications
Verlag: Springer Berlin Heidelberg
Enthalten in: Professional Book Archive
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
The aim of this paper is to show how Textual Data Analysis techniques, developed in Europe under the influence of the Analyse Multidimen-sionelle des Données School, can improve performance of the LSI retrieval method. A first improvement can be obtained by properly considering the data contained in a lexical table. LSI is based on Euclidean distance, which is not adequate for frequency data. By using the chi-squared metric, on which Correspondence Analysis is based, significant improvements can be achieved. Further improvements can be obtained by considering external information such as keywords, authors, etc. Here an approach to text retrieval with external information based on PLS regression is shown. The suggested strategies are applied in text retrieval experiments on medical journal abstracts.