2003 | Original Paper | Book Chapter

Clustering and Visualization in a Multi-lingual Multi-document Summarization System

Authors: Hsin-Hsi Chen, June-Jei Kuo, Tsei-Chun Su

Published in: Advances in Information Retrieval

Publisher: Springer Berlin Heidelberg

Measuring the similarity of words, sentences, and documents is one of the major issues in multi-lingual multi-document summarization. This paper presents five strategies for computing multilingual sentence similarity. The experimental results show that sentence alignment that ignores word position and order within a sentence performs best. In addition, two strategies are proposed for multilingual document clustering. The two-phase strategy (translation after clustering) performs better than the one-phase strategy (translation before clustering). Deferring translation until sentence clustering, which reduces the propagation of translation errors, is the most promising approach. Moreover, three strategies are proposed for sentence clustering. Complete-link clustering performs best; however, subsumption-based clustering offers lower computational complexity with similar performance. Finally, two visualization models (focusing and browsing), which take the users' language preference into account, are proposed.
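
As a rough illustration of two ideas named in the abstract, the sketch below pairs an order-independent sentence similarity with complete-link clustering. It is not the authors' method: the abstract does not give their similarity measures or thresholds, so the Jaccard word-set overlap, the whitespace tokenization, the 0.5 threshold, and the function names `order_free_similarity` and `complete_link_clusters` are all assumptions made for this example. It also works on a single language and leaves out the translation step.

```python
from itertools import combinations


def order_free_similarity(a: str, b: str) -> float:
    """Jaccard overlap of word sets; word position and order are ignored.

    Assumption: Jaccard is used only for illustration, not taken from the paper.
    """
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def complete_link_clusters(sentences, threshold=0.5):
    """Greedy agglomerative clustering with the complete-link criterion.

    Two clusters merge only if every cross-cluster sentence pair has
    similarity >= threshold, i.e. the weakest link decides.
    Returns a list of clusters, each a list of sentence indices.
    """
    clusters = [[i] for i in range(len(sentences))]
    while True:
        best = None  # (score, index_a, index_b) of the best mergeable pair
        for x, y in combinations(range(len(clusters)), 2):
            # Complete link: cluster similarity is the minimum pairwise similarity.
            score = min(
                order_free_similarity(sentences[i], sentences[j])
                for i in clusters[x]
                for j in clusters[y]
            )
            if score >= threshold and (best is None or score > best[0]):
                best = (score, x, y)
        if best is None:
            return clusters
        _, x, y = best
        # x < y always holds here, so deleting y does not shift x.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]


if __name__ == "__main__":
    sample = [
        "the cat sat on the mat",
        "on the mat the cat sat",
        "stock prices fell sharply today",
    ]
    print(complete_link_clusters(sample))
```

In this sample the first two sentences contain the same words in a different order, so they get the maximum similarity and are grouped together, while the third shares no words with them and stays in its own cluster. The pairwise minimum makes complete link comparatively expensive, which is the cost that the abstract says subsumption-based clustering reduces.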

Metadata
Title
Clustering and Visualization in a Multi-lingual Multi-document Summarization System
Authors
Hsin-Hsi Chen
June-Jei Kuo
Tsei-Chun Su
Copyright year
2003
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-36618-0_19