2007 | OriginalPaper | Book chapter
A Hybrid Model for Document Clustering Based on a Fuzzy Approach of Synonymy and Polysemy
Authors: Francisco P. Romero, Andrés Soto, José A. Olivas
Published in: Theoretical Advances and Applications of Fuzzy Logic and Soft Computing
Publisher: Springer Berlin Heidelberg
A new model for document clustering is proposed to handle the conceptual aspects of documents. To measure the degree to which a concept is present in a document (or in a whole document collection), a concept frequency formula is introduced. This formula is based on new fuzzy formulas that calculate the degrees of synonymy and polysemy between terms. To address several shortcomings of classical clustering algorithms, a soft approach to a hybrid model is proposed. The clustering procedure is implemented by two connected and tailored algorithms that aim to build a fuzzy hierarchical structure: a fuzzy hierarchical clustering algorithm determines an initial clustering, and an improved soft clustering algorithm completes the process. Experiments show that clustering with this model tends to perform better than the classical approach.
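The two-stage idea in the abstract (a hierarchical pass that supplies an initial clustering, followed by a soft clustering pass that refines it) can be illustrated with a minimal sketch. The abstract does not give the authors' tailored algorithms or their concept frequency formula, so this sketch substitutes plain centroid-linkage agglomerative clustering and standard fuzzy c-means. The function names, the fuzzifier `m`, and the toy matrix `X` (standing in for per-document concept frequency vectors) are all assumptions made for illustration.

```python
import numpy as np


def agglomerative_centroids(X, k):
    """Stage 1 (stand-in): merge the two clusters with the closest
    centroids until k clusters remain, then return their centroids."""
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = X[clusters[a]].mean(axis=0)
                cb = X[clusters[b]].mean(axis=0)
                d = np.linalg.norm(ca - cb)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return np.array([X[c].mean(axis=0) for c in clusters])


def memberships(X, centers, m=2.0, eps=1e-9):
    """Fuzzy c-means membership matrix: each row sums to 1."""
    # Pairwise document-to-center distances, shape (n_docs, n_clusters).
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fuzzy_c_means(X, centers, m=2.0, iters=50):
    """Stage 2 (stand-in): refine the initial centers with standard
    fuzzy c-means updates and return soft memberships and centers."""
    for _ in range(iters):
        W = memberships(X, centers, m) ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return memberships(X, centers, m), centers


# Toy data (hypothetical): four documents described by two concept weights.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
initial_centers = agglomerative_centroids(X, k=2)
U, centers = fuzzy_c_means(X, initial_centers)
```

Each row of `U` holds one document's degrees of membership in the two clusters, so a document can belong partly to both. Seeding the soft stage from the hierarchical result, rather than from random centers, is one simple way to connect the two stages. The model described in the abstract may couple them differently.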