Clustering is a widely used unsupervised learning method to group data with similar characteristics. The performance of the clustering method can be in general evaluated through some validity indices. However, most validity indices are designed for the specific algorithms along with specific structure of data space. Moreover, these indices consist of a few within- and between- clustering distance functions. The applicability of these indices heavily relies on the correctness of combining these functions. In this research, we first summarize three common characteristics of any clustering evaluation: (1) the clustering outcome can be evaluated by a group of validity indices if some efficient validity indices are available, (2) the clustering outcome can be measured by an independent intra-cluster distance function and (3) the clustering outcome can be measured by the neighborhood based functions. Considering the complementary and unstable natures among the clustering evaluation, we then apply Dampster-Shafter (D-S) Evidence Theory to fuse the three characteristics to generate a new index, termed fused Multiple Characteristic Indices (fMCI). The fMCI generally is capable to evaluate clustering outcomes of arbitrary clustering methods associated with more complex structures of data space. We conduct a number of experiments to demonstrate that the fMCI is applicable to evaluate different clustering algorithms on different datasets and the fMCI can achieve more accurate and robust clustering evaluation comparing to existing indices.
Weitere Kapitel dieses Buchs durch Wischen aufrufen
Bitte loggen Sie sich ein, um Zugang zu diesem Inhalt zu erhalten
Sie möchten Zugang zu diesem Inhalt erhalten? Dann informieren Sie sich jetzt über unsere Produkte:
- Dampster-Shafer Evidence Theory Based Multi-Characteristics Fusion for Clustering Evaluation
- Springer Berlin Heidelberg
ec4u, Neuer Inhalt/© ITandMEDIA