2014 | Original Paper | Book Chapter
DenClust: A Density Based Seed Selection Approach for K-Means
Authors: Md Anisur Rahman, Md Zahidul Islam, Terry Bossomaier
Published in: Artificial Intelligence and Soft Computing
Publisher: Springer International Publishing
In this paper we present a clustering technique called DenClust that produces high-quality initial seeds through a deterministic process, without requiring user input on the number of clusters k or the radius of the clusters r. The seeds are then passed to K-Means as its set of initial seeds to produce the final clusters. DenClust uses a density-based approach for initial seed selection: it calculates the density of each record, where the density of a record is the number of records that have their minimum distance with that record. This approach is expected to produce high-quality initial seeds for K-Means, resulting in high-quality clusters from a dataset. The performance of DenClust is compared with five existing techniques, namely CRUDAW, AGCUK, Simple K-Means (SK), Basic Farthest Point Heuristic (BFPH) and New Farthest Point Heuristic (NFPH), in terms of three external cluster evaluation criteria (F-Measure, Entropy and Purity) and two internal cluster evaluation criteria (Xie-Beni Index (XB) and Sum of Square Error (SSE)). We use three natural datasets obtained from the UCI Machine Learning Repository. DenClust performs better than all five existing techniques in terms of all five evaluation criteria on all three datasets used in this study.
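The density idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not confirm: the density of a record is read here as the number of other records whose nearest neighbour is that record, distances are Euclidean over numeric attributes, and seeds are all records whose density reaches a hypothetical threshold `min_density` (the abstract does not say how DenClust decides the number of seeds). The function names are likewise illustrative.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_neighbor_density(records):
    """Assumed reading: density[i] = number of other records whose
    nearest neighbour is record i."""
    n = len(records)
    density = [0] * n
    for j in range(n):
        # Nearest neighbour of record j; ties go to the lowest index,
        # which keeps the whole procedure deterministic.
        nearest = min(
            (i for i in range(n) if i != j),
            key=lambda i: (dist(records[i], records[j]), i),
        )
        density[nearest] += 1
    return density


def select_seeds(records, min_density=2):
    """Return records with density >= min_density, densest first.
    min_density is a hypothetical parameter, not taken from the paper."""
    density = nearest_neighbor_density(records)
    order = sorted(range(len(records)), key=lambda i: (-density[i], i))
    return [records[i] for i in order if density[i] >= min_density]


def k_means(records, seeds, max_iter=100):
    """Plain K-Means started from the given seeds instead of random ones."""
    centers = [tuple(s) for s in seeds]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each record joins its closest center.
        clusters = [[] for _ in centers]
        for r in records:
            idx = min(range(len(centers)), key=lambda c: (dist(r, centers[c]), c))
            clusters[idx].append(r)
        # Update step: each center moves to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centers[c]
            for c, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


# Toy data: two well-separated groups of three records each.
records = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
seeds = select_seeds(records)
centers, clusters = k_means(records, seeds)
```

Because neither the density computation nor the tie-breaking involves randomness, repeated runs on the same data yield the same seeds, which is the property the abstract emphasises. The number of clusters follows from the number of selected seeds rather than from a user-supplied k, although in this sketch that dependence is merely shifted onto the assumed threshold.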