In this paper we present a comparative study of three data stream clustering algorithms: STREAM, CluStream and MR-Stream. We used a total of 90 synthetic data sets generated from spatial point processes following Gaussian distributions or Mixtures of Gaussians. The algorithms were executed in three main scenarios: 1) low dimensional; 2) low dimensional with concept drift and 3) high dimensional with concept drift. In general, CluStream outperformed the other algorithms in terms of clustering quality at a higher execution time cost. Our results are analyzed with the non-parametric Friedman test and post-hoc Nemenyi test, both with
α = 5%. Recommendations and future research directions are also explored.