ABSTRACT
Evolutionary clustering is an emerging research area essential to important applications such as clustering dynamic Web and blog contents and clustering data streams. In evolutionary clustering, a good clustering result should fit the current data well, while simultaneously not deviate too dramatically from the recent history. To fulfill this dual purpose, a measure of temporal smoothness is integrated in the overall measure of clustering quality. In this paper, we propose two frameworks that incorporate temporal smoothness in evolutionary spectral clustering. For both frameworks, we start with intuitions gained from the well-known k-means clustering problem, and then propose and solve corresponding cost functions for the evolutionary spectral clustering problems. Our solutions to the evolutionary spectral clustering problems provide more stable and consistent clustering results that are less sensitive to short-term noises while at the same time are adaptive to long-term cluster drifts. Furthermore, we demonstrate that our methods provide the optimal solutions to the relaxed versions of the corresponding evolutionary k-means clustering problems. Performance experiments over a number of real and synthetic data sets illustrate our evolutionary spectral clustering methods provide more robust clustering results that are not sensitive to noise and can adapt to data drifts.
- C. C. Aggarwal, J. Han, J. Wang, and P. S. Yu. A framework for clustering evolving data streams. In Proc. of the 12th VLDB Conference, 2003. Google ScholarDigital Library
- F. R. Bach and M. I. Jordan. Learning spectral clustering, with application to speech separation. Journal of Machine Learning Research, 7, 2006. Google ScholarDigital Library
- D. Chakrabarti, R. Kumar, and A. Tomkins. Evolutionary clustering. In Proc. of the 12th ACM SIGKDD Conference, 2006. Google ScholarDigital Library
- M. Charikar, C. Chekuri, T. Feder, and R. Motwani. Incremental clustering and dynamic information retrieval. In Proc. of the 29th STOC Conference, 1997. Google ScholarDigital Library
- C. Chatfield. The Analysis of Time Series: An Introduction. Chapman & Hall/CRC.Google Scholar
- F. R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1997.Google Scholar
- L. De Lathauwer, B. De Moor, and J. Vandewalle. A multilinear singular value decomposition. SIAM J. on Matrix Analysis and Applications, 21(4), 2000. Google ScholarDigital Library
- I. S. Dhillon, Y. Guan, and B. Kulis. Kernel k-means: spectral clustering and normalized cuts. In Proc. of the 10th ACM SIGKDD Conference, 2004. Google ScholarDigital Library
- C. Ding and X. He. K-means clustering via principal component analysis. In Proc. of the 21st ICML Conference, 2004. Google ScholarDigital Library
- K. Fan. On a theorem of Weyl concerning eigenvalues of linear transformations. In Proc. Natl. Acad. Sci., 1949.Google ScholarCross Ref
- G. Golub and C. V. Loan. Matrix Computations. Johns Hopkins University Press, third edition, 1996.Google Scholar
- S. Guha, N. Mishra, R. Motwani, and L. O'Callaghan. Clustering data streams. In IEEE Symposium on Foundations of Computer Science, 2000. Google ScholarDigital Library
- C. Gupta and R. Grossman. Genic: A single pass generalized incremental algorithm for clustering. In SIAM Int. Conf. on Data Mining, 2004.Google ScholarCross Ref
- L. J. Hubert and P. Arabie. Comparing partitions. Journal of Classification, 2, 1985.Google Scholar
- X. Ji and W. Xu. Document clustering with prior knowledge. In SIGIR, 2006. Google ScholarDigital Library
- Y. Li, J. Han, and J. Yang. Clustering moving objects. In Proc. of the 10th ACM SIGKDD Conference, 2004. Google ScholarDigital Library
- A. Ng, M. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In NIPS,2001.Google Scholar
- H. Ning, W. Xu, Y. Chi, Y. Gong, and T. Huang. Incremental spectral clustering with application to monitoring of evolving blog communities. In SIAM Int. Conf. on Data Mining, 2007.Google ScholarCross Ref
- J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(8), 2000. Google ScholarDigital Library
- K. Wagstaff, C. Cardie, S. Rogers, and S. Schroedl. Constrained K-means clustering with background knowledge. In Proc. 18th ICML Conference, 2001. Google ScholarDigital Library
- Y. Weiss. Segmentation using eigenvectors: A unifying view. In ICCV '99: Proceedings of the International Conference on Computer Vision-Volume 2, 1999. Google ScholarDigital Library
- H. Zha, X. He, C. H. Q. Ding, M. Gu, and H. D. Simon. Spectral relaxation for k-means clustering. In NIPS, 2001.Google Scholar
Index Terms
- Evolutionary spectral clustering by incorporating temporal smoothness
Recommendations
Evolutionary clustering
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data miningWe consider the problem of clustering data over time. An evolutionary clustering should simultaneously optimize two potentially conflicting criteria: first, the clustering at any point in time should remain faithful to the current data as much as ...
On evolutionary spectral clustering
Evolutionary clustering is an emerging research area essential to important applications such as clustering dynamic Web and blog contents and clustering data streams. In evolutionary clustering, a good clustering result should fit the current data well, ...
Efficient evolutionary spectral clustering
An evolutionary spectral clustering method using a smoothed Laplacian is presented.The incomplete Cholesky decomposition (ICD) reduces runtime from O(N3) to O(N).The memory requirements are decreased from O(N2) to O(N).A stopping criterion using the ...
Comments