Abstract
A promising approach to graph clustering is based on the intuitive notion of intracluster density versus intercluster sparsity. As for the weighted case, clusters should accumulate lots of weight, in contrast to their connection to the remaining graph, which should be light. While both formalizations and algorithms focusing on particular aspects of this rather vague concept have been proposed, no conclusive argument on their appropriateness has been given. In order to deepen the understanding of particular concepts, including both quality assessment as well as designing new algorithms, we conducted an experimental evaluation of graph-clustering approaches. By combining proved techniques from graph partitioning and geometric clustering, we also introduce a new approach that compares favorably.
- AUSIELLO, G., CRESCENZI, P., GAMBOSI, G., KANN, V., AND MARCHETTI-SPACCAMELA, A. 2002. Complexity and Approximation--Combinatorial Optimization Problems and Their Approximability Properties , 2nd ed. Springer-Verlagn New York. Google ScholarDigital Library
- BRANDES, U., GAERTLER, M., AND WAGNER, D. 2003. Experiments on graph clustering algorithms. In Proceedings of the 11th Annual European Symposium on Algorithms (ESA'03). Lecture Notes in Computer Science, vol. 2832. 568-579.Google ScholarCross Ref
- CHUNG, F. R. K. AND YAU, S.-T. 1994. A near optimal algorithm for edge separators. In Proceeding of the 26th Annual ACM Symposium on Theory of Computing. ACM Press, New York. 1-8. Google ScholarDigital Library
- CHUNG, F. R. K. AND YAU, S.-T. 1997. Eigenvalues, flows and separators of graphs. In Proceeding of the 29th Annual ACM Symposium on Theory of Computing. ACM Press, New York. 1-8. Google ScholarDigital Library
- CLAUSET, A., NEWMAN, M. E. J., AND MOORE, C. 2004. Finding community structure in very large networks. Physical Review E 70, 066111.Google ScholarCross Ref
- GAERTLER, M. 2005. Clustering. In Network Analysis: Methodological Foundations, U. Brandes and T. Erlebach, Eds. Lecture Notes in Computer Science, vol. 3418. Springer-Verlag, New York. 178-215.Google Scholar
- HAREL, D. AND KOREN, Y. 2001. On clustering using random walks. In Proceedings of the 21st Conference on Foundations of Software Technology and Theoretical Computer S. Lecture Notes in Computer Science, vol. 2245. Springer-Verlag, New York. 18-41. Google ScholarDigital Library
- HARTUV, E. AND SHAMIR, R. 2000. A clustering algorithm based on graph connectivity. Information Processing Letters 76, 4-6, 175-181. Google ScholarDigital Library
- HO, T. B., KAWASAKI, S., AND NGUYEN, N. B. 2003. Documents Clustering Using Tolerance Rough Set Model and Its Application to Information Retrieval. Physica-Verlag GmbH, Heidelberg. 181-196. Google ScholarDigital Library
- JAIN, A. K. AND DUBES, R. C. 1988. Algorithms for Clustering Data. Prentice Hall, Englewood, Cliffs, NJ. Google ScholarDigital Library
- JAIN, A. K., MURTY, M. N., AND FLYNN, P. J. 1999. Data clustering: A review. ACM Computing Surveys 31, 3, 264-323. Google ScholarDigital Library
- NEWMAN, M. E. J. AND GIRVAN, M. 2004. Finding and evaluating community structure in networks. Physical Review E 69, 026113.Google ScholarCross Ref
- SHAMIR, R., SHARAN, R., AND TSUR, D. 2002. Cluster graph modification problems. In Proceedings of the 28th International Workshop on Graph-Theoretical Conecpts in Computer Science (WG). Lecture Notes in Computer Science, vol. 2573. Springer-Verlag, New York. 379-390. Google ScholarDigital Library
- SPIELMAN, D. A. AND TENG, S.-H. 1996. Spectral partitioning works: Planar graphs and finite element meshes. In Proceedings of the 37th Annual IEEE Symposium on Foundations of Computer Science (FOCS'96). 96-106. Google ScholarDigital Library
- VAN DONGEN, S. M. 2000. Graph Clustering by Flow Simulation. Ph.D. thesis, University of Utrecht.Google Scholar
- VEMPALA, S., KANNAN, R., AND VETTA, A. 2000. On clusterings--good, bad and spectral. In Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science (FOCS'00). 367-378. Google ScholarDigital Library
- WAGNER, D. AND WAGNER, F. 1993. Between min cut and graph bisection. In MFCS'93: Proceedings of the 18th International Symposium on Mathematical Foundations of Computer Science, A. M. Borzyszkowski and S. Sokolowski, Eds. Lecture Notes in Computer Science, vol. 711. Springer-Verlag, New York. 744-750. Google ScholarDigital Library
- ZAHN, C. T. 1971. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Transactions on Computers C-20, 68-86. Google ScholarDigital Library
Index Terms
- Engineering graph clustering: Models and experimental evaluation
Recommendations
Experiments on Density-Constrained Graph Clustering
Clustering a graph means identifying internally dense subgraphs that are only sparsely interconnected. Formalizations of this notion lead to measures that quantify the quality of a clustering and to algorithms that actually find clusterings. Since, most ...
A partitional clustering algorithm validated by a clustering tendency index based on graph theory
Applying graph theory to clustering, we propose a partitional clustering method and a clustering tendency index. No initial assumptions about the data set are requested by the method. The number of clusters and the partition that best fits the data set, ...
Survey of Clustering: Algorithms and Applications
This article is a survey into clustering applications and algorithms. A number of important well-known clustering methods are discussed. The authors present a brief history of the development of the field of clustering, discuss various types of ...
Comments