research-article

Engineering graph clustering: Models and experimental evaluation

Authors:
Ulrik Brandes

University of Konstanz

University of Konstanz
View Profile

,
Marco Gaertler

Universitat Karlsruhe

Universitat Karlsruhe
View Profile

,
Dorothea Wagner

Universitat Karlsruhe

Universitat Karlsruhe
View Profile

ACM Journal of Experimental Algorithmics Volume 12Article No.: 1.1pp 1–26https://doi.org/10.1145/1227161.1227162

Published:12 June 2008Publication History

ACM Journal of Experimental Algorithmics

Abstract

A promising approach to graph clustering is based on the intuitive notion of intracluster density versus intercluster sparsity. As for the weighted case, clusters should accumulate lots of weight, in contrast to their connection to the remaining graph, which should be light. While both formalizations and algorithms focusing on particular aspects of this rather vague concept have been proposed, no conclusive argument on their appropriateness has been given. In order to deepen the understanding of particular concepts, including both quality assessment as well as designing new algorithms, we conducted an experimental evaluation of graph-clustering approaches. By combining proved techniques from graph partitioning and geometric clustering, we also introduce a new approach that compares favorably.

References

AUSIELLO, G., CRESCENZI, P., GAMBOSI, G., KANN, V., AND MARCHETTI-SPACCAMELA, A. 2002. Complexity and Approximation--Combinatorial Optimization Problems and Their Approximability Properties , 2nd ed. Springer-Verlagn New York. Google ScholarDigital Library
BRANDES, U., GAERTLER, M., AND WAGNER, D. 2003. Experiments on graph clustering algorithms. In Proceedings of the 11th Annual European Symposium on Algorithms (ESA'03). Lecture Notes in Computer Science, vol. 2832. 568-579.Google ScholarCross Ref
CHUNG, F. R. K. AND YAU, S.-T. 1994. A near optimal algorithm for edge separators. In Proceeding of the 26th Annual ACM Symposium on Theory of Computing. ACM Press, New York. 1-8. Google ScholarDigital Library
CHUNG, F. R. K. AND YAU, S.-T. 1997. Eigenvalues, flows and separators of graphs. In Proceeding of the 29th Annual ACM Symposium on Theory of Computing. ACM Press, New York. 1-8. Google ScholarDigital Library
CLAUSET, A., NEWMAN, M. E. J., AND MOORE, C. 2004. Finding community structure in very large networks. Physical Review E 70, 066111.Google ScholarCross Ref
GAERTLER, M. 2005. Clustering. In Network Analysis: Methodological Foundations, U. Brandes and T. Erlebach, Eds. Lecture Notes in Computer Science, vol. 3418. Springer-Verlag, New York. 178-215.Google Scholar
HAREL, D. AND KOREN, Y. 2001. On clustering using random walks. In Proceedings of the 21st Conference on Foundations of Software Technology and Theoretical Computer S. Lecture Notes in Computer Science, vol. 2245. Springer-Verlag, New York. 18-41. Google ScholarDigital Library
HARTUV, E. AND SHAMIR, R. 2000. A clustering algorithm based on graph connectivity. Information Processing Letters 76, 4-6, 175-181. Google ScholarDigital Library
HO, T. B., KAWASAKI, S., AND NGUYEN, N. B. 2003. Documents Clustering Using Tolerance Rough Set Model and Its Application to Information Retrieval. Physica-Verlag GmbH, Heidelberg. 181-196. Google ScholarDigital Library
JAIN, A. K. AND DUBES, R. C. 1988. Algorithms for Clustering Data. Prentice Hall, Englewood, Cliffs, NJ. Google ScholarDigital Library
JAIN, A. K., MURTY, M. N., AND FLYNN, P. J. 1999. Data clustering: A review. ACM Computing Surveys 31, 3, 264-323. Google ScholarDigital Library
NEWMAN, M. E. J. AND GIRVAN, M. 2004. Finding and evaluating community structure in networks. Physical Review E 69, 026113.Google ScholarCross Ref
SHAMIR, R., SHARAN, R., AND TSUR, D. 2002. Cluster graph modification problems. In Proceedings of the 28th International Workshop on Graph-Theoretical Conecpts in Computer Science (WG). Lecture Notes in Computer Science, vol. 2573. Springer-Verlag, New York. 379-390. Google ScholarDigital Library
SPIELMAN, D. A. AND TENG, S.-H. 1996. Spectral partitioning works: Planar graphs and finite element meshes. In Proceedings of the 37th Annual IEEE Symposium on Foundations of Computer Science (FOCS'96). 96-106. Google ScholarDigital Library
VAN DONGEN, S. M. 2000. Graph Clustering by Flow Simulation. Ph.D. thesis, University of Utrecht.Google Scholar
VEMPALA, S., KANNAN, R., AND VETTA, A. 2000. On clusterings--good, bad and spectral. In Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science (FOCS'00). 367-378. Google ScholarDigital Library
WAGNER, D. AND WAGNER, F. 1993. Between min cut and graph bisection. In MFCS'93: Proceedings of the 18th International Symposium on Mathematical Foundations of Computer Science, A. M. Borzyszkowski and S. Sokolowski, Eds. Lecture Notes in Computer Science, vol. 711. Springer-Verlag, New York. 744-750. Google ScholarDigital Library
ZAHN, C. T. 1971. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Transactions on Computers C-20, 68-86. Google ScholarDigital Library

Index Terms

Engineering graph clustering: Models and experimental evaluation

Recommendations

Experiments on Density-Constrained Graph Clustering

Clustering a graph means identifying internally dense subgraphs that are only sparsely interconnected. Formalizations of this notion lead to measures that quantify the quality of a clustering and to algorithms that actually find clusterings. Since, most ...
Read More
A partitional clustering algorithm validated by a clustering tendency index based on graph theory

Applying graph theory to clustering, we propose a partitional clustering method and a clustering tendency index. No initial assumptions about the data set are requested by the method. The number of clusters and the partition that best fits the data set, ...
Read More
Survey of Clustering: Algorithms and Applications

This article is a survey into clustering applications and algorithms. A number of important well-known clustering methods are discussed. The authors present a brief history of the development of the field of clustering, discuss various types of ...
Read More

Reviews

Reviewer: Goran Trajkovski

Clustering is an issue that affects nearly all computing disciplines, from computer graphics to bioinformatics; it also affects a wide range of fields outside of computing. Trying to identify regions in which data are denser than other regions is not a trivial task when the dataset is huge. So what do we do then__?__ We try to formalize intuitive ways to see the denser regions in the foggy clouds of data. Effective, well-performing algorithms are a necessity for data mining. In this paper, the authors give us models of graph clustering, based on the notion of identifying intracluster density as well as intercluster sparsity. They contrast three algorithms: Markov clustering, interactive conductance cutting, and the geometric minimum spanning tree clustering. The authors define several parameters for quality control, in an effort to assess density and sparsity. For graphs of different topologies, one algorithm might perform better than the other two. As always, there is a tradeoff between complexity and speed. Experimental studies reveal that this approach is a promising one. So what is the effect of this paper__?__ It makes one sit down and think. Given time, one might implement an approach, and apply it to datasets that have been collected and hidden somewhere, in order to see how the algorithms discussed in this paper will perform. This is a well-written, organized, and easy-to-follow paper. It presents new concepts with sound fundamental investigations and experimental support. It represents a vast, solid amount of work. Online Computing Reviews Service

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Journal of Experimental Algorithmics Volume 12, Issue
2008
507 pages
ISSN:1084-6654
EISSN:1084-6654
DOI:10.1145/1227161
Issue’s Table of Contents

Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 12 June 2008
- Accepted: 1 December 2006
- Received: 1 February 2006
Published in jea Volume 12, Issue
Author Tags
Graph clustering
clustering algorithms
experimental evaluation
quality measures
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 31
  Total Citations
  View Citations
- 2,207
  Total Downloads
- Downloads (Last 12 months)6
- Downloads (Last 6 weeks)2
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Engineering graph clustering: Models and experimental evaluation

ACM Journal of Experimental Algorithmics

Abstract

References

Cited By

Index Terms

Recommendations

Experiments on Density-Constrained Graph Clustering

A partitional clustering algorithm validated by a clustering tendency index based on graph theory

Survey of Clustering: Algorithms and Applications

Reviews

Access critical reviews of Computing literature here

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Engineering graph clustering: Models and experimental evaluation

ACM Journal of Experimental Algorithmics

Abstract

References

Cited By

Index Terms

Recommendations

Experiments on Density-Constrained Graph Clustering

A partitional clustering algorithm validated by a clustering tendency index based on graph theory

Survey of Clustering: Algorithms and Applications

Reviews

Access critical reviews of Computing literature here

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media