Abstract
Modularity is a widely used quality measure for graph clusterings. Its exact maximization is NP-hard and prohibitively expensive for large graphs. Popular heuristics first perform a coarsening phase, where local search starting from singleton clusters is used to compute a preliminary clustering, and then optionally a refinement phase, where this clustering is improved by moving vertices between clusters. As a generalization, multilevel heuristics coarsen in several stages, and refine by moving entire clusters from each of these stages, not only individual vertices.
This article organizes existing and new single-level and multilevel heuristics into a coherent design space, and compares them experimentally with respect to their effectiveness (achieved modularity) and runtime. For coarsening by iterated cluster joining, it turns out that the most widely used criterion for joining clusters (modularity increase) is outperformed by other simple criteria, that a recent multistep algorithm [Schuetz and Caflisch 2008] is no improvement over simple single-step coarsening for these criteria, and that the recent multilevel coarsening by iterated vertex moving [Blondel et al. 2008] is somewhat faster but slightly less effective (with refinement). The new multilevel refinement is significantly more effective than the conventional single-level refinement or no refinement, in reasonable runtime.
A comparison with published benchmark results and algorithm implementations shows that multilevel local search heuristics, despite their relative simplicity, are competitive with the best algorithms in the literature.
- Aarts, E. and Lenstra, J. K., Eds. 2003. Local Search in Combinatorial Optimization. Princeton University Press, Princeton, NJ. Google ScholarDigital Library
- Agarwal, G. and Kempe, D. 2008. Modularity-maximizing graph communities via mathematical programming. Eur. Phys. J. B 66, 3, 409--418.Google ScholarCross Ref
- Albert, R., Jeong, H., and Barabási, A.-L. 1999. Diameter of the World-Wide Web. Nature 401, 6749, 130--131.Google Scholar
- Arenas, A., Fernández, A., and Gómez, S. 2008. Analysis of the structure of complex networks at different resolution levels. New J. Phys. 10, 053039.Google ScholarCross Ref
- Bader, D. and Madduri, K. 2008. SNAP, Small-world Network Analysis and Partitioning: An open-source parallel graph framework for the exploration of large-scale networks. In Proceedings of the IEEE International Symposium on Parallel and Distributed Processing (IPDPS'08). IEEE, Los Alamitos, CA, 1--12.Google Scholar
- Barber, M. and Clark, J. 2009. Detecting network communities by propagating labels under constraints. Phys. Rev. E 80, 2, 026129.Google ScholarCross Ref
- Batagelj, V. and Mrvar, A. 2006. Pajek datasets. http://vlado.fmf.uni-lj.si/pub/networks/data/.Google Scholar
- Blondel, V. D., Guillaume, J.-L., Lambiotte, R., and Lefebvre, E. 2008. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. P10008.Google ScholarCross Ref
- Boettcher, S. and Percus, A. G. 2001. Optimization with extremal dynamics. Phys. Rev. Lett. 86, 5211--5214.Google ScholarCross Ref
- Boguñá, M., Pastor-Satorras, R., Díaz-Guilera, A., and Arenas, A. 2004. Models of social networks based on social distance attachment. Phys. Rev. E 70, 056122.Google ScholarCross Ref
- Brandes, U., Delling, D., Gaertler, M., Görke, R., Hoefer, M., Nikoloski, Z., and Wagner, D. 2008. On modularity clustering. IEEE Trans. Knowl. Data Eng. 20, 2, 172--188. Google ScholarDigital Library
- Brandes, U., Gaertler, M., and Wagner, D. 2007. Engineering graph clustering: Models and experimental evaluation. ACM J. Exp. Algorithmics 12, 1.1. Google ScholarDigital Library
- Clauset, A., Newman, M. E. J., and Moore, C. 2004. Finding community structure in very large networks. Phys. Rev. E 70, 066111.Google ScholarCross Ref
- Csárdi, G. and Nepusz, T. 2006. The igraph software package for complex network research. Inter. Complex Syst. 1695.Google Scholar
- Danon, L., Díaz-Guilera, A., and Arenas, A. 2006. Effect of size heterogeneity on community identification in complex networks. J. Stat. Mech. Theory Exp. P11010.Google ScholarCross Ref
- Danon, L., Díaz-Guilera, A., Duch, J., and Arenas, A. 2005. Comparing community structure identification. J. Stat. Mech. Theory Exp. P09008.Google ScholarCross Ref
- Dawes, B., Niebler, E., Rivera, R., and James, D. 2009. BOOST C++ Libraries. http://www.boost.org.Google Scholar
- Delling, D., Gaertler, M., Görke, R., and Wagner, D. 2008. Engineering comparators for graph clusterings. In Proceedings of the 4th International Conference on Algorithmic Aspects in Information and Management (AAIM'08). Springer-Verlag, Berlin, 131--142. Google ScholarDigital Library
- DiCiccio, T. J. and Efron, B. 1996. Bootstrap confidence intervals. Stat. Sci. 11, 3, 189--228.Google ScholarCross Ref
- Djidjev, H. N. 2008. A scalable multilevel algorithm for graph clustering and community structure detection. In Proceedings of the 4th International Workshop on Algorithms and Models for the Web-Graph (WAW'06). Springer-Verlag, Berlin, 117--128. Google ScholarDigital Library
- Donetti, L. and Muñoz, M. A. 2004. Detecting network communities: a new systematic and efficient algorithm. J. Stat. Mech. Theory Exp. P10012.Google ScholarCross Ref
- Donetti, L. and Muñoz, M. A. 2005. Improved spectral algorithm for the detection of network communities. Proc. AIP 779, 1, 104--107.Google ScholarCross Ref
- Duch, J. and Arenas, A. 2005. Community detection in complex networks using extremal optimization. Phys. Rev. E 72, 027104.Google ScholarCross Ref
- Fortunato, S. and Barthélemy, M. 2007. Resolution limit in community detection. Proc. National Acad. Sci. 104, 1, 36--41.Google ScholarCross Ref
- Gaertler, M. 2005. Clustering. In Network Analysis: Methodological Foundations, U. Brandes and T. Erlebach, Eds. Springer-Verlag, Berlin, 178--215.Google Scholar
- Gaertler, M., Görke, R., and Wagner, D. 2007. Significance-driven graph clustering. In Proceedings of the 3rd International Conference on Algorithmic Aspects in Information and Management (AAIM'07). Springer-Verlag, Berlin, 11--26. Google ScholarDigital Library
- Girvan, M. and Newman, M. E. J. 2002. Community structure in social and biological networks. Proc. National Acad. Sci. 99, 12, 7821--7826.Google ScholarCross Ref
- Gleiser, P. and Danon, L. 2003. Community structure in jazz. Adv. Complex Syst. 6, 4, 565--573.Google ScholarCross Ref
- Grossman, J. 2007. The Erdős number project. http://www.oakland.edu/enp/.Google Scholar
- Guimerà, R. and Amaral, L. A. N. 2005. Functional cartography of complex metabolic networks. Nature 433, 7028, 895--900.Google Scholar
- Guimerà, R., Danon, L., Díaz-Guilera, A., Giralt, F., and Arenas, A. 2003. Self-similar community structure in a network of human interactions. Phys. Rev. E 68, 065103.Google ScholarCross Ref
- Hendrickson, B. and Leland, R. W. 1995. A multilevel algorithm for partitioning graphs. In Proceedings of the 1995 ACM/IEEE Conference on Supercomputing (Supercomputing'95). ACM, New York, 28. Google ScholarDigital Library
- Kannan, R., Vempala, S., and Vetta, A. 2004. On clusterings: Good, bad and spectral. J. ACM 51, 3, 497--515. Google ScholarDigital Library
- Karrer, B. and Newman, M. E. J. 2009. Random acyclic networks. Phys. Rev. Lett. 102, 12, 128701.Google ScholarCross Ref
- Karypis, G. and Kumar, V. 1998. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20, 1, 359--392. Google ScholarDigital Library
- Kernighan, B. and Lin, S. 1970. An efficient heuristic procedure for partitioning graphs. Bell Sys. Techn. J. 49, 2, 291--307.Google ScholarCross Ref
- Kirkpatrick, S., Gelatt, Jr., C. D., and Vecchi, M. P. 1983. Optimization by simulated annealing. Science 220, 4598, 671--680.Google Scholar
- Krebs, V. 2008. A network of books about recent US politics sold by the online bookseller amazon.com. http://www.orgnet.com/.Google Scholar
- Lancichinetti, A. and Fortunato, S. 2009a. Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Phys. Rev. E 80, 016118.Google ScholarCross Ref
- Lancichinetti, A. and Fortunato, S. 2009b. Community detection algorithms: A comparative analysis. Phys. Rev. E 80, 056117.Google ScholarCross Ref
- Liu, X. and Murata, T. 2010. Advanced modularity-specialized label propagation algorithm for detecting communities in networks. Physica A 389, 7, 1493--1500.Google ScholarCross Ref
- Lü, Z. and Huang, W. 2009. Iterated tabu search for identifying community structure in complex networks. Phys. Rev. E 80, 026130.Google ScholarCross Ref
- Lusseau, D., Schneider, K., Boisseau, O. J., Haase, P., Slooten, E., and Dawson, S. M. 2003. The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behav. Ecol. Sociobiol. 54, 396--405.Google ScholarCross Ref
- Massen, C. P. and Doye, J. P. K. 2005. Identifying communities within energy landscapes. Phys. Rev. E 71, 046101.Google ScholarCross Ref
- Medus, A., Acuña, G., and Dorso, C. O. 2005. Detection of community structures in networks via global optimization. Physica A 358, 2-4, 593--604.Google ScholarCross Ref
- Mei, J., He, S., Shi, G., Wang, Z., and Li, W. 2009. Revealing network communities through modularity maximization by a contraction--dilation method. New J. Phys. 11, 043025.Google ScholarCross Ref
- Newman, M. E. J. 2001. The structure of scientific collaboration networks. Proc. National Acad. Sci. 98, 2, 404--409.Google ScholarCross Ref
- Newman, M. E. J. 2004a. Analysis of weighted networks. Phys. Rev. E 70, 056131.Google ScholarCross Ref
- Newman, M. E. J. 2004b. Fast algorithm for detecting community structure in networks. Phys. Rev. E 69, 066133.Google ScholarCross Ref
- Newman, M. E. J. 2006a. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104.Google ScholarCross Ref
- Newman, M. E. J. 2006b. Modularity and community structure in networks. Proc. National Acad. Sci. 103, 23, 8577--8582.Google ScholarCross Ref
- Newman, M. E. J. and Girvan, M. 2004. Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113.Google ScholarCross Ref
- Noack, A. 2007a. Energy models for graph clustering. J. Graph Algorithms Appl. 11, 2, 453--480.Google ScholarCross Ref
- Noack, A. 2007b. Unified quality measures for clusterings, layouts, and orderings, and their application as software design criteria. Ph.D. thesis, Brandenburg University of Technology.Google Scholar
- Noack, A. 2009. Modularity clustering is force-directed layout. Phys. Rev. E 79, 026102.Google ScholarCross Ref
- Pons, P. and Latapy, M. 2006. Computing communities in large networks using random walks. J. Graph Algorithms Appl. 10, 2, 191--218.Google ScholarCross Ref
- Porter, M. A., Onnela, J.-P., and Mucha, P. J. 2009. Communities in networks. Not. Amer. Math. Soc. 56, 9, 1082--1097.Google Scholar
- Preis, R. and Diekmann, R. 1997. PARTY—A software library for graph partitioning. B. Topping, Ed. Advances in Computational Mechanics with Parallel and Distributed Processing, Civil-Comp Press, United Kingdom, 63--71.Google Scholar
- Pujol, J. M., Béjar, J., and Delgado, J. 2006. Clustering algorithm for determining community structure in large networks. Phys. Rev. E 74, 016107.Google ScholarCross Ref
- Reichardt, J. and Bornholdt, S. 2006. Statistical mechanics of community detection. Phys. Rev. E 74, 016110.Google ScholarCross Ref
- Richardson, T., Mucha, P., and Porter, M. 2009. Spectral tripartitioning of networks. Phys. Rev. E 80, 036111.Google ScholarCross Ref
- Rotta, R. 2008. A multi-level algorithm for modularity graph clustering. M.S. thesis, Brandenburg University of Technology.Google Scholar
- Schaeffer, S. E. 2007. Graph clustering. Comput. Sci. Rev. 1, 1, 27--64. Google ScholarDigital Library
- Schuetz, P. and Caflisch, A. 2008a. Efficient modularity optimization by multistep greedy algorithm and vertex mover refinement. Phys. Rev. E 77, 046112.Google ScholarCross Ref
- Schuetz, P. and Caflisch, A. 2008b. Multistep greedy algorithm identifies community structure in real-world and computer-generated networks. Phys. Rev. E 78, 026112.Google ScholarCross Ref
- Siek, J., Lee, L., and Lumsdaine, A. 2002. The Boost Graph Library: User Guide and Reference Manual. Addison-Wesley, Upper Saddle River, NJ. Google ScholarDigital Library
- Sun, Y., Danila, B., Josić, K., and Bassler, K. 2009. Improved community structure detection using a modified fine-tuning strategy. EPL (Europhysics Letters) 86, 28004.Google ScholarCross Ref
- Wakita, K. and Tsurumi, T. 2007. Finding community structure in mega-scale social networks. http://www2007.org/posters/poster950.pdf.Google Scholar
- Walshaw, C. and Cross, M. 2000. Mesh partitioning: A multilevel balancing and refinement algorithm. SIAM J. Sci. Comput. 22, 1, 63--80. Google ScholarDigital Library
- White, S. and Smyth, P. 2005. A spectral clustering approach to finding communities in graphs. In Proceedings of the 5th SIAM International Conference on Data Mining (SDM'05). SIAM, Philadelphia, 274--285.Google Scholar
- Xu, G., Tsoka, S., and Papageorgiou, L. G. 2007. Finding community structures in complex networks using mixed integer optimisation. Eur. Phys. J. B 60, 2, 231--239.Google ScholarCross Ref
- Ye, Z., Hu, S., and Yu, J. 2008. Adaptive clustering algorithm for community detection in complex networks. Phys. Rev. E 78, 046115.Google ScholarCross Ref
- Zachary, W. W. 1977. An information flow model for conflict and fission in small groups. J. Anthropol. Res. 33, 452--473.Google ScholarCross Ref
Index Terms
- Multilevel local search algorithms for modularity clustering
Recommendations
Community detection by modularity maximization using GRASP with path relinking
Complex systems in diverse areas such as biology, sociology and physics are frequently being modelled as graphs, that provide the mathematical framework upon which small scale dynamics between the fundamental elements of the system can reveal large ...
Local search algorithms for the k-cardinality tree problem
In this paper we deal with an NP-hard combinatorial optimization problem, the k-cardinality tree problem in node-weighted graphs. This problem has several applications, which justify the need for efficient methods to obtain good solutions. We review ...
Local search versus Path Relinking in metaheuristics
Display Omitted Meta-RaPS is a recent metaheuristic for optimization problems that is simple and effective.The time consuming local search phase of Meta-RaPS is replaced by Path Relinking as a learning module.The results show that the newly designed ...
Comments