Abstract
This article reviews the state-of-the-art in overlapping community detection algorithms, quality measures, and benchmarks. A thorough comparison of different algorithms (a total of fourteen) is provided. In addition to community-level evaluation, we propose a framework for evaluating algorithms' ability to detect overlapping nodes, which helps to assess overdetection and underdetection. After considering community-level detection performance measured by normalized mutual information, the Omega index, and node-level detection performance measured by F-score, we reached the following conclusions. For low overlapping density networks, SLPA, OSLOM, Game, and COPRA offer better performance than the other tested algorithms. For networks with high overlapping density and high overlapping diversity, both SLPA and Game provide relatively stable performance. However, test results also suggest that the detection in such networks is still not yet fully resolved. A common feature observed by various algorithms in real-world networks is the relatively small fraction of overlapping nodes (typically less than 30%), each of which belongs to only 2 or 3 communities.
- Ahn, Y.-Y., Bagrow, J. P., and Lehmann, S. 2010. Link communities reveal multiscale complexity in networks. Nature 466, 761--764.Google ScholarCross Ref
- Ankerst, M., Breunig, M. M., Kriegel, H.-P., and Sander, J. 1999. Optics: Ordering points to identify the clustering structure. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 49--60. Google ScholarDigital Library
- Arenas, A., Diaz-Guilera, A., and Perez-Vicente, C. J. 2006. Synchronization reveals topological scales in complex networks. Phys. Rev. Lett. 96, 11.Google ScholarCross Ref
- Ball, B., Karrer, B., and Newman, M. E. J. 2011. Efficient and principled method for detecting communities in networks. Phys. Rev. E 84, 3.Google ScholarCross Ref
- Baumes, J., Goldberg, M., Krishnamoorthy, M., Magdon-Ismail, M., and Preston, N. 2005. Finding communities by clustering a graph into overlapping subgraphs. In Proceedings of the IADIS International Conference on Applied Computing. 97--104.Google Scholar
- Bianconi, G., Pin, P., and Marsili, M. 2008. Assessing the relevance of node features for network structure. Proc. Natl. Acad. Sci. USA 106, 28, 7.Google Scholar
- Blatt, M., Wiseman, S., and Domany, E. 1996. Superparamagnetic clustering of data. Phys. Rev. Lett. 76, 3251--3254.Google ScholarCross Ref
- Boguna, M., Pastor-Satorras, R., Diaz-Guilera, A., and Arenas, A. 2004. Models of social networks based on social distance attachment. Phys. Rev. E 70, 5.Google ScholarCross Ref
- Breve, F., Zhao, L., and Quiles, M. 2009. Uncovering overlap community structure in complex networks using particle competition. In Proceedings of the International Conference on Artificial Intelligence and Computational Intelligence (AICI'09). 619--628. Google ScholarDigital Library
- Campello, R. J. G. B. 2007. A fuzzy extension of the rand index and other related indexes for clustering and classification assessment. Pattern Recogn. Lett. 28, 7, 833--841. Google ScholarDigital Library
- Campello., R. J. G. B. 2010. Generalized external indexes for comparing data partitions with overlapping categories. Pattern Recogn. Lett. 31, 9, 966--975. Google ScholarDigital Library
- Cazabet, R., Amblard, F., and Hanachi, C. 2010. Detection of overlapping communities in dynamical social networks. In Proceedings of the 2nd IEEE International Conference on Social Computing (SOCIALCOM'10). 309--314. Google ScholarDigital Library
- Chen, D., Shang, M., Lv, Z., and Fu, Y. 2010a. Detecting overlapping communities of weighted networks via a local algorithm. Physica A 389, 19, 4177--4187.Google ScholarCross Ref
- Chen, J., Zaiane, O. R., and Goebel, R. 2009. A visual data mining approach to find overlapping communities in networks. In Proceeding of the International Conference on Advances in Social Network Analysis and Mining (ASONAM'09). 338--343. Google ScholarDigital Library
- Chen, W., Liu, Z., Sun, X., and Wang, Y. 2010b. A game-theoretic framework to identify overlapping communities in social networks. Data Mining Knowl. Discov. 21, 2, 224--240. Google ScholarDigital Library
- Collins, L. M. and Dent, C. W. 1988. Omega: A general formulation of the rand index of cluster recovery suitable for non-disjoint solutions. Multivar. Behav. Res. 23, 2, 231--242.Google ScholarCross Ref
- Condon, A. and Karp, R. M. 2001. Algorithms for graph partitioning on the planted bisection model. Rand. Struct. Algor. 18, 116--140. Google ScholarDigital Library
- Danon, L., Duch, J., Arenas, A., and Diaz-Guilera, A. 2005. Comparing community structure identification. J. Stat. Mech. Thoer. Exp. 2005, 9.Google ScholarCross Ref
- Davis, G. B. and Carley, K. 2008. Clearing the fog: Fuzzy, overlapping groups for social networks. Soc. Netw. 30, 3, 201--212.Google ScholarCross Ref
- Ding, F., Luo, Z., Shi, J., and Fang, X. 2010. Overlapping community detection by kernel-based fuzzy affinity propagation. In Proceedings of the International Workshop on Indoor Spatial Awareness (ISA'10). 1--4.Google Scholar
- Du, N., Wang, B., and Wu, B. 2008. Overlapping community structure detection in networks. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM'08). 1371--1372. Google ScholarDigital Library
- Evans, T. 2010. Clique graphs and overlapping communities. J. Stat. Mech.-Theor. Exp. 2010, 12.Google ScholarCross Ref
- Evans, T. and Lambiotte, R. 2010. Line graphs of weighted networks for overlapping communities. Euro. Phys. J. B 77, 265.Google Scholar
- Evans, T. S. and Lambiotte, R. 2009. Line graphs, link partitions and overlapping communities. Phys. Rev. E 80, 1.Google ScholarCross Ref
- Farkas, I., Abel, D., Palla, G., and Vicsek, T. 2007. Weighted network modules. New J. Phys. 9, 6, 180.Google ScholarCross Ref
- Fisher, D. C. 1989. Lower bounds on the number of triangles in a graph. J. Graph Theor. 13, 4, 505--512.Google ScholarCross Ref
- Fortunato, S. 2010. Community detection in graphs. Phys. Rep. 486, 75--174.Google ScholarCross Ref
- Frey, B. J. and Dueck, D. 2007. Clustering by passing messages between data points. Sci. 315, 972--976.Google ScholarCross Ref
- Fu, Q. and Banerjee, A. 2008. Multiplicative mixture models for overlapping clustering. In Proceedings of the International Conference on Data Mining (ICDM'08). 791--796. Google ScholarDigital Library
- Geweniger, T., Zuhlke, D., Hammer, B., and Villmann, T. 2009. Fuzzy variant of affinity propagation in comparison to median fuzzy c-means. In Proceedings of the 7th International Workshop on Self Organizing Maps (WSOM'09). 72--79. Google ScholarDigital Library
- Gfeller, D., Chappelier, J.-C., and de Los Rios, P. 2005. Finding instabilities in the community structure of complex networks. Phys. Rev. E 72, 5.Google ScholarCross Ref
- Girvan, M. and Newman, M. E. J. 2002. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99, 12, 7821--7826.Google ScholarCross Ref
- Gregory, S. 2007. An algorithm to find overlapping community structure in networks. In Proceedings of the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases. 91--102. Google ScholarDigital Library
- Gregory, S. 2008. A fast algorithm to find overlapping communities in networks. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases. Lecture Notes in Computer Science, vol. 5211, Springer, 408--423. Google ScholarDigital Library
- Gregory, S. 2009. Finding overlapping communities using disjoint community detection algorithms. CompleNet 207, 47--61.Google Scholar
- Gregory, S. 2010. Finding overlapping communities in networks by label propagation. New J. Phys. 12, 10.Google ScholarCross Ref
- Gregory, S. 2011. Fuzzy overlapping communities in networks. J. Stat. Mech. 2011, 2.Google ScholarCross Ref
- Guimera, R., Sales-Pardo, M., and Amaral, L. A. N. 2004. Modularity from fluctuations in random graphs and complex networks. Phys. Rev. E 70, 2.Google ScholarCross Ref
- Havemann, F., Heinz, M., Struck, A., and Glaser, J. 2011. Identification of overlapping communities and their hierarchy by locally calculating community-changing resolution levels. J. Statist. Mech. 2011, 1.Google ScholarCross Ref
- Hubert, L. and Arabie, P. 1985. Comparing partitions. J. Classif. 2, 193--218.Google ScholarCross Ref
- Hullermeier, E. and Rifqi, M. 2009. A fuzzy variant of the rand index for comparing clustering structures. In Proceedings of the Joint International Fuzzy Systems Association World Congress and European Society of Fuzzy Logic and Technology Conference. 1294--1298.Google Scholar
- Jin, D., Yang, B., Baquero, C., Liu, D., He, D., and Liu, J. 2011. A markov random walk under constraint for discovering overlapping communities in complex networks. J. Statist. Mech. 2011, 5.Google ScholarCross Ref
- Karrer, B., Levina, E., and Newman, M. E. J. 2008. Robustness of community structure in networks. Phys. Rev. E 77, 4.Google ScholarCross Ref
- Kelley, S. 2009. The existence and discovery of overlapping communities in large-scale networks. Ph.D. thesis, Rensselaer Polytechnic Institute, Troy, NY.Google Scholar
- Kelley, S., Goldberg, M., Magdon-Ismail, M., Mertsalov, K., and Wallace, A. 2011. Defining and discovering communities in social networks. In Handbook of Optimization in Complex Networks, Springer, 139--168.Google Scholar
- Kim, Y. and Jeong, H. 2011. The map equation for link community (unpublished). http://stat.kaist.ac.kr/∼hjeong/papers/2011_Map.pdf.Google Scholar
- Kovacs, I. A., Palotai, R., Szalay, M., and Csermely, P. 2010. Community landscapes: An integrative approach to determine overlapping network module hierarchy, identify key nodes and predict network dynamics. PLoS ONE 5, 9.Google ScholarCross Ref
- Kumpula, J. M., Kivela, M., Kaski, K., and Saramaki, J. 2008. Sequential algorithm for fast clique percolation. Phys. Rev. E 78, 2.Google ScholarCross Ref
- Lancichinetti, A. and Fortunato, S. 2009. Community detection algorithms: A comparative analysis. Phys. Rev. E 80, 5.Google ScholarCross Ref
- Lancichinetti, A., Fortunato, S., and Kertesz, J. 2009. Detecting the overlapping and hierarchical community structure of complex networks. New J. Phys. 11, 3.Google ScholarCross Ref
- Lancichinetti, A., Fortunato, S., and Radicchi, F. 2008. Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78, 4.Google ScholarCross Ref
- Lancichinetti, A., Radicchi, F., Ramasco, J. J., and Fortunato, S. 2011. Finding statistically significant communities in networks. PLoS ONE 6, 4.Google ScholarCross Ref
- Langfelder, P. and Horvath, S. 2008. WGCNA: An r package for weighted correlation network analysis. BMC Bioinf. 1, 559.Google ScholarCross Ref
- Latouche, P., Birmele, E., and Ambroise, C. 2011. Overlapping stochastic block models with application to the french political blogosphere. Annals Appl. Statist. 5, 309--336.Google ScholarCross Ref
- Lee, C., Reid, F., Mcdaid, A., and Hurley, N. 2010. Detecting highly overlapping community structure by greedy clique expansion. In Proceedings of the 4th Workshop on Social Network Mining and Analysis held in Conjunction with the International Conference on Knowledge Discovery and Data Mining (SNA/KDD'10). 33--42.Google Scholar
- Leskovec, J., Adamic, L. A., and Huberman, B. A. 2007a. The dynamics of viral marketing. ACM Trans. Web 1, 5. Google ScholarDigital Library
- Leskovec, J., Kleinberg, J., and Faloutsos, C. 2007b. Graph evolution: Densification and shrinking diameters. ACM Trans. Knowl. Disc. Data 1, 1. Google ScholarDigital Library
- Leskovec, J., Lang, K. J., Dasgupta, A., and Mahoney, M. W. 2009. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Math. 6, 29--123.Google ScholarCross Ref
- Leskovec, J., Lang, K. J., and Mahoney, M. W. 2010. Empirical comparison of algorithms for network community detection. In Proceedings of the 19th Conference on World Wide Web (WWW'10). 631--640. Google ScholarDigital Library
- Li, D., Leyva, I., Almendral, J., Sendina-Nadal, I., Buldu, J., Havlin, S., and Boccaletti, S. 2008. Synchronization interfaces and overlapping communities in complex networks. Phys. Rev. Lett. 101, 16.Google ScholarCross Ref
- Lu, Q., Korniss, G., and Szymanski, B. K. 2009. The naming game in social networks: Community formation and consensus engineering. J. Econ. Interact. Coord. 4, 221--235.Google ScholarCross Ref
- Magdon-Ismail, M. and Purnell, J. 2011. Fast overlapping clustering of networks using sampled spectral distance embedding and gmms. Tech. rep., Rensselaer Polytechnic Institute, Troy, NY.Google Scholar
- Massen, C. and Doye, J. 2005. Identifying communities within energy landscapes. Phys. Rev. E 71, 4.Google ScholarCross Ref
- Massen, C. and Doye, J. 2007. Thermodynamics of community structure. Preprint arXiv:con-mat/0610077v1.Google Scholar
- Mcdaid, A. and Hurley, N. 2010. Detecting highly overlapping communities with model-based overlapping seed expansion. In Proceedings of the International Conference on Advances in Social Networks Analysis and Mining (ASONAM'10). 112--119. Google ScholarDigital Library
- Molloy, M. and Reed, B. 1995. A critical point for random graphs with a given degree sequence. Rand. Struct. Algor. 6, 161--179. Google ScholarDigital Library
- Moon, J. and Moser, L. 1965. On cliques in graphs. Israel J. Math. 3, 23--28.Google ScholarCross Ref
- Nepusz, T., Petroczi, A., Negyessy, L., and Bazso, F. 2008. Fuzzy communities and the concept of bridgeness in complex networks. Phys. Rev. E 77, 1.Google ScholarCross Ref
- Newman, M. E. J. 2006. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 3.Google ScholarCross Ref
- Newman, M. E. J. and Leicht, E. A. 2007. Mixture models and exploratory analysis in networks. Proc. Natl. Acad. Sci. USA 104, 9564--9569.Google ScholarCross Ref
- Newman, M. E. J., Strogatz, S. H., and Watts, D. J. 2001. Random graphs with arbitrary degree distributions and their applications. Phys. Rev. E 64, 2.Google ScholarCross Ref
- Nicosia, V., Mangioni, G., Carchiolo, V., and Malgeri, M. 2009. Extending the definition of modularity to directed graphs with overlapping communities. J. Stat. Mech. 2009, 3.Google ScholarCross Ref
- Nowicki, K. and Snijders, T. A. B. 2001. Estimation and prediction for stochastic blockstructures. J. Amer. Statist. Assoc. 96, 455, 1077--1087.Google ScholarCross Ref
- Padrol-Sureda, A., Perarnau-Llobet, G., Pfeifle, J., and Munts-Mulero, V. 2010. Overlapping community search for social networks. In Proceedings of the 26th International Conference on Data Engineering (ICDE'10). 992--995.Google Scholar
- Palla, G., Derenyi, I., Farkas, I., and Vicsek, T. 2005. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814--818.Google ScholarCross Ref
- Psorakis, I., Roberts, S., Ebden, M., and Sheldon, B. 2011. Overlapping community detection using bayesian non-negative matrix factorization. Phys. Rev. E 83, 6.Google ScholarCross Ref
- Raghavan, U. N., Albert, R., and Kumara, S. 2007. Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 76, 3.Google ScholarCross Ref
- Rees, B. and Gallagher, K. 2010. Overlapping community detection by collective friendship group inference. In Proceedings of the International Conference on Advances in Social Network Analysis and Mining (ASONAM'10). 375--379. Google ScholarDigital Library
- Reichardt, J. and Bornholdt, S. 2004. Detecting fuzzy community structures in complex networks with a potts model. Phys. Rev. Lett. 93, 2.Google ScholarCross Ref
- Reichardt., J. and Bornholdt, S. 2006a. Statistical mechanics of community detection. Phys. Rev. E 74, 1.Google ScholarCross Ref
- Reichardt, J. and Bornholdt, S. 2006b. When are networks truly modular? Physica D224, 20--26.Google Scholar
- Reid, F., Mcdaid, A. F., and Hurley, N. J. 2011. Partitioning breaks communities. In Proceedings of the International Conference on Advances in Social Networks Analysis and Mining (ASONAM'11). 102--109. Google ScholarDigital Library
- Ren, W., Yan, G., Liao, X., and Xiao, L. 2009. Simple probabilistic algorithm for detecting community structure. Phys. Rev. E 79, 3.Google ScholarCross Ref
- Richardson, M., Agrawal, R., and Domingos, P. 2003. Trust management for the semantic web. In Proceedings of the 2nd International Semantic Web Conference (ISWC'03). Lecture Notes in Computer Science, vol. 2870. Springer, 351--368.Google Scholar
- Ripeanu, M., Foster, I., and Iamnitchi, A. 2002. Mapping the gnutella network: Properties of large-scale peer-to-peer systems and implications for system design. IEEE Internet Comput. J. 6, 1, 50--57. Google ScholarDigital Library
- Ronhovde, P. and Nussinov, Z. 2009. Multiresolution community detection for megascale networks by information-based replica correlations. Phys. Rev. E 80, 1.Google ScholarCross Ref
- Rosvall, M. and Bergstrom, C. T. 2008. Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. 105, 1118--1123.Google ScholarCross Ref
- Sawardecker, E., Sales-Pardo, M., and Amaral, L. 2009. Detection of node group membership in networks with group overlap. Euro. Phys. J. B67, 277.Google Scholar
- Shen, H., Cheng, X., Cai, K., and Hu, M.-B. 2009a. Detect overlapping and hierarchical community structure. Physica A388, 1706.Google ScholarCross Ref
- Shen, H., Cheng, X., and Guo, J. 2009b. Quantifying and identifying the overlapping community structure in networks. J. Stat. Mech. 2009, 7, 9.Google ScholarCross Ref
- Wang, X., Jiao, L., and Wu, J. 2009. Adjusting from disjoint to overlapping community detection of complex networks. Physica A388, 5045--5056.Google ScholarCross Ref
- White, S. and Smyth, P. 2005. A spectral clustering approach to finding communities in graphs. In Proceedings of the SIAM International Conference on Data Mining. 76--84.Google Scholar
- Wu, Z., Lin, Y., Wan, H., and Tian, S. 2010. A fast and reasonable method for community detection with adjustable extent of overlapping. In Proceedings of the Conference on Intelligent Systems and Knowledge Engineering (ISKE'10). 376--379.Google Scholar
- Xie, J. and Szymanski, B. K. 2011. Community detection using a neighborhood strength driven label propagation algorithm. In Proceedings of the IEEE Network Science Workshop (NSW'11). 188--195. Google ScholarDigital Library
- Xie, J. and Szymanski, B. K. 2012. Towards linear time overlapping community detection in social networks. In Proceedings of the 16th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD'12). 25--36. Google ScholarDigital Library
- Xie, J., Szymanski, B. K., and Liu, X. 2011. SLPA: Uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process. In Proceedings of the 11th IEEE International Conference on Data Mining Workshops (ICDMW'11). 344--349. Google ScholarDigital Library
- Zarei, M., Izadi, D., and Samani, K. A. 2009. Detecting overlapping community structure of networks based on vertex-vertex correlations. J. Stat. Mech. 2009, 11.Google ScholarCross Ref
- Zhang, S., Wang, R.-S., and Zhang, X.-S. 2007a. Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A374, 483--490.Google ScholarCross Ref
- Zhang, S., Wang, R.-S., and Zhang, X.-S. 2007b. Uncovering fuzzy community structure in complex networks. Phys. Rev. E 76, 4.Google ScholarCross Ref
- Zhang, Y., Wang, J., Wang, Y., and Zhou, L. 2009. Parallel community detection on large networks with propinquity dynamics. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD'09). 997--1006. Google ScholarDigital Library
- Zhao, K., Zhang, S.-W., and Pan, Q. 2010. Fuzzy analysis for overlapping community structure of complex network. In Proceedings of the Chinese Control and Decision Conference (CCDC'10). 3976--3981.Google Scholar
Index Terms
- Overlapping community detection in networks: The state-of-the-art and comparative study
Recommendations
Overlapping community detection at scale: a nonnegative matrix factorization approach
WSDM '13: Proceedings of the sixth ACM international conference on Web search and data miningNetwork communities represent basic structures for understanding the organization of real-world networks. A community (also referred to as a module or a cluster) is typically thought of as a group of nodes with more connections amongst its members than ...
Detecting highly overlapping community structure based on maximal clique networks
ASONAM '14: Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and MiningMost of overlapping community detection algorithms cannot be applied to networks with highly overlapping community such as online social networks where individuals belong to many communities. One important reason is that many algorithms detect ...
Overlapping Community Detection by Node-Weighting
ICCDA '18: Proceedings of the 2nd International Conference on Compute and Data AnalysisCommunity detection is an important task with great practical value for understanding the structure and function of complex networks. However, in many social networks, a node may belong to more than one community. Thus, the detection of overlapping ...
Comments