ABSTRACT
Network communities represent basic structures for understanding the organization of real-world networks. A community (also referred to as a module or a cluster) is typically thought of as a group of nodes with more connections amongst its members than between its members and the remainder of the network. Communities in networks also overlap as nodes belong to multiple clusters at once. Due to the difficulties in evaluating the detected communities and the lack of scalable algorithms, the task of overlapping community detection in large networks largely remains an open problem.
In this paper we present BIGCLAM (Cluster Affiliation Model for Big Networks), an overlapping community detection method that scales to large networks of millions of nodes and edges. We build on a novel observation that overlaps between communities are densely connected. This is in sharp contrast with present community detection methods which implicitly assume that overlaps between communities are sparsely connected and thus cannot properly extract overlapping communities in networks. In this paper, we develop a model-based community detection algorithm that can detect densely overlapping, hierarchically nested as well as non-overlapping communities in massive networks. We evaluate our algorithm on 6 large social, collaboration and information networks with ground-truth community information. Experiments show state of the art performance both in terms of the quality of detected communities as well as in speed and scalability of our algorithm.
- Y.-Y. Ahn, J. P. Bagrow, and S. Lehmann. Link communities reveal multi-scale complexity in networks. Nature, 466:761--764, 2010.Google ScholarCross Ref
- E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing. Mixed membership stochastic blockmodels. JMLR, 2007. Google ScholarDigital Library
- L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks: membership, growth, and evolution. In KDD'06, 2006. Google ScholarDigital Library
- S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. Google ScholarCross Ref
- R. L. Breiger. The duality of persons and groups. Social Forces, 53(2):181--190, 1974.Google ScholarCross Ref
- I. Dhillon, Y. Guan, and B. Kulis. Weighted graph cuts without eigenvectors: A multilevel approach. IEEE PAMI, 29(11):1944--1957, 2007. Google ScholarDigital Library
- S. L. Feld. The focused organization of social ties. American J. of Sociology, 86(5):1015--1035, 1981.Google ScholarCross Ref
- S. Fortunato. Community detection in graphs. Physics Reports, 486(3-5):75--174, 2010.Google ScholarCross Ref
- S. Fortunato and M. Barthélemy. Resolution limit in community detection. PNAS, 104(1):36--41, 2007.Google ScholarCross Ref
- M. Girvan and M. Newman. Community structure in social and biological networks. PNAS, 99(12):7821--7826, 2002.Google ScholarCross Ref
- D. F. Gleich and C. Seshadhri. Neighborhoods are good communities. In KDD'12, 2012.Google Scholar
- M. S. Granovetter. The strength of weak ties. American J. of Sociology, 78:1360--1380, 1973.Google ScholarCross Ref
- S. Gregory. Fuzzy overlapping communities in networks. J. of Stat. Mech.: Theory and Experiment, 2011.Google Scholar
- C.-J. Hsieh, K.-Y. Chiang, and I. S. Dhillon. Low-rank modeling of signed networks. In KDD'12, 2012. Google ScholarDigital Library
- C.-J. Hsieh and I. S. Dhillon. Fast coordinate descent methods with variable selection for non-negative matrix factorization. In KDD'11, 2011. Google ScholarDigital Library
- G. Karypis and V. Kumar. Multilevel k-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed Computing, 48:96--129, 1998. Google ScholarDigital Library
- N. Krogan et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature, 440(7084):637--643, 2006.Google ScholarCross Ref
- S. Lattanzi and D. Sivakumar. Affiliation networks. In STOC'09, 2009. Google ScholarDigital Library
- D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 1999.Google Scholar
- J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 6(1):29--123, 2009.Google ScholarCross Ref
- C.-J. Lin. Projected gradient methods for nonnegative matrix factorization. Neural Computation, 2007. Google ScholarDigital Library
- J. J. McAuley and J. Leskovec. Learning to discover social circles in ego networks. In NIPS, 2012.Google ScholarDigital Library
- M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27:415--444, 2001..Google ScholarCross Ref
- M. Newman. Modularity and community structure in networks. PNAS, 103(23):8577--8582, 2006.Google ScholarCross Ref
- G. Palla, I. Derényi, I. Farkas, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814--818, 2005.Google ScholarCross Ref
- W. W. Powell, D. R. White, K. W. Koput, and J. Owen-Smith. Network dynamics and field evolution: The growth of interorganizational collaboration in the life sciences. American J. of Sociology, 110(4):1132-1205, 2005.Google ScholarCross Ref
- M. Rosvall and C. T. Bergstrom. Maps of random walks on complex networks reveal community structure. PNAS, 105:1118--1123, 2008.Google ScholarCross Ref
- S. Schaeffer. Graph clustering. Computer Science Review, 1(1):27--64, 2007. Google ScholarDigital Library
- G. Simmel. Conflict and the web of group affiliations. Simon and Schuster, 1964.Google Scholar
- U. von Luxburg. A tutorial on spectral clustering. Technical Report 149, MPI for Biological Cybernetics, August 2006.Google Scholar
- D. J. Watts and S. H. Strogatz. Collective dynamics of small-world networks Nature, 393:440-442, 1998.Google Scholar
- J. Xie, S. Kelley, and B. K. Szymanski. Overlapping community detection in networks: the state of the art and comparative study ACM Computing Surveys, 45:4, 2013.Google Scholar
- J. Yang and J. Leskovec. Community-affiliation graph model for overlapping network community detection. In ICDM'12, 2012. Google ScholarDigital Library
- J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In ICDM'12, 2012. Google ScholarDigital Library
- J. Yang and J. Leskovec. Structure and overlaps of communities in networks. In SNAKDD'12, 2012.Google Scholar
- E. Zheleva, H. Sharara, and L. Getoor. Co-evolution of social and affiliation networks. In KDD'09, 2009. Google ScholarDigital Library
Index Terms
- Overlapping community detection at scale: a nonnegative matrix factorization approach
Recommendations
Detecting cohesive and 2-mode communities indirected and undirected networks
WSDM '14: Proceedings of the 7th ACM international conference on Web search and data miningNetworks are a general language for representing relational information among objects. An effective way to model, reason about, and summarize networks, is to discover sets of nodes with common connectivity patterns. Such sets are commonly referred to as ...
Overlapping Community Detection by Node-Weighting
ICCDA '18: Proceedings of the 2nd International Conference on Compute and Data AnalysisCommunity detection is an important task with great practical value for understanding the structure and function of complex networks. However, in many social networks, a node may belong to more than one community. Thus, the detection of overlapping ...
Community detection in social networks using hybrid merging of sub-communities
Network vertices are often divided into groups or communities with dense connections within communities and sparse connections between communities. Community detection has recently attracted considerable attention in the field of data mining and social ...
Comments