skip to main content
10.1145/2433396.2433471acmconferencesArticle/Chapter ViewAbstractPublication PageswsdmConference Proceedingsconference-collections
research-article

Overlapping community detection at scale: a nonnegative matrix factorization approach

Published:04 February 2013Publication History

ABSTRACT

Network communities represent basic structures for understanding the organization of real-world networks. A community (also referred to as a module or a cluster) is typically thought of as a group of nodes with more connections amongst its members than between its members and the remainder of the network. Communities in networks also overlap as nodes belong to multiple clusters at once. Due to the difficulties in evaluating the detected communities and the lack of scalable algorithms, the task of overlapping community detection in large networks largely remains an open problem.

In this paper we present BIGCLAM (Cluster Affiliation Model for Big Networks), an overlapping community detection method that scales to large networks of millions of nodes and edges. We build on a novel observation that overlaps between communities are densely connected. This is in sharp contrast with present community detection methods which implicitly assume that overlaps between communities are sparsely connected and thus cannot properly extract overlapping communities in networks. In this paper, we develop a model-based community detection algorithm that can detect densely overlapping, hierarchically nested as well as non-overlapping communities in massive networks. We evaluate our algorithm on 6 large social, collaboration and information networks with ground-truth community information. Experiments show state of the art performance both in terms of the quality of detected communities as well as in speed and scalability of our algorithm.

References

  1. Y.-Y. Ahn, J. P. Bagrow, and S. Lehmann. Link communities reveal multi-scale complexity in networks. Nature, 466:761--764, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  2. E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing. Mixed membership stochastic blockmodels. JMLR, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks: membership, growth, and evolution. In KDD'06, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. Google ScholarGoogle ScholarCross RefCross Ref
  5. R. L. Breiger. The duality of persons and groups. Social Forces, 53(2):181--190, 1974.Google ScholarGoogle ScholarCross RefCross Ref
  6. I. Dhillon, Y. Guan, and B. Kulis. Weighted graph cuts without eigenvectors: A multilevel approach. IEEE PAMI, 29(11):1944--1957, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. L. Feld. The focused organization of social ties. American J. of Sociology, 86(5):1015--1035, 1981.Google ScholarGoogle ScholarCross RefCross Ref
  8. S. Fortunato. Community detection in graphs. Physics Reports, 486(3-5):75--174, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  9. S. Fortunato and M. Barthélemy. Resolution limit in community detection. PNAS, 104(1):36--41, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  10. M. Girvan and M. Newman. Community structure in social and biological networks. PNAS, 99(12):7821--7826, 2002.Google ScholarGoogle ScholarCross RefCross Ref
  11. D. F. Gleich and C. Seshadhri. Neighborhoods are good communities. In KDD'12, 2012.Google ScholarGoogle Scholar
  12. M. S. Granovetter. The strength of weak ties. American J. of Sociology, 78:1360--1380, 1973.Google ScholarGoogle ScholarCross RefCross Ref
  13. S. Gregory. Fuzzy overlapping communities in networks. J. of Stat. Mech.: Theory and Experiment, 2011.Google ScholarGoogle Scholar
  14. C.-J. Hsieh, K.-Y. Chiang, and I. S. Dhillon. Low-rank modeling of signed networks. In KDD'12, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. C.-J. Hsieh and I. S. Dhillon. Fast coordinate descent methods with variable selection for non-negative matrix factorization. In KDD'11, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. G. Karypis and V. Kumar. Multilevel k-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed Computing, 48:96--129, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. N. Krogan et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature, 440(7084):637--643, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  18. S. Lattanzi and D. Sivakumar. Affiliation networks. In STOC'09, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 1999.Google ScholarGoogle Scholar
  20. J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 6(1):29--123, 2009.Google ScholarGoogle ScholarCross RefCross Ref
  21. C.-J. Lin. Projected gradient methods for nonnegative matrix factorization. Neural Computation, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. J. J. McAuley and J. Leskovec. Learning to discover social circles in ego networks. In NIPS, 2012.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27:415--444, 2001..Google ScholarGoogle ScholarCross RefCross Ref
  24. M. Newman. Modularity and community structure in networks. PNAS, 103(23):8577--8582, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  25. G. Palla, I. Derényi, I. Farkas, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814--818, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  26. W. W. Powell, D. R. White, K. W. Koput, and J. Owen-Smith. Network dynamics and field evolution: The growth of interorganizational collaboration in the life sciences. American J. of Sociology, 110(4):1132-1205, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  27. M. Rosvall and C. T. Bergstrom. Maps of random walks on complex networks reveal community structure. PNAS, 105:1118--1123, 2008.Google ScholarGoogle ScholarCross RefCross Ref
  28. S. Schaeffer. Graph clustering. Computer Science Review, 1(1):27--64, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. G. Simmel. Conflict and the web of group affiliations. Simon and Schuster, 1964.Google ScholarGoogle Scholar
  30. U. von Luxburg. A tutorial on spectral clustering. Technical Report 149, MPI for Biological Cybernetics, August 2006.Google ScholarGoogle Scholar
  31. D. J. Watts and S. H. Strogatz. Collective dynamics of small-world networks Nature, 393:440-442, 1998.Google ScholarGoogle Scholar
  32. J. Xie, S. Kelley, and B. K. Szymanski. Overlapping community detection in networks: the state of the art and comparative study ACM Computing Surveys, 45:4, 2013.Google ScholarGoogle Scholar
  33. J. Yang and J. Leskovec. Community-affiliation graph model for overlapping network community detection. In ICDM'12, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In ICDM'12, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. J. Yang and J. Leskovec. Structure and overlaps of communities in networks. In SNAKDD'12, 2012.Google ScholarGoogle Scholar
  36. E. Zheleva, H. Sharara, and L. Getoor. Co-evolution of social and affiliation networks. In KDD'09, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Overlapping community detection at scale: a nonnegative matrix factorization approach

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      WSDM '13: Proceedings of the sixth ACM international conference on Web search and data mining
      February 2013
      816 pages
      ISBN:9781450318693
      DOI:10.1145/2433396

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 February 2013

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate498of2,863submissions,17%

      Upcoming Conference

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader