ABSTRACT
The communities of a social network are sets of vertices with more connections inside the set than outside. We theoretically demonstrate that two commonly observed properties of social networks, heavy-tailed degree distributions and large clustering coefficients, imply the existence of vertex neighborhoods (also known as egonets) that are themselves good communities. We evaluate these neighborhood communities on a range of graphs. What we find is that the neighborhood communities can exhibit conductance scores that are as good as the Fiedler cut. Also, the conductance of neighborhood communities shows similar behavior as the network community profile computed with a personalized PageRank community detection method. Neighborhood communities give us a simple and powerful heuristic for speeding up local partitioning methods. Since finding good seeds for the PageRank clustering method is difficult, most approaches involve an expensive sweep over a great many starting vertices. We show how to use neighborhood communities to quickly generate a small set of seeds.
Supplemental Material
- L. Akoglu, M. McGlohon, and C. Faloutsos. oddball: Spotting anomalies in weighted graphs. In M. Zaki, J. Yu, B. Ravindran, and V. Pudi, editors, Advances in Knowledge Discovery and Data Mining, volume 6119 of Lecture Notes in Computer Science, pages 410--421. Springer, 2010. Google ScholarDigital Library
- R. Andersen, F. Chung, and K. Lang. Local graph partitioning using PageRank vectors. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, 2006. Google ScholarDigital Library
- R. Andersen and K. J. Lang. Communities from seed sets. In Proceedings of the 15th international conference on the World Wide Web, pages 223--232. ACM Press, 2006. Google ScholarDigital Library
- L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks: membership, growth, and evolution. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD'06, pages 44--54. ACM, 2006. Google ScholarDigital Library
- A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509--512, 1999.Google ScholarCross Ref
- P. Boldi, M. Rosa, M. Santini, and S. Vigna. Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks. In Proceedings of the 20th WWW2011, pages 587--596, March 2011. Google ScholarDigital Library
- P. Boldi and S. Vigna. The Webgraph Framework I: Compression techniques. In Proceedings of the 13th international conference on the World Wide Web, pages 595--602. ACM Press, 2004. Google ScholarDigital Library
- F. Bonchi, P. Esfandiar, D. F. Gleich, C. Greif, and L. V. S. Lakshmanan. Fast matrix computations for pair-wise and column-wise commute times and katz scores. Internet Mathematics, To appear., 2011.Google Scholar
- R. Burt. Structural Holes: The Social Structure of Competition. Harvard University Press, 1995.Google Scholar
- T. CAIDA. http://www.caida.org/tools/measurement/skitter/router_topology/. Accessed in 2005.Google Scholar
- F. R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1992.Google Scholar
- J. Cohen. Graph twiddling in a MapReduce world. Computing in Science and Engineering, 11(4):29--41, 2009. Google ScholarDigital Library
- M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology. SIGCOMM Comput. Commun. Rev., 29:251--262, August 1999. Google ScholarDigital Library
- M. Fiedler. Algebraic connectivity of graphs. Czechoslovak Mathematical Journal, 23(98):298--305, 1973.Google ScholarCross Ref
- U. Gargi, W. Lu, V. Mirrokni, and S. Yoon. Large-scale community detection on YouTube for topic discovery and exploration. In Proceedings of Fifth International AAAI Conference on Weblogs and Social Media, 2011.Google Scholar
- J. Huang, H. Sun, Y. Liu, Q. Song, and T. Weninger. Towards online multiresolution community detection in large-scale networks. PLoS ONE, 6(8):e23829, August 2011.Google ScholarCross Ref
- R. Kannan, S. Vempala, and A. Vetta. On clusterings: Good, bad and spectral. J. ACM, 51(3):497--515, May 2004. Google ScholarDigital Library
- G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput., 20(1):359--392, 1998. Google ScholarDigital Library
- J. Kleinberg, S. Suri, E. Tardos, and T. Wexler. Strategic network formation with structural holes. In Proceedings of the 9th ACM conference on Electronic commerce, EC '08, pages 284--293. ACM, 2008. Google ScholarDigital Library
- D. E. Knuth. The Stanford GraphBase: A Platform for Combinatorial Computing. Addison-Wesley, 1993. Google Scholar
- M. Kolountzakis, G. Miller, R. Peng, and C. Tsourakakis. Efficient triangle counting in large graphs via degree-based vertex partitioning. In Algorithms and Models for the Web-Graph, volume 6516 of Lecture Notes in Computer Science, pages 15--24. Springer, 2010.Google ScholarCross Ref
- T. La Fond and J. Neville. Randomization tests for distinguishing social influence and homophily effects. In Proceedings of the 19th international conference on World wide web, WWW '10, pages 601--610. ACM, 2010. Google ScholarDigital Library
- J. Leskovec, J. Kleinberg, and C. Faloutsos. Graph evolution: Densification and shrinking diameters. ACM Trans. Knowl. Discov. Data, 1:1--41, March 2007. Google ScholarDigital Library
- J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Statistical properties of community structure in large social and information networks. In WWW '08: Proceeding of the 17th international conference on World Wide Web, pages 695--704. ACM, 2008. Google ScholarDigital Library
- J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 6(1):29--123, September 2009.Google ScholarCross Ref
- J. Leskovec, K. J. Lang, and M. Mahoney. Empirical comparison of algorithms for network community detection. In Proceedings of the 19th international conference on World wide web, WWW '10, pages 631--640. ACM, 2010. Google ScholarDigital Library
- R. Matei, A. Iamnitchi, and P. Foster. Mapping the Gnutella network. Internet Computing, IEEE, 6(1):50--57, January 2002. Google ScholarDigital Library
- M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27:415--444, 2001.Google ScholarCross Ref
- M. Newman. http://www-personal.umich.edu/ mejn/netdata/, 2006.Google Scholar
- M. E. J. Newman. Scientific collaboration networks. II: Shortest paths, weighted networks, and centrality. Phys. Rev. E, 64:016132, Jun 2001.Google ScholarCross Ref
- B. S. Rees and K. B. Gallagher. Overlapping community detection by collective friendship group inference. In International Conference on Advances in Social Network Analysis andMining, pages 375--379, Los Alamitos, CA, USA, 2010. IEEE Computer Society. Google ScholarDigital Library
- S. E. Schaeffer. Algorithms for Nonuniform Networks. PhD thesis, Helsinki University of Technology, 2006.Google Scholar
- S. E. Schaeffer. Graph clustering. Computer Science Review, 1(1):27--64, 2007. Google ScholarDigital Library
- S. B. Seidman. Network structure and minimum degree. Social Networks, 5(3):269--287, 1983.Google ScholarCross Ref
- J. Shi and J. Malik. Normalized cuts and image segmentation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 22(8):888--905, August 2000. Google ScholarDigital Library
- A. L. Traud, P. J. Mucha, and M. A. Porter. Social structure of facebook networks. arXiv, cs.SI:1102.2166, 2011.Google Scholar
- S. Wasserman and K. Faust. Social network analysis: methods and applications. Cambridge University Press, 1994.Google ScholarCross Ref
- D. J. Watts and S. H. Strogatz. Collective dynamics of "small-world" networks. Nature, 393(6684):440--442, 1998.Google ScholarCross Ref
- C. Wilson, B. Boe, A. Sala, K. P. Puttaswamy, and B. Y. Zhao. User interactions in social networks and their implications. In Proceedings of the 4th ACM European conference on Computer systems, EuroSys '09, pages 205--218. ACM, 2009. Google ScholarDigital Library
Index Terms
- Vertex neighborhoods, low conductance cuts, and good seeds for local community methods
Recommendations
Mining Large Quasi-cliques with Quality Guarantees from Vertex Neighborhoods
KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data MiningMining dense subgraphs is an important primitive across a spectrum of graph-mining tasks. In this work, we formally establish that two recurring characteristics of real-world graphs, namely heavy-tailed degree distributions and large clustering ...
On Vertex-Disjoint Triangles in Tripartite Graphs and Multigraphs
AbstractLet G be a tripartite graph with tripartition , where . It is proved that if for every pair of nonadjacent vertices with , then G contains k vertex-disjoint triangles. As a ...
Contractible elements in k-connected graphs not containing some specified graphs
In [15], Thomassen proved that any triangle-free k-connected graph has a contractible edge. Starting with this result, there are several results concerning the existence of contractible elements in k-connected graphs which do not contain specified ...
Comments