research-article

Vertex neighborhoods, low conductance cuts, and good seeds for local community methods

Authors:
David F. Gleich

Purdue University, West Lafayette, IN, USA

Purdue University, West Lafayette, IN, USA
View Profile

,
C. Seshadhri

Sandia National Laboratories, Livermore, CA, USA

Sandia National Laboratories, Livermore, CA, USA
View Profile

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data miningAugust 2012Pages 597–605https://doi.org/10.1145/2339530.2339628

Published:12 August 2012Publication History

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 597–605

ABSTRACT

The communities of a social network are sets of vertices with more connections inside the set than outside. We theoretically demonstrate that two commonly observed properties of social networks, heavy-tailed degree distributions and large clustering coefficients, imply the existence of vertex neighborhoods (also known as egonets) that are themselves good communities. We evaluate these neighborhood communities on a range of graphs. What we find is that the neighborhood communities can exhibit conductance scores that are as good as the Fiedler cut. Also, the conductance of neighborhood communities shows similar behavior as the network community profile computed with a personalized PageRank community detection method. Neighborhood communities give us a simple and powerful heuristic for speeding up local partitioning methods. Since finding good seeds for the PageRank clustering method is difficult, most approaches involve an expensive sweep over a great many starting vertices. We show how to use neighborhood communities to quickly generate a small set of seeds.

Supplemental Material

306_t_talk_2.mp4

mp4

368.4 MB

Download

References

L. Akoglu, M. McGlohon, and C. Faloutsos. oddball: Spotting anomalies in weighted graphs. In M. Zaki, J. Yu, B. Ravindran, and V. Pudi, editors, Advances in Knowledge Discovery and Data Mining, volume 6119 of Lecture Notes in Computer Science, pages 410--421. Springer, 2010. Google ScholarDigital Library
R. Andersen, F. Chung, and K. Lang. Local graph partitioning using PageRank vectors. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, 2006. Google ScholarDigital Library
R. Andersen and K. J. Lang. Communities from seed sets. In Proceedings of the 15th international conference on the World Wide Web, pages 223--232. ACM Press, 2006. Google ScholarDigital Library
L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group formation in large social networks: membership, growth, and evolution. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD'06, pages 44--54. ACM, 2006. Google ScholarDigital Library
A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509--512, 1999.Google ScholarCross Ref
P. Boldi, M. Rosa, M. Santini, and S. Vigna. Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks. In Proceedings of the 20th WWW2011, pages 587--596, March 2011. Google ScholarDigital Library
P. Boldi and S. Vigna. The Webgraph Framework I: Compression techniques. In Proceedings of the 13th international conference on the World Wide Web, pages 595--602. ACM Press, 2004. Google ScholarDigital Library
F. Bonchi, P. Esfandiar, D. F. Gleich, C. Greif, and L. V. S. Lakshmanan. Fast matrix computations for pair-wise and column-wise commute times and katz scores. Internet Mathematics, To appear., 2011.Google Scholar
R. Burt. Structural Holes: The Social Structure of Competition. Harvard University Press, 1995.Google Scholar
T. CAIDA. http://www.caida.org/tools/measurement/skitter/router_topology/. Accessed in 2005.Google Scholar
F. R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1992.Google Scholar
J. Cohen. Graph twiddling in a MapReduce world. Computing in Science and Engineering, 11(4):29--41, 2009. Google ScholarDigital Library
M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology. SIGCOMM Comput. Commun. Rev., 29:251--262, August 1999. Google ScholarDigital Library
M. Fiedler. Algebraic connectivity of graphs. Czechoslovak Mathematical Journal, 23(98):298--305, 1973.Google ScholarCross Ref
U. Gargi, W. Lu, V. Mirrokni, and S. Yoon. Large-scale community detection on YouTube for topic discovery and exploration. In Proceedings of Fifth International AAAI Conference on Weblogs and Social Media, 2011.Google Scholar
J. Huang, H. Sun, Y. Liu, Q. Song, and T. Weninger. Towards online multiresolution community detection in large-scale networks. PLoS ONE, 6(8):e23829, August 2011.Google ScholarCross Ref
R. Kannan, S. Vempala, and A. Vetta. On clusterings: Good, bad and spectral. J. ACM, 51(3):497--515, May 2004. Google ScholarDigital Library
G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput., 20(1):359--392, 1998. Google ScholarDigital Library
J. Kleinberg, S. Suri, E. Tardos, and T. Wexler. Strategic network formation with structural holes. In Proceedings of the 9th ACM conference on Electronic commerce, EC '08, pages 284--293. ACM, 2008. Google ScholarDigital Library
D. E. Knuth. The Stanford GraphBase: A Platform for Combinatorial Computing. Addison-Wesley, 1993. Google Scholar
M. Kolountzakis, G. Miller, R. Peng, and C. Tsourakakis. Efficient triangle counting in large graphs via degree-based vertex partitioning. In Algorithms and Models for the Web-Graph, volume 6516 of Lecture Notes in Computer Science, pages 15--24. Springer, 2010.Google ScholarCross Ref
T. La Fond and J. Neville. Randomization tests for distinguishing social influence and homophily effects. In Proceedings of the 19th international conference on World wide web, WWW '10, pages 601--610. ACM, 2010. Google ScholarDigital Library
J. Leskovec, J. Kleinberg, and C. Faloutsos. Graph evolution: Densification and shrinking diameters. ACM Trans. Knowl. Discov. Data, 1:1--41, March 2007. Google ScholarDigital Library
J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Statistical properties of community structure in large social and information networks. In WWW '08: Proceeding of the 17th international conference on World Wide Web, pages 695--704. ACM, 2008. Google ScholarDigital Library
J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 6(1):29--123, September 2009.Google ScholarCross Ref
J. Leskovec, K. J. Lang, and M. Mahoney. Empirical comparison of algorithms for network community detection. In Proceedings of the 19th international conference on World wide web, WWW '10, pages 631--640. ACM, 2010. Google ScholarDigital Library
R. Matei, A. Iamnitchi, and P. Foster. Mapping the Gnutella network. Internet Computing, IEEE, 6(1):50--57, January 2002. Google ScholarDigital Library
M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27:415--444, 2001.Google ScholarCross Ref
M. Newman. http://www-personal.umich.edu/ mejn/netdata/, 2006.Google Scholar
M. E. J. Newman. Scientific collaboration networks. II: Shortest paths, weighted networks, and centrality. Phys. Rev. E, 64:016132, Jun 2001.Google ScholarCross Ref
B. S. Rees and K. B. Gallagher. Overlapping community detection by collective friendship group inference. In International Conference on Advances in Social Network Analysis andMining, pages 375--379, Los Alamitos, CA, USA, 2010. IEEE Computer Society. Google ScholarDigital Library
S. E. Schaeffer. Algorithms for Nonuniform Networks. PhD thesis, Helsinki University of Technology, 2006.Google Scholar
S. E. Schaeffer. Graph clustering. Computer Science Review, 1(1):27--64, 2007. Google ScholarDigital Library
S. B. Seidman. Network structure and minimum degree. Social Networks, 5(3):269--287, 1983.Google ScholarCross Ref
J. Shi and J. Malik. Normalized cuts and image segmentation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 22(8):888--905, August 2000. Google ScholarDigital Library
A. L. Traud, P. J. Mucha, and M. A. Porter. Social structure of facebook networks. arXiv, cs.SI:1102.2166, 2011.Google Scholar
S. Wasserman and K. Faust. Social network analysis: methods and applications. Cambridge University Press, 1994.Google ScholarCross Ref
D. J. Watts and S. H. Strogatz. Collective dynamics of "small-world" networks. Nature, 393(6684):440--442, 1998.Google ScholarCross Ref
C. Wilson, B. Boe, A. Sala, K. P. Puttaswamy, and B. Y. Zhao. User interactions in social networks and their implications. In Proceedings of the 4th ACM European conference on Computer systems, EuroSys '09, pages 205--218. ACM, 2009. Google ScholarDigital Library

Index Terms

Vertex neighborhoods, low conductance cuts, and good seeds for local community methods
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Cluster analysis

Recommendations

Mining Large Quasi-cliques with Quality Guarantees from Vertex Neighborhoods
KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Mining dense subgraphs is an important primitive across a spectrum of graph-mining tasks. In this work, we formally establish that two recurring characteristics of real-world graphs, namely heavy-tailed degree distributions and large clustering ...
Read More
On Vertex-Disjoint Triangles in Tripartite Graphs and Multigraphs
Abstract
Let G be a tripartite graph with tripartition $(V_{1}, V_{2}, V_{3})$ , where $∣ V_{1} ∣ = ∣ V_{2} ∣ = ∣ V_{3} ∣ = k > 0$ . It is proved that if $d (x) + d (y) \geq 3 k$ for every pair of nonadjacent vertices $x \in V_{i}, y \in V_{j}$ with $i \neq j (i, j \in {1, 2, 3})$ , then G contains k vertex-disjoint triangles. As a ... $\frac{}{}$ $_{}_{}_{}$ $_{}_{}_{}$ $_{}$
Read More
Contractible elements in k-connected graphs not containing some specified graphs

In [15], Thomassen proved that any triangle-free k-connected graph has a contractible edge. Starting with this result, there are several results concerning the existence of contractible elements in k-connected graphs which do not contain specified ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2012
1616 pages
ISBN:9781450314626
DOI:10.1145/2339530
General Chair:
Qiang Yang
Hong Kong University of Science and Technology
,
Program Chairs:
Deepak Agarwal
LinkedIn
,
Jian Pei
Simon Fraser University
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 12 August 2012
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
clustering coefficients
conductance
egonets
triangles
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate1,133of8,635submissions,13%
Upcoming Conference
KDD '24

Sponsor:

sigkdd

sigkdd

The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 25 - 29, 2024

Barcelona , Spain
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 130
  Total Citations
  View Citations
- 1,291
  Total Downloads
- Downloads (Last 12 months)34
- Downloads (Last 6 weeks)4
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Vertex neighborhoods, low conductance cuts, and good seeds for local community methods

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

ABSTRACT

Supplemental Material

References

Cited By

Index Terms

Recommendations

Mining Large Quasi-cliques with Quality Guarantees from Vertex Neighborhoods

On Vertex-Disjoint Triangles in Tripartite Graphs and Multigraphs

Contractible elements in k-connected graphs not containing some specified graphs