research-article

GEMSEC: graph embedding with self clustering

Authors:
Benedek Rozemberczki

The University of Edinburgh

The University of Edinburgh
View Profile

,
Ryan Davies

The University of Edinburgh

The University of Edinburgh
View Profile

,
Rik Sarkar

The University of Edinburgh

The University of Edinburgh
View Profile

,
Charles Sutton

The University of Edinburgh

The University of Edinburgh
View Profile

ASONAM '19: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and MiningAugust 2019Pages 65–72https://doi.org/10.1145/3341161.3342890

Published:15 January 2020Publication History

ASONAM '19: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Pages 65–72

ABSTRACT

Modern graph embedding procedures can efficiently process graphs with millions of nodes. In this paper, we propose GEMSEC - a graph embedding algorithm which learns a clustering of the nodes simultaneously with computing their embedding. GEMSEC is a general extension of earlier work in the domain of sequence-based graph embedding. GEMSEC places nodes in an abstract feature space where the vertex features minimize the negative log-likelihood of preserving sampled vertex neighborhoods, and it incorporates known social network properties through a machine learning regularization. We present two new social network datasets and show that by simultaneously considering the embedding and clustering problems with respect to social properties, GEMSEC extracts high-quality clusters competitive with or superior to other community detection algorithms. In experiments, the method is found to be computationally efficient and robust to the choice of hyperparameters.

References

T. Van Laarhoven and E. Marchiori, "Robust community detection methods with resolution parameter for complex detection in protein protein interaction networks," Pattern Recognition in Bioinformatics, pp. 1--13, 2012.Google Scholar
L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan, "Group formation in large social networks: Membership, growth, and evolution," in Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2006, pp. 44--54.Google Scholar
S. Papadopoulos, Y. Kompatsiaris, A. Vakali, and P. Spyridonos, "Community detection in social media," Data Mining and Knowledge Discovery, vol. 24, no. 3, pp. 515--554, 2012.Google ScholarDigital Library
J. Leskovec, A. Rajaraman, and J. D. Ullman, Mining of Massive Datasets. Cambridge University Press, 2014.Google ScholarDigital Library
P. Pascal and M. Latapy, In International Symposium on Computer and Information Sciences. Springer Berlin Heidelberg, 2005, ch. Computing Communities in Large Networks Using Random Walks., pp. 284--293.Google Scholar
S. Gregory, "Finding overlapping communities in networks by label propagation," New Journal of Physics, vol. 12, no. 10, p. 103018, 2010.Google ScholarCross Ref
P. Goyal and E. Ferrara, "Graph embedding techniques, applications, and performance: A survey," arXiv preprint arXiv:1705.02801, 2017.Google Scholar
B. Perozzi, R. Al-Rfou, and S. Skiena, "Deepwalk: Online learning of social representations." in Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining., 2014.Google Scholar
A. Grover and J. Leskovec, "Node2vec: Scalable feature learning for networks," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855--864.Google Scholar
W. W. Zachary, "An information flow model for conflict and fission in small groups," Journal of anthropological research, vol. 33, no. 4, pp. 452--473, 1977.Google ScholarCross Ref
T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," 2013.Google Scholar
J. B. Diederik P. Kingma, "Adam: A method for stochastic optimization," in Proceedings of the 3rd International Conference on Learning Representations (ICLR), 2015.Google Scholar
S. Cavallari, V. W. Zheng, H. Cai, K. C.-C. Chang, and E. Cambria, "Learning community embedding with community detection and node embedding on graphs," in Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017, pp. 377--386.Google Scholar
X. Wang, P. Cui, J. Wang, J. Pei, W. Zhu, and S. Yang, "Community preserving network embedding." in AAAI, 2017, pp. 203--209.Google ScholarDigital Library
F. Ye, C. Chen, and Z. Zheng, "Deep autoencoder-like nonnegative matrix factorization for community detection," in Proceedings of the 27th ACM International Conference on Information and Knowledge Management. ACM, 2018, pp. 1393--1402.Google Scholar
J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei, "Line: Large-scale information network embedding," in Proceedings of the 24th International Conference on World Wide Web, 2015, pp. 1067--1077.Google Scholar
B. Perozzi, V. Kulkarni, H. Chen, and S. Skiena, "Don't walk, skip!: Online learning of multi-scale network embeddings," in Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, 2017, pp. 258--265.Google Scholar
J. Fakcharoenphol, S. Rao, and K. Talwar, "A tight bound on approximating arbitrary metrics by tree metrics," Journal of Computer and System Sciences, vol. 69, no. 3, pp. 485--497, 2004.Google ScholarDigital Library
J. Matoušek, Lectures on discrete geometry. Springer, 2002, vol. 108.Google ScholarCross Ref
X. Yu, X. Ban, W. Zeng, R. Sarkar, X. Gu, and J. Gao, "Spherical representation and polyhedron routing for load balancing in wireless sensor networks," in 2011 Proceedings IEEE INFOCOM. IEEE, 2011, pp. 621--625.Google Scholar
K. Huang, C.-C. Ni, R. Sarkar, J. Gao, and J. S. Mitchell, "Bounded stretch geographic homotopic routing in sensor networks," in IEEE INFOCOM 2014-IEEE Conference on Computer Communications. IEEE, 2014, pp. 979--987.Google Scholar
R. Sarkar, "Low distortion delaunay embedding of trees in hyperbolic plane," in International Symposium on Graph Drawing. Springer, 2011, pp. 355--366.Google Scholar
C. De Sa, A. Gu, C. Ré, and F. Sala, "Representation tradeoffs for hyperbolic embeddings," Proceedings of machine learning research, vol. 80, p. 4460, 2018.Google Scholar
W. Zeng, R. Sarkar, F. Luo, X. Gu, and J. Gao, "Resilient routing for sensor networks using hyperbolic embedding of universal covering space," in 2010 Proceedings IEEE INFOCOM. IEEE, 2010, pp. 1--9.Google Scholar
B. Rozemberczki and R. Sarkar, "Fast sequence-based embedding with diffusion graphs," in International Workshop on Complex Networks. Springer, 2018, pp. 99--107.Google Scholar
J. Qiu, Y. Dong, H. Ma, J. Li, K. Wang, and J. Tang, "Network embedding as matrix factorization: Unifying deepwalk, line, pte, and node2vec," in Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. ACM, 2018, pp. 459--467.Google Scholar
M. Gutmann and A. Hyvarinen, "Noise-contrastive estimation: A new estimation principle for unnormalized statistical models," in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010, pp. 297--304.Google Scholar
J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, and A.-L. Barabási, "Structure and tie strengths in mobile communication networks," Proceedings of the national academy of sciences, vol. 104, no. 18, pp. 7332--7336, 2007.Google ScholarCross Ref
A. Ahmed, N. Shervashidze, S. Narayanamurthy, V. Josifovski, and A. J. Smola, "Distributed large-scale natural graph factorization," in Proceedings of the 22nd international conference on World Wide Web, 2013, pp. 37--48.Google Scholar

GEMSEC: graph embedding with self clustering
1. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction paradigms
2. Information systems
  1. Information systems applications

Recommendations

Deep MinCut: Learning Node Embeddings by Detecting Communities
Highlights
- Our node embeddings are both interpretable and competent for classification tasks.
Abstract
We present Deep MinCut (DMC), an unsupervised approach to learn node embeddings for graph-structured data. It derives node representations based on their membership in communities. As such, the embeddings directly provide insights into ...
Read More
Network Embedding for Community Detection in Attributed Networks

Community detection aims to partition network nodes into a set of clusters, such that nodes are more densely connected to each other within the same cluster than other clusters. For attributed networks, apart from the denseness requirement of topology ...
Read More
Iterative Graph Embedding and Clustering
Advances in Computational Intelligence
Abstract
Graph embedding can be seen as a transformation of any graph into low-dimensional vector space, where each vertex of the graph has a one-to-one correspondence with a vector in that space. The latest study in this field shows a particular interest ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
ASONAM '19: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
August 2019
1228 pages
ISBN:9781450368681
DOI:10.1145/3341161
Editors:
Francesca Spezzano
Boise State University
,
Wei Chen
Microsoft Research, China
,
Xiaokui Xiao
National University of Singapore, Singapore
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 15 January 2020
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
clustering
community detection
feature extraction
network embedding
node embedding
Qualifiers
- research-article
Conference

Acceptance Rates
ASONAM '19 Paper Acceptance Rate41of286submissions,14%Overall Acceptance Rate116of549submissions,21%
More
Upcoming Conference
KDD '24

Sponsor:

sigkdd

sigkdd

The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 25 - 29, 2024

Barcelona , Spain
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 157
  Total Citations
  View Citations
- 1,247
  Total Downloads
- Downloads (Last 12 months)182
- Downloads (Last 6 weeks)21
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

GEMSEC: graph embedding with self clustering

ASONAM '19: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

ABSTRACT

References

Cited By

Recommendations

Deep MinCut: Learning Node Embeddings by Detecting Communities

Network Embedding for Community Detection in Attributed Networks

Iterative Graph Embedding and Clustering

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

GEMSEC: graph embedding with self clustering

ASONAM '19: Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

ABSTRACT

References

Cited By

Recommendations

Deep MinCut: Learning Node Embeddings by Detecting Communities

Network Embedding for Community Detection in Attributed Networks

Iterative Graph Embedding and Clustering

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media