GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training

Authors:
Jiezhong Qiu (Tsinghua University, Beijing, China)
Qibin Chen (Tsinghua University, Beijing, China)
Yuxiao Dong (Microsoft Research, Redmond, WA, USA)
Jing Zhang (Renmin University, Beijing, China)
Hongxia Yang (DAMO Academy, Alibaba Group, Hangzhou, China)
Ming Ding (Tsinghua University, Beijing, China)
Kuansan Wang (Microsoft Research, Redmond, WA, USA)
Jie Tang (Tsinghua University, Beijing, China)

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, August 2020, Pages 1150–1160, https://doi.org/10.1145/3394486.3403168

Published: 20 August 2020

ABSTRACT

Graph representation learning has emerged as a powerful technique for addressing real-world problems. Various downstream graph learning tasks, such as node classification, similarity search, and graph classification, have benefited from its recent developments. However, prior work on graph representation learning focuses on domain-specific problems and trains a dedicated model for each graph dataset, which usually does not transfer to out-of-domain data. Inspired by recent advances in pre-training in natural language processing and computer vision, we design Graph Contrastive Coding (GCC), a self-supervised graph neural network pre-training framework, to capture universal network topological properties across multiple networks. We design GCC's pre-training task as subgraph instance discrimination in and across networks and leverage contrastive learning to empower graph neural networks to learn intrinsic and transferable structural representations. We conduct extensive experiments on three graph learning tasks and ten graph datasets. The results show that GCC pre-trained on a collection of diverse datasets can achieve performance competitive with or better than that of its task-specific, trained-from-scratch counterparts. This suggests that the pre-training and fine-tuning paradigm has great potential for graph representation learning.
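
To make the instance-discrimination idea concrete, the following is a minimal sketch of a contrastive objective of the InfoNCE type in plain Python. The abstract does not state the exact loss, so the InfoNCE form, the L2 normalization, the temperature value, and the names `info_nce_loss`, `query`, `keys`, and `positive_index` are all assumptions made for illustration. The embeddings are taken as given; how subgraphs are sampled and encoded by a graph neural network is not shown.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce_loss(query, keys, positive_index, temperature=0.07):
    """Contrastive loss for one query embedding against a list of key embeddings.

    Assumption for this sketch: `query` and `keys[positive_index]` are
    embeddings of two subgraphs drawn from the same instance, and every
    other key is the embedding of a subgraph from a different instance.
    The loss is the negative log-softmax score of the positive key.
    """
    q = normalize(query)
    logits = [dot(q, normalize(k)) / temperature for k in keys]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[positive_index]


if __name__ == "__main__":
    query = [1.0, 0.0]
    keys = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.2]]
    # The first key points in nearly the same direction as the query,
    # so treating it as the positive gives the smaller loss.
    print(info_nce_loss(query, keys, positive_index=0))
    print(info_nce_loss(query, keys, positive_index=1))
```

The loss is small when the query is closest to its positive key and grows as any negative key becomes more similar to the query, which is the behavior an instance-discrimination task relies on.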

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
2. Information systems
  1. Information systems applications
    1. Data mining
  2. World Wide Web
    1. Web applications
      1. Social networks

Published in
KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN: 9781450379984
DOI: 10.1145/3394486
General Chairs: Rajesh Gupta (UC San Diego, USA), Yan Liu (USC, USA)
Program Chairs: Mohak Shah (LG Electronics, USA), Suju Rajan (LinkedIn, USA)
Publications Chairs: Jiliang Tang (Michigan State, USA), B. Aditya Prakash (Georgia Tech, USA)
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
graph neural network
graph representation learning
pre-training
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%