metapath2vec: Scalable Representation Learning for Heterogeneous Networks

DOI: 10.1145/3097983.3098036
Published: 04 August 2017

ABSTRACT

We study the problem of representation learning in heterogeneous networks. Its unique challenges come from the existence of multiple types of nodes and links, which limit the feasibility of the conventional network embedding techniques. We develop two scalable representation learning models, namely metapath2vec and metapath2vec++. The metapath2vec model formalizes meta-path-based random walks to construct the heterogeneous neighborhood of a node and then leverages a heterogeneous skip-gram model to perform node embeddings. The metapath2vec++ model further enables the simultaneous modeling of structural and semantic correlations in heterogeneous networks. Extensive experiments show that metapath2vec and metapath2vec++ are able to not only outperform state-of-the-art embedding models in various heterogeneous network mining tasks, such as node classification, clustering, and similarity search, but also discern the structural and semantic correlations between diverse network objects.
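
To make the first stage concrete, here is a minimal sketch of a meta-path-guided random walk on a toy author/paper/venue network. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the transition rule, so the sketch assumes the next node is chosen uniformly among neighbors of the type the meta-path expects next. The helper names (`build_typed_adjacency`, `metapath_walk`), the single-letter type codes, and the toy graph are all hypothetical.

```python
import random
from collections import defaultdict


def build_typed_adjacency(edges, node_type):
    """Map each node to its neighbors, grouped by the neighbor's type."""
    adj = defaultdict(lambda: defaultdict(list))
    for u, v in edges:
        adj[u][node_type[v]].append(v)
        adj[v][node_type[u]].append(u)
    return adj


def metapath_walk(adj, node_type, start, metapath, walk_length, rng):
    """Random walk that only steps to neighbors of the type the meta-path expects.

    `metapath` is a string of single-letter type codes such as "APVPA"
    (an assumption of this sketch). It must start and end with the same
    type so the pattern can be repeated.
    """
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first meta-path type")
    if metapath[0] != metapath[-1]:
        raise ValueError("meta-path must begin and end with the same type")
    cycle = metapath[:-1]  # "APVPA" repeats as A, P, V, P, A, P, V, P, ...
    walk = [start]
    while len(walk) < walk_length:
        next_type = cycle[len(walk) % len(cycle)]
        candidates = adj[walk[-1]][next_type]
        if not candidates:
            break  # dead end: no neighbor of the required type
        # Assumed transition rule: uniform over type-matching neighbors.
        walk.append(rng.choice(candidates))
    return walk


# Toy heterogeneous network: authors (A), papers (P), one venue (V).
node_type = {"a1": "A", "a2": "A", "a3": "A", "p1": "P", "p2": "P", "v1": "V"}
edges = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p2"),
         ("p1", "v1"), ("p2", "v1")]

adj = build_typed_adjacency(edges, node_type)
walk = metapath_walk(adj, node_type, "a1", "APVPA", 9, random.Random(0))
print(walk)
```

Each walk produced this way plays the role of a sentence whose words are nodes. A skip-gram model would then be trained on a corpus of such walks to produce the node embeddings. That second stage is not shown here.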


Published in

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2017
2240 pages
ISBN: 9781450348874
DOI: 10.1145/3097983

Copyright © 2017 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

KDD '17 paper acceptance rate: 64 of 748 submissions, 9%. Overall acceptance rate: 1,133 of 8,635 submissions, 13%.
