research-article

AliGraph: a comprehensive graph neural network platform

Published: 01 August 2019

Abstract

An increasing number of machine learning tasks require dealing with large graph datasets, which capture rich and complex relationships among potentially billions of elements. Graph Neural Networks (GNNs) have become an effective way to address the graph learning problem: they convert the graph data into a low-dimensional space while preserving both structural and property information to the maximum extent, and construct a neural network for training and inference. However, it is challenging to provide efficient graph storage and computation capabilities that facilitate GNN training and enable the development of new GNN algorithms. In this paper, we present a comprehensive graph neural network system, namely AliGraph, which consists of distributed graph storage, optimized sampling operators, and a runtime that efficiently supports not only existing popular GNNs but also a series of in-house developed ones for different scenarios. The system is currently deployed at Alibaba to support a variety of business scenarios, including product recommendation and personalized search on Alibaba's e-commerce platform. In extensive experiments on a real-world dataset with 492.90 million vertices, 6.82 billion edges, and rich attributes, AliGraph builds the graph an order of magnitude faster (5 minutes versus the hours reported for the state-of-the-art PowerGraph platform). During training, AliGraph runs 40%-50% faster with the novel caching strategy and achieves a speedup of around 12 times with the improved runtime. In addition, our in-house developed GNN models all show statistically significant improvements in both effectiveness and efficiency (e.g., a 4.12%-17.19% lift in F1 score).

Published in

Proceedings of the VLDB Endowment, Volume 12, Issue 12
August 2019
547 pages

Publisher

VLDB Endowment
