Published in: Data Mining and Knowledge Discovery 2/2024

02-09-2023

Network embedding based on high-degree penalty and adaptive negative sampling

Authors: Gang-Feng Ma, Xu-Hua Yang, Wei Ye, Xin-Li Xu, Lei Ye

Abstract

Network embedding can uncover potentially useful information and the relationships and regularities that exist in data, and it has attracted increasing attention in many real-world applications. The goal of network embedding is to map high-dimensional, sparse networks into low-dimensional, dense vector representations. In this paper, we propose a network embedding method based on high-degree penalty and adaptive negative sampling (NEPS). First, we analyze the problem of imbalanced node training in random walks and propose an indicator based on a high-degree penalty, which controls the random walk and avoids over-sampling high-degree neighbor nodes. Then, we propose a two-stage adaptive negative sampling strategy, which dynamically selects negative samples suited to the current training stage in order to improve training. In comparisons with seven well-known network embedding algorithms on eight real-world data sets, NEPS performs well in node classification, network reconstruction and link prediction. The code is available at: https://github.com/Andrewsama/NEPS-master.
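
The idea of penalizing high-degree neighbors during a random walk can be illustrated with a minimal sketch. The abstract does not give the NEPS indicator, so the weighting below (transition weight proportional to degree raised to the power -alpha) is purely an assumption chosen for illustration, and the names `penalized_walk`, `alpha` and the toy graph `adj` are hypothetical. The sketch also assumes an undirected graph stored as an adjacency list, so every neighbor has degree at least 1.

```python
import random


def penalized_walk(adj, start, length, alpha=1.0, rng=None):
    """Random walk whose transition weights shrink with neighbor degree.

    Each neighbor v of the current node gets weight deg(v) ** (-alpha).
    alpha = 0 gives a uniform walk; larger alpha penalizes hubs more.
    NOTE: this weighting is an illustrative assumption, not the
    indicator defined in the NEPS paper.
    """
    if rng is None:
        rng = random.Random()
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # isolated node: stop the walk early
            break
        weights = [len(adj[v]) ** -alpha for v in neighbors]
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Toy undirected graph: node 1 is a hub (degree 4), node 2 is a leaf.
adj = {
    0: [1, 2],
    1: [0, 3, 4, 5],
    2: [0],
    3: [1],
    4: [1],
    5: [1],
}

if __name__ == "__main__":
    print(penalized_walk(adj, start=0, length=6, alpha=1.0))
```

Starting from node 0 with alpha = 1, the hub (node 1) gets weight 1/4 and the leaf (node 2) gets weight 1, so the hub is chosen with probability 0.2 instead of the 0.5 a uniform walk would give.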

Metadata
Title: Network embedding based on high-degree penalty and adaptive negative sampling
Authors: Gang-Feng Ma, Xu-Hua Yang, Wei Ye, Xin-Li Xu, Lei Ye
Publication date: 02-09-2023
Publisher: Springer US
Published in: Data Mining and Knowledge Discovery, Issue 2/2024
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI: https://doi.org/10.1007/s10618-023-00973-1
