Published in: Knowledge and Information Systems 11/2021

08-09-2021 | Regular Paper

Word and graph attention networks for semi-supervised classification

Authors: Jing Zhang, Mengxi Li, Kaisheng Gao, Shunmei Meng, Cangqi Zhou


Abstract

Graph attention networks are effective graph neural networks that perform graph embedding for semi-supervised learning by considering the neighbors of a node when learning its features. This paper presents a novel attention-based graph neural network that introduces an attention mechanism over the word-represented features of a node while also incorporating the neighbors' attention in the embedding process. Instead of using a vector as the feature of a node, as in traditional graph attention networks, the proposed method represents a node with a 2D matrix, where each row of the matrix stands for a different attention distribution over the node's original word-represented features. The compressed features are then fed into a graph attention layer that aggregates the matrix representations of the node and its neighbors with different attention weights into a new representation. By stacking several graph attention layers, the method obtains the final matrix representations of nodes, which account both for the different importance of a node's neighbors and for the different importance of the words in its original features. Experimental results on three citation network datasets show that the proposed method significantly outperforms eight state-of-the-art methods on semi-supervised classification tasks.
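The two-stage mechanism the abstract describes — a structured word-level self-attention that turns a node's word features into a 2D matrix, followed by a graph attention layer that aggregates the matrices of a node and its neighbors — can be sketched in plain NumPy. This is a minimal illustrative sketch, not the authors' implementation: the weight matrices `W1`, `W2`, the scoring vector `a`, and all shapes are assumptions chosen only to show the data flow.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def word_attention(X, W1, W2):
    """Compress a node's word features into a 2D matrix representation.

    X  : (n_words, d) word-represented features of one node
    W1 : (h, d), W2 : (r, h) -- illustrative projection weights
    Returns an (r, d) matrix; each of the r rows is the feature average
    under a different attention distribution over the words.
    """
    A = softmax(W2 @ np.tanh(W1 @ X.T), axis=-1)  # (r, n_words)
    return A @ X  # (r, d)


def graph_attention(H, adj, a):
    """Aggregate each node's matrix with its neighbors' matrices.

    H   : (n_nodes, r, d) matrix representations of all nodes
    adj : (n_nodes, n_nodes) boolean adjacency (self-loops included)
    a   : (r * d,) illustrative scoring vector for attention logits
    """
    n = H.shape[0]
    scores = H.reshape(n, -1) @ a  # one scalar logit per node
    out = np.zeros_like(H)
    for i in range(n):
        nbrs = np.where(adj[i])[0]
        alpha = softmax(scores[nbrs])          # attention over neighbors
        out[i] = np.tensordot(alpha, H[nbrs], axes=1)  # weighted sum
    return out
```

Stacking `graph_attention` calls (with fresh weights and a nonlinearity between them, omitted here for brevity) corresponds to the multi-layer aggregation described above.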


Metadata
Title
Word and graph attention networks for semi-supervised classification
Authors
Jing Zhang
Mengxi Li
Kaisheng Gao
Shunmei Meng
Cangqi Zhou
Publication date
08-09-2021
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 11/2021
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-021-01610-3
