Published in: Neural Processing Letters 8/2023

08-09-2023

Word-Context Attention for Text Representation

Authors: Chengkai Piao, Yuchen Wang, Yapeng Zhu, Jin-Mao Wei, Jian Liu


Abstract

We tackle a limitation of existing Word-Word Attention: its spatial-shared property yields insufficient context patterns. To this end, we propose the Word-Context Attention method, which uses item-wise filters to perform both temporal and spatial combinations. Specifically, the proposed method first compresses the global-scale left and right context words into fixed-length vectors, respectively. Then, a group of specific filters is learned to select features from the word and its context vectors. Finally, a non-linear transformation is adopted to merge and activate the selected features. Since each word has its own exclusive context filters and non-linear semantic transformations, the proposed method is spatial-specific and can therefore generate flexible context patterns. Experimental comparisons demonstrate the feasibility of our model and its attractive computational performance.
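The mechanism the abstract describes can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: mean-pooling stands in for the compression of left/right contexts into fixed-length vectors, the per-position filter matrices stand in for the learned word-specific ("spatial-specific") filters, and all names and shapes are illustrative assumptions.

```python
import numpy as np

def word_context_attention(X, Wl, Wr, Ww):
    """Sketch of the Word-Context Attention idea for one sentence.

    X:          (T, d) word vectors.
    Wl, Wr, Ww: (T, d) per-position filters for the left context, the
                right context, and the word itself. In the paper these
                are learned per word; here they are plain arrays.
    """
    T, d = X.shape
    out = np.zeros((T, d))
    for t in range(T):
        # Compress the global-scale left/right contexts into
        # fixed-length vectors (mean-pooling as a stand-in).
        left = X[:t].mean(axis=0) if t > 0 else np.zeros(d)
        right = X[t + 1:].mean(axis=0) if t < T - 1 else np.zeros(d)
        # Item-wise (elementwise) filters select features from the
        # word and its two context summaries.
        selected = Wl[t] * left + Wr[t] * right + Ww[t] * X[t]
        # A non-linear transformation merges and activates the
        # selected features.
        out[t] = np.tanh(selected)
    return out
```

Because the filters `Wl[t]`, `Wr[t]`, `Ww[t]` differ per position, each word gets its own feature selection, which is what distinguishes this from a spatial-shared attention pattern applied uniformly across positions.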


Footnotes
1
This dataset contains about 87.5K URLs, of which one-third are flagged as spam and the rest are not. The dataset is available at https://www.kaggle.com/shivamb/spam-url-prediction.
2
This dataset contains cleaned tweets from India on topics such as coronavirus, COVID-19, and lockdown. The tweets were collected between 23 March 2020 and 15 July 2020 and labeled into four sentiment categories: fear, sad, anger, and joy. The dataset is available at https://www.kaggle.com/surajkum1198/twitterdata.
Metadata
Title
Word-Context Attention for Text Representation
Authors
Chengkai Piao
Yuchen Wang
Yapeng Zhu
Jin-Mao Wei
Jian Liu
Publication date
08-09-2023
Publisher
Springer US
Published in
Neural Processing Letters / Issue 8/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-023-11396-w
