Published in: Neural Computing and Applications 14/2020

28-10-2019 | Original Article

Generating word and document matrix representations for document classification

Authors: Shun Guo, Nianmin Yao

Abstract

We present an effective word and document matrix representation architecture based on a linear operation, referred to as doc2matrix, for learning representations for document-level classification. Unlike the traditional vector representation, it represents each word or document as a matrix. Doc2matrix partitions the text into suitably sized subwindows and generates a word matrix and a document matrix by stacking the information from these subwindows. The resulting document matrix not only contains more fine-grained semantic and syntactic information than the original representation but also introduces abundant two-dimensional features. Experiments on four document-level classification tasks demonstrate that the proposed architecture generates higher-quality word and document representations and outperforms previous models based on linear operations. Among the classifiers we compare, a convolution-based classifier is the best match for our document matrix. Furthermore, we show from both theoretical and experimental perspectives that the convolution operation better captures the two-dimensional features of the proposed document matrix.
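To make the construction concrete, below is a minimal sketch of the subwindow-stacking idea in Python. It is an illustrative reconstruction, not the authors' implementation: the subwindow size, the use of mean pooling as the linear operation, and all names are assumptions made for demonstration.

    # Illustrative sketch of the doc2matrix idea -- not the authors' code.
    # Assumptions: pretrained word embeddings, fixed-size subwindows, and
    # mean pooling as the linear operation that summarizes each subwindow.
    import numpy as np

    def document_matrix(word_vectors: np.ndarray, window: int = 5) -> np.ndarray:
        """Stack subwindow summaries of a document's word vectors into a matrix.

        word_vectors: (num_words, dim) array of word embeddings.
        window:       number of consecutive words per subwindow (assumed).
        Returns a (num_subwindows, dim) matrix whose rows each summarize one
        local context, giving the representation two-dimensional structure.
        """
        rows = [word_vectors[i:i + window].mean(axis=0)  # linear summary of one subwindow
                for i in range(0, len(word_vectors), window)]
        return np.stack(rows)

    # Toy usage: a 12-word document with 8-dimensional embeddings
    # yields a 3 x 8 document matrix.
    doc = document_matrix(np.random.rand(12, 8), window=4)
    print(doc.shape)  # (3, 8)

A convolution-based classifier would then treat such a matrix as a one-channel image, for example a PyTorch nn.Conv2d(1, num_filters, kernel_size) layer followed by pooling and a linear output layer, consistent with the abstract's claim that convolution best exploits the matrix's two-dimensional features.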


Metadata
Title
Generating word and document matrix representations for document classification
Authors
Shun Guo
Nianmin Yao
Publication date
28-10-2019
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 14/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-019-04541-x
