Published in: Neural Processing Letters 5/2021

11-08-2020

Combining Embeddings of Input Data for Text Classification

Authors: Zuzanna Parcheta, Germán Sanchis-Trilles, Francisco Casacuberta, Robin Rendahl


Abstract

Automatic text classification is an essential part of text analysis. Classification can be improved at different stages, such as preprocessing or network implementation. In this paper, we focus on how combining different text encoding methods affects classification accuracy. To this end, we implemented a multi-input neural network that encodes the input text using several text encoding techniques, such as BERT, a neural embedding layer, GloVe, skip-thoughts and ParagraphVector. The text can be represented at different tokenisation levels, such as the sentence level, word level, byte pair encoding level and character level. Experiments were conducted on seven datasets covering languages from different language families: English, German, Swedish and Czech. Some of these languages feature agglutination and grammatical cases. Two of the seven datasets originate from real commercial scenarios: (1) classifying ingredients into their corresponding classes, using a corpus provided by Northfork; and (2) classifying texts according to the English level of their writers, using a corpus provided by ProvenWord. The developed architecture achieves improvements with different combinations of text encoding techniques, depending on the characteristics of each dataset. Once the best combination of embeddings at different levels had been determined, different multi-input neural network architectures were compared. The results obtained with the best embedding combination and the best network architecture were then compared with state-of-the-art approaches, and they were better than the state-of-the-art baselines on the datasets used in the experiments.
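The general idea of a multi-input classifier can be illustrated with a minimal sketch. This is not the authors' implementation: the `ToyEmbedding` class is a hypothetical stand-in for real encoders such as BERT or GloVe, the weights are random and untrained, and only two of the tokenisation levels named in the abstract (word and character) are shown. Each branch encodes the same text at its own level, the resulting vectors are concatenated, and a single softmax layer maps the combined vector to class probabilities.

```python
import math
import random


def tokenize(text, level):
    """Split text at one of two illustrative tokenisation levels."""
    if level == "word":
        return text.lower().split()
    if level == "char":
        return list(text.lower())
    raise ValueError(f"unknown level: {level}")


class ToyEmbedding:
    """Hypothetical stand-in for a trained encoder.

    Each unseen token gets a random vector; a text is encoded as the
    mean of its token vectors.
    """

    def __init__(self, dim, seed):
        self.dim = dim
        self.rng = random.Random(seed)
        self.table = {}

    def encode(self, tokens):
        if not tokens:
            return [0.0] * self.dim
        total = [0.0] * self.dim
        for tok in tokens:
            if tok not in self.table:
                self.table[tok] = [self.rng.uniform(-1, 1) for _ in range(self.dim)]
            for i, value in enumerate(self.table[tok]):
                total[i] += value
        return [v / len(tokens) for v in total]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


class MultiInputClassifier:
    """Concatenates the outputs of several encoder branches, then applies
    one linear layer followed by softmax (weights are untrained here)."""

    def __init__(self, branches, num_classes, seed=0):
        self.branches = branches  # list of (level, encoder) pairs
        rng = random.Random(seed)
        in_dim = sum(encoder.dim for _, encoder in branches)
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(num_classes)
        ]
        self.bias = [0.0] * num_classes

    def features(self, text):
        combined = []
        for level, encoder in self.branches:
            combined.extend(encoder.encode(tokenize(text, level)))
        return combined

    def predict_proba(self, text):
        x = self.features(text)
        scores = [
            sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(self.weights, self.bias)
        ]
        return softmax(scores)


# Example: one word-level branch (8 dims) and one character-level branch (4 dims).
model = MultiInputClassifier(
    [("word", ToyEmbedding(8, seed=1)), ("char", ToyEmbedding(4, seed=2))],
    num_classes=3,
)
probs = model.predict_proba("whole wheat flour")
```

In a real system each branch would be a trained encoder and the classifier weights would be learned. The sketch only shows how representations from different levels can be joined into one input vector.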


Metadata
Title
Combining Embeddings of Input Data for Text Classification
Authors
Zuzanna Parcheta
Germán Sanchis-Trilles
Francisco Casacuberta
Robin Rendahl
Publication date
11-08-2020
Publisher
Springer US
Published in
Neural Processing Letters / Issue 5/2021
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-020-10312-w
