Published in: Neural Processing Letters 5/2021

11.08.2020

Combining Embeddings of Input Data for Text Classification

Authors: Zuzanna Parcheta, Germán Sanchis-Trilles, Francisco Casacuberta, Robin Rendahl


Abstract

Automatic text classification is an essential part of text analysis, and it can be improved at several levels, such as the preprocessing step or the network implementation. In this paper, we focus on how combining different methods of text encoding affects classification accuracy. To this end, we implemented a multi-input neural network that can encode input text using several encoding techniques, such as BERT, a neural embedding layer, GloVe, skip-thoughts and ParagraphVector. The text can be represented at different tokenisation levels: sentence level, word level, byte pair encoding level and character level. Experiments were conducted on seven datasets from different language families: English, German, Swedish and Czech. Some of these languages feature agglutination and grammatical cases. Two of the seven datasets originate from real commercial scenarios: (1) classifying ingredients into their corresponding classes, using a corpus provided by Northfork; and (2) classifying texts according to the English proficiency level of their writers, using a corpus provided by ProvenWord. The developed architecture achieves improvements with different combinations of text encoding techniques, depending on the characteristics of each dataset. Once the best combination of embeddings at different levels was determined, different multi-input neural network architectures were compared. The results obtained with the best embedding combination and the best network architecture were compared with state-of-the-art approaches, and they surpass the state-of-the-art baselines on the datasets used in the experiments.
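The core idea of the abstract — encoding the same input text at several levels (e.g. word level and character level) and concatenating the resulting vectors before a shared classifier — can be sketched with a toy example. This is an illustrative sketch only, not the paper's implementation: the hashed random embeddings, the embedding dimension, and the untrained linear softmax head are all placeholder assumptions standing in for trained encoders such as BERT or GloVe.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy embedding size; the paper's encoders use far larger dimensions


def _hashed_vec(token, dim):
    """Deterministic-per-token random vector, standing in for a trained embedding."""
    r = np.random.default_rng(abs(hash(token)) % (2**32))
    return r.standard_normal(dim)


def word_features(text, dim=DIM):
    """Word-level branch: average of per-word embeddings."""
    return np.mean([_hashed_vec(w, dim) for w in text.lower().split()], axis=0)


def char_features(text, dim=DIM, n=3):
    """Character-level branch: average of character-trigram embeddings."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)] or [text]
    return np.mean([_hashed_vec(g, dim) for g in grams], axis=0)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def classify(text, W, b):
    """Multi-input combination: concatenate both branches, then a softmax head."""
    x = np.concatenate([word_features(text), char_features(text)])
    return softmax(W @ x + b)


# Toy 3-class head with random, untrained weights (for shape checking only).
W = rng.standard_normal((3, 2 * DIM))
b = np.zeros(3)
probs = classify("whole wheat flour", W, b)
```

In the paper's setting, each branch would instead be a trained encoder (BERT, GloVe, an embedding layer over BPE units, etc.), and the concatenated representation would feed a jointly trained network rather than a fixed linear head; the sketch only shows the multi-input wiring.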


Metadata
Title
Combining Embeddings of Input Data for Text Classification
Authors
Zuzanna Parcheta
Germán Sanchis-Trilles
Francisco Casacuberta
Robin Rendahl
Publication date
11.08.2020
Publisher
Springer US
Published in
Neural Processing Letters / Issue 5/2021
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-020-10312-w
