
Neural Processing Letters, 02-02-2022

Domain Adaptation for POS Tagging with Contrastive Monotonic Chunk-wise Attention

Authors: Rajesh Kumar Mundotiya, Arpit Mehta, Rupjyoti Baruah

Published in: Neural Processing Letters | Issue 4/2022


Abstract

Part-of-speech (POS) tagging is a sequence labelling task and one of the core applications of natural language processing. It remains a challenging problem for low-resource languages. Sequence labelling algorithms aim to model relationships among the words of a sentence, and the scarcity of annotated data is a further challenge for such languages. Contrastive training has been explored as a robust approach that captures essential features during model training. Building on it, we propose the Contrastive Monotonic Chunk-wise attention with CNN-GRU-Softmax (CMCCGS) model architecture for POS tagging, which learns optimal features in a low-resource regime. It comprises three components: contrastive training, monotonic chunk-wise attention and CNN-GRU-Softmax, where the monotonic chunk-wise attention exploits discrete and chunk-level dependencies. We experimented on seven datasets: four domains of the Hindi treebank (Article, Conversation, Disease and Tourism), the Tweet domain from TweeBank, the Newswire domain from the Penn TreeBank (PTB) and the Tweet domain from ARK, and compared CMCCGS with several state-of-the-art models. CMCCGS obtained \(96.63\%\), \(94.34\%\), \(91.24\%\), \(93.76\%\), \(92.30\%\), \(97.51\%\) and \(93.55\%\) accuracy on the respective domains. We further extended CMCCGS to domain adaptation, using single- and multi-source domain adaptation to allow fine-tuning, and analysed the effects on different layers. The extremely low-resource domains, namely Tourism, Disease and the Tweet domains of TweeBank and ARK, improved in accuracy by \(+3.00\% (96.76\%)\) with Article as the source domain, \(+4.14\% (95.38\%)\) with Article and Tourism (multi-source), \(+2.93\% (95.23\%)\) with PTB and \(+1.43\% (94.98\%)\) with PTB and TweeBank (multi-source), respectively. However, the Conversation domain has a negative impact on domain adaptation.
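The adaptation gains quoted in the abstract can be cross-checked against the unadapted accuracies. The sketch below is only an arithmetic check under two assumptions that the abstract does not state explicitly: that the seven accuracies map to the domains in the order they are listed, and that each gain is a difference in percentage points relative to the unadapted CMCCGS accuracy. The names `unadapted`, `adapted` and `implied_baseline` are illustrative and do not come from the paper or its code.

```python
from decimal import Decimal

# Unadapted CMCCGS accuracy (%) per target domain, read from the abstract.
# Assumption: the accuracies are listed in the same order as the domains.
unadapted = {
    "Tourism": Decimal("93.76"),
    "Disease": Decimal("91.24"),
    "TweeBank": Decimal("92.30"),
    "ARK": Decimal("93.55"),
}

# (reported gain, reported accuracy after adaptation) per target domain.
adapted = {
    "Tourism": (Decimal("3.00"), Decimal("96.76")),   # source: Article
    "Disease": (Decimal("4.14"), Decimal("95.38")),   # sources: Article + Tourism
    "TweeBank": (Decimal("2.93"), Decimal("95.23")),  # source: PTB
    "ARK": (Decimal("1.43"), Decimal("94.98")),       # sources: PTB + TweeBank
}


def implied_baseline(gain: Decimal, adapted_accuracy: Decimal) -> Decimal:
    """Accuracy before adaptation, assuming the gain is in percentage points."""
    return adapted_accuracy - gain


# True for a domain when the gain and the two accuracies agree with each other.
consistent = {
    domain: implied_baseline(gain, acc) == unadapted[domain]
    for domain, (gain, acc) in adapted.items()
}

for domain, ok in consistent.items():
    print(f"{domain}: {'consistent' if ok else 'mismatch'}")
```

Under these assumptions all four target domains are consistent, which supports reading the gains as percentage-point differences over the unadapted CMCCGS model rather than relative improvements.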


http://ltrc.iiit.ac.in/hutb_release/.

https://ltrc.iiit.ac.in/showfile.php?filename=downloads/kolhi/.

https://github.com/Rajesh-Mundotiya/Journal-paper-adv-tl-monotonic-chunkwise-attn.



Title: Domain Adaptation for POS Tagging with Contrastive Monotonic Chunk-wise Attention
Authors: Rajesh Kumar Mundotiya, Arpit Mehta, Rupjyoti Baruah
Publication date: 02-02-2022
Publisher: Springer US
Published in: Neural Processing Letters / Issue 4/2022
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-022-10746-4
