
27.05.2023 | Original Article

Improving cross-lingual language understanding with consistency regularization-based fine-tuning

Authors: Bo Zheng, Wanxiang Che

Published in: International Journal of Machine Learning and Cybernetics | Issue 10/2023

Abstract

Fine-tuning pre-trained cross-lingual language models alleviates the need for annotated data in different languages, as it allows the models to transfer task-specific supervision between languages, especially from high- to low-resource languages. In this work, we propose to improve cross-lingual language understanding with consistency regularization-based fine-tuning. Specifically, we use example consistency regularization to penalize the prediction sensitivity to four types of data augmentation, i.e., subword sampling, Gaussian noise, code-switch substitution, and machine translation. In addition, we employ model consistency to regularize two models trained on two augmented versions of the same training set. Experimental results on the XTREME benchmark show that our method (the code is available at https://github.com/bozheng-hit/xTune) achieves significant improvements across various cross-lingual language understanding tasks, including text classification, question answering, and sequence labeling. Furthermore, we extend our method to the few-shot cross-lingual transfer setting, in particular the more realistic setting where machine translation systems are available, so that machine translation as data augmentation can be combined with our consistency regularization. Experimental results demonstrate that our method also benefits the few-shot scenario.
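To make the example consistency idea concrete, the following is a minimal PyTorch sketch for a sentence-classification task: the task loss on the original input is combined with a symmetric KL penalty between the predictions on the original input and on one augmented view (e.g., a subword-resampled or code-switched copy). The function names, the toy classifier, and the weight `lam` are illustrative assumptions, not the paper's exact formulation or hyper-parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def symmetric_kl(p_logits, q_logits):
    """Symmetric KL divergence between the distributions implied by two logit tensors."""
    p_log, q_log = F.log_softmax(p_logits, dim=-1), F.log_softmax(q_logits, dim=-1)
    return 0.5 * (F.kl_div(q_log, p_log.exp(), reduction="batchmean")
                  + F.kl_div(p_log, q_log.exp(), reduction="batchmean"))

def consistency_step(model, ids_orig, ids_aug, labels, lam=1.0):
    """Task loss on the original example plus a penalty on the divergence between
    predictions for the original input and one augmented view of it."""
    logits_orig = model(ids_orig)   # assumed: model maps padded token ids to class logits
    logits_aug = model(ids_aug)
    return F.cross_entropy(logits_orig, labels) + lam * symmetric_kl(logits_orig, logits_aug)

if __name__ == "__main__":
    vocab, n_classes = 1000, 3
    # Toy stand-in for a pre-trained cross-lingual encoder with a classification head.
    model = nn.Sequential(nn.EmbeddingBag(vocab, 64), nn.Linear(64, n_classes))
    ids_orig = torch.randint(0, vocab, (8, 16))   # original token ids
    ids_aug = torch.randint(0, vocab, (8, 16))    # augmented view (e.g. code-switched copy)
    labels = torch.randint(0, n_classes, (8,))
    loss = consistency_step(model, ids_orig, ids_aug, labels)
    loss.backward()
```

Model consistency would additionally keep two such models, each fine-tuned on a differently augmented copy of the training set, close to each other with the same kind of divergence term.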

Footnotes
1
We define conventional cross-lingual fine-tuning as fine-tuning the pre-trained cross-lingual model with the labeled training set in the source language only (typically English) or with labeled training sets in all languages.
 
2
Implemented by .detach() in PyTorch.
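As a small, hedged illustration of this stop-gradient (the random tensors below merely stand in for model predictions):

```python
import torch
import torch.nn.functional as F

logits_orig = torch.randn(4, 3, requires_grad=True)   # predictions on original inputs
logits_aug = torch.randn(4, 3, requires_grad=True)    # predictions on an augmented view

# .detach() cuts the gradient path through the target branch: only the augmented-view
# predictions are pulled toward the (fixed) original-view distribution.
target_probs = F.softmax(logits_orig, dim=-1).detach()
loss = F.kl_div(F.log_softmax(logits_aug, dim=-1), target_probs, reduction="batchmean")
loss.backward()

print(logits_aug.grad is not None)   # True: gradients reach the augmented branch
print(logits_orig.grad is None)      # True: .detach() blocked gradients to the target
```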
 
5
X-STILTs [39] uses additional SQuAD v1.1 English training data for the TyDiQA-GoldP dataset, while we prefer a cleaner setting here.
 
6
FILTER directly selects the best model on the test set of XQuAD and TyDiQA-GoldP. Under this setting, we can obtain 83.1/69.7 for XQuAD, 75.5/61.1 for TyDiQA-GoldP.
 
7
For span extraction datasets, the answers are enclosed in quotes before translation so that the labels remain aligned, which makes it easy to extract the answers from the translated context [30]. This method can also be applied to NER tasks; however, aligning the label information requires complex post-processing, and there can be alignment errors.
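A rough sketch of this marking trick is shown below; the helper names and the plain double-quote convention are illustrative assumptions rather than the exact pipeline of [30], and the translation call itself is left to an external MT system.

```python
import re

def mark_answer(context: str, start: int, end: int, quote: str = '"') -> str:
    """Enclose the gold answer span in quotes before the context is machine-translated."""
    return context[:start] + quote + context[start:end] + quote + context[end:]

def recover_answer(translated_context: str, quote: str = '"'):
    """After translation, take the first quoted span as the projected answer
    (returns None if the MT system dropped or garbled the quotes)."""
    match = re.search(re.escape(quote) + r"(.+?)" + re.escape(quote), translated_context)
    return match.group(1) if match else None

ctx = "The Eiffel Tower is located in Paris."
marked = mark_answer(ctx, 31, 36)   # wraps "Paris"; this string would then be translated
print(marked)                        # The Eiffel Tower is located in "Paris".
print(recover_answer(marked))        # Paris
```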
 
8
Paragraphs in XQuAD contain more question-answer pairs than those in MLQA.
 
References
1.
Aghajanyan A, Shrivastava A, Gupta A, et al (2020) Better fine-tuning by reducing representational collapse. CoRR. arXiv:2008.03156
2.
Artetxe M, Ruder S, Yogatama D (2020) On the cross-lingual transferability of monolingual representations. In: Jurafsky D, Chai J, Schluter N, et al (eds) Proceedings of the 58th annual meeting of the association for computational linguistics, ACL 2020, Online, July 5–10, 2020. Association for Computational Linguistics, pp 4623–4637. https://www.aclweb.org/anthology/2020.acl-main.421/
3.
Athiwaratkun B, Finzi M, Izmailov P, et al (2019) There are many consistent explanations of unlabeled data: why you should average. In: 7th international conference on learning representations, ICLR 2019, New Orleans, LA, USA, May 6–9. OpenReview.net. https://openreview.net/forum?id=rkgKBhA5Y7
5.
Chi Z, Dong L, Wei F, et al (2020) InfoXLM: an information-theoretic framework for cross-lingual language model pre-training. CoRR. arXiv:2007.07834
6.
Chi Z, Dong L, Zheng B, et al (2021) Improving pretrained cross-lingual language models via self-labeled word alignment. In: Zong C, Xia F, Li W, et al (eds) Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, ACL/IJCNLP 2021, (vol 1: Long Papers), Virtual Event, August 1–6, 2021. Association for Computational Linguistics, pp 3418–3430. https://doi.org/10.18653/v1/2021.acl-long.265
7.
Chi Z, Huang S, Dong L, et al (2022) XLM-E: cross-lingual language model pre-training via ELECTRA. In: Muresan S, Nakov P, Villavicencio A (eds) Proceedings of the 60th annual meeting of the association for computational linguistics (vol 1: Long Papers), ACL 2022, Dublin, Ireland, May 22–27, 2022. Association for Computational Linguistics, pp 6170–6182. https://doi.org/10.18653/v1/2022.acl-long.427
8.
Chung HW, Garrette D, Tan KC, et al (2020) Improving multilingual models with language-clustered vocabularies. In: Webber B, Cohn T, He Y, et al (eds) Proceedings of the 2020 conference on empirical methods in natural language processing, EMNLP 2020, Online, November 16–20, 2020. Association for Computational Linguistics, pp 4536–4546. https://doi.org/10.18653/v1/2020.emnlp-main.367
11.
Conneau A, Rinott R, Lample G, et al (2018) XNLI: evaluating cross-lingual sentence representations. In: Riloff E, Chiang D, Hockenmaier J, et al (eds) Proceedings of the 2018 conference on empirical methods in natural language processing, Brussels, Belgium, October 31–November 4, 2018. Association for Computational Linguistics, pp 2475–2485. https://doi.org/10.18653/v1/d18-1269
12.
Conneau A, Khandelwal K, Goyal N, et al (2020a) Unsupervised cross-lingual representation learning at scale. In: Jurafsky D, Chai J, Schluter N, et al (eds) Proceedings of the 58th annual meeting of the association for computational linguistics, ACL 2020, Online, July 5–10, 2020. Association for Computational Linguistics, pp 8440–8451. http://www.aclweb.org/anthology/2020.acl-main.747/
13.
Conneau A, Wu S, Li H, et al (2020b) Emerging cross-lingual structure in pretrained language models. In: Jurafsky D, Chai J, Schluter N, et al (eds) Proceedings of the 58th annual meeting of the association for computational linguistics, ACL 2020, Online, July 5–10, 2020. Association for Computational Linguistics, pp 6022–6034. https://www.aclweb.org/anthology/2020.acl-main.536/
14.
Devlin J, Chang M, Lee K, et al (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Burstein J, Doran C, Solorio T (eds) Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2–7, 2019, vol 1 (Long and Short Papers). Association for Computational Linguistics, pp 4171–4186. https://doi.org/10.18653/v1/n19-1423
15.
Fang Y, Wang S, Gan Z, et al (2020) FILTER: an enhanced fusion method for cross-lingual language understanding. CoRR. arXiv:2009.05166
16.
Faruqui M, Dyer C (2014) Improving vector space word representations using multilingual correlation. In: Bouma G, Parmentier Y (eds) Proceedings of the 14th conference of the European chapter of the association for computational linguistics, EACL 2014, April 26–30, 2014, Gothenburg, Sweden. The Association for Computer Linguistics, pp 462–471. https://doi.org/10.3115/v1/e14-1049
17.
Fei H, Zhang M, Ji D (2020) Cross-lingual semantic role labeling with high-quality translated training corpus. In: Jurafsky D, Chai J, Schluter N, et al (eds) Proceedings of the 58th annual meeting of the association for computational linguistics, ACL 2020, Online, July 5–10, 2020. Association for Computational Linguistics, pp 7014–7026. http://www.aclweb.org/anthology/2020.acl-main.627/
18.
Gao T, Han X, Xie R, et al (2020) Neural snowball for few-shot relation learning. In: The thirty-fourth AAAI conference on artificial intelligence, AAAI 2020, the thirty-second innovative applications of artificial intelligence conference, IAAI 2020, the tenth AAAI symposium on educational advances in artificial intelligence, EAAI 2020, New York, NY, USA, February 7–12, 2020. AAAI Press, pp 7772–7779. http://ojs.aaai.org/index.php/AAAI/article/view/6281
19.
Guo J, Che W, Yarowsky D, et al (2015) Cross-lingual dependency parsing based on distributed representations. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing of the Asian federation of natural language processing, ACL 2015, July 26–31, 2015, Beijing, China, vol 1: Long Papers. The Association for Computer Linguistics, pp 1234–1244. https://doi.org/10.3115/v1/p15-1119
20.
Hou Y, Che W, Lai Y, et al (2020) Few-shot slot tagging with collapsed dependency transfer and label-enhanced task-adaptive projection network. In: Jurafsky D, Chai J, Schluter N, et al (eds) Proceedings of the 58th annual meeting of the association for computational linguistics, ACL 2020, Online, July 5–10, 2020. Association for Computational Linguistics, pp 1381–1393. https://doi.org/10.18653/v1/2020.acl-main.128
21.
Hou Y, Mao J, Lai Y, et al (2020) Fewjoint: a few-shot learning benchmark for joint language understanding. CoRR. arXiv:2009.08138
22.
Hu J, Ruder S, Siddhant A, et al (2020) XTREME: a massively multilingual multi-task benchmark for evaluating cross-lingual generalisation. In: Proceedings of the 37th international conference on machine learning, ICML 2020, 13–18 July 2020, virtual event, proceedings of machine learning research, vol 119. PMLR, pp 4411–4421. http://proceedings.mlr.press/v119/hu20b.html
23.
Hu J, Johnson M, Firat O, et al (2021) Explicit alignment objectives for multilingual bidirectional encoders. In: Toutanova K, Rumshisky A, Zettlemoyer L, et al (eds) Proceedings of the 2021 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2021, Online, June 6–11, 2021. Association for Computational Linguistics, pp 3633–3643. https://doi.org/10.18653/v1/2021.naacl-main.284
24.
Hu W, Miyato T, Tokui S, et al (2017) Learning discrete representations via information maximizing self-augmented training. In: Precup D, Teh YW (eds) Proceedings of the 34th international conference on machine learning, ICML 2017, Sydney, NSW, Australia, 6–11 August 2017, proceedings of machine learning research, vol 70. PMLR, pp 1558–1567. http://proceedings.mlr.press/v70/hu17b.html
25.
Jiang H, He P, Chen W, et al (2020) SMART: robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization. In: Jurafsky D, Chai J, Schluter N, et al (eds) Proceedings of the 58th annual meeting of the association for computational linguistics, ACL 2020, Online, July 5–10, 2020. Association for Computational Linguistics, pp 2177–2190. https://www.aclweb.org/anthology/2020.acl-main.197/
27.
Kudo T, Richardson J (2018) SentencePiece: a simple and language independent subword tokenizer and detokenizer for neural text processing. In: Blanco E, Lu W (eds) Proceedings of the 2018 conference on empirical methods in natural language processing, EMNLP 2018: system demonstrations, Brussels, Belgium, October 31–November 4, 2018. Association for Computational Linguistics, pp 66–71. https://doi.org/10.18653/v1/d18-2012
28.
Lample G, Conneau A, Denoyer L, et al (2018) Unsupervised machine translation using monolingual corpora only. In: 6th international conference on learning representations, ICLR 2018, Vancouver, BC, Canada, April 30–May 3, 2018, conference track proceedings. OpenReview.net. http://openreview.net/forum?id=rkYTTf-AZ
29.
Lauscher A, Ravishankar V, Vulic I, et al (2020) From zero to hero: on the limitations of zero-shot language transfer with multilingual transformers. In: Webber B, Cohn T, He Y, et al (eds) Proceedings of the 2020 conference on empirical methods in natural language processing, EMNLP 2020, Online, November 16–20, 2020. Association for Computational Linguistics, pp 4483–4499. https://doi.org/10.18653/v1/2020.emnlp-main.363
30.
Lewis PSH, Oguz B, Rinott R, et al (2020) MLQA: evaluating cross-lingual extractive question answering. In: Jurafsky D, Chai J, Schluter N, et al (eds) Proceedings of the 58th annual meeting of the association for computational linguistics, ACL 2020, Online, July 5–10, 2020. Association for Computational Linguistics, pp 7315–7330. http://www.aclweb.org/anthology/2020.acl-main.653/
33.
Luo F, Wang W, Liu J, et al (2020) VECO: variable encoder-decoder pre-training for cross-lingual understanding and generation. arXiv:2010.16046
34.
Lv X, Gu Y, Han X, et al (2019) Adapting meta knowledge graph information for multi-hop reasoning over few-shot relations. In: Inui K, Jiang J, Ng V, et al (eds) Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3–7, 2019. Association for Computational Linguistics, pp 3374–3379. https://doi.org/10.18653/v1/D19-1334
35.
37.
Nivre J, Blokland R, Partanen N, et al (2018) Universal dependencies 2.2
38.
Pan X, Zhang B, May J, et al (2017) Cross-lingual name tagging and linking for 282 languages. In: Barzilay R, Kan M (eds) Proceedings of the 55th annual meeting of the association for computational linguistics, ACL 2017, Vancouver, Canada, July 30–August 4, vol 1: long papers. Association for Computational Linguistics, pp 1946–1958. https://doi.org/10.18653/v1/P17-1178
39.
Phang J, Htut PM, Pruksachatkun Y, et al (2020) English intermediate-task training improves zero-shot cross-lingual transfer too. CoRR. arXiv:2005.13013
40.
Provilkov I, Emelianenko D, Voita E (2020) BPE-dropout: simple and effective subword regularization. In: Jurafsky D, Chai J, Schluter N, et al (eds) Proceedings of the 58th annual meeting of the association for computational linguistics, ACL 2020, Online, July 5–10, 2020. Association for Computational Linguistics, pp 1882–1892. https://www.aclweb.org/anthology/2020.acl-main.170/
41.
Qin L, Ni M, Zhang Y, et al (2020) CoSDA-ML: multi-lingual code-switching data augmentation for zero-shot cross-lingual NLP. In: Bessiere C (eds) Proceedings of the twenty-ninth international joint conference on artificial intelligence, IJCAI 2020. ijcai.org, pp 3853–3860. https://doi.org/10.24963/ijcai.2020/533
42.
Shah DJ, Gupta R, Fayazi AA, et al (2019) Robust zero-shot cross-domain slot filling with example values. In: Korhonen A, Traum DR, Màrquez L (eds) Proceedings of the 57th conference of the association for computational linguistics, ACL 2019, Florence, Italy, July 28–August 2, 2019, vol 1: long papers. Association for Computational Linguistics, pp 5484–5490. https://doi.org/10.18653/v1/p19-1547
43.
Singh J, McCann B, Keskar NS, et al (2019) XLDA: cross-lingual data augmentation for natural language inference and question answering. CoRR. arXiv:1905.11471
44.
Sun S, Sun Q, Zhou K, et al (2019) Hierarchical attention prototypical networks for few-shot text classification. In: Inui K, Jiang J, Ng V, et al (eds) Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3–7, 2019. Association for Computational Linguistics, pp 476–485. https://doi.org/10.18653/v1/D19-1045
45.
Tarvainen A, Valpola H (2017) Mean teachers are better role models: weight-averaged consistency targets improve semi-supervised deep learning results. In: 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24–26, 2017, workshop track proceedings. OpenReview.net. http://openreview.net/forum?id=ry8u21rtl
46.
Wang Y, Che W, Guo J, et al (2019) Cross-lingual BERT transformation for zero-shot dependency parsing. In: Inui K, Jiang J, Ng V, et al (eds) Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3–7, 2019. Association for Computational Linguistics, pp 5720–5726. https://doi.org/10.18653/v1/D19-1575
48.
Xu H, Murray K (2022) Por qué não utiliser alla språk? Mixed training with gradient optimization in few-shot cross-lingual transfer. CoRR. arXiv:2204.13869
49.
Xu R, Yang Y, Otani N, et al (2018) Unsupervised cross-lingual transfer of word embedding spaces. In: Riloff E, Chiang D, Hockenmaier J, et al (eds) Proceedings of the 2018 conference on empirical methods in natural language processing, Brussels, Belgium, October 31–November 4, 2018. Association for Computational Linguistics, pp 2465–2474. https://doi.org/10.18653/v1/d18-1268
51.
Yan H, Gui L, Li W, et al (2022b) Addressing token uniformity in transformers via singular value transformation. In: Cussens J, Zhang K (eds) Uncertainty in artificial intelligence, proceedings of the thirty-eighth conference on uncertainty in artificial intelligence, UAI 2022, 1–5 August 2022, Eindhoven, The Netherlands, proceedings of machine learning research, vol 180. PMLR, pp 2181–2191. http://proceedings.mlr.press/v180/yan22b.html
53.
Yang H, Chen H, Zhou H, et al (2022) Enhancing cross-lingual transfer by manifold mixup. In: The 10th international conference on learning representations, ICLR 2022, virtual event, April 25–29, 2022
54.
Yang Y, Zhang Y, Tar C, et al (2019) PAWS-X: a cross-lingual adversarial dataset for paraphrase identification. In: Inui K, Jiang J, Ng V, et al (eds) Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3–7, 2019. Association for Computational Linguistics, pp 3685–3690. https://doi.org/10.18653/v1/D19-1382
56.
Yu M, Guo X, Yi J, et al (2018) Diverse few-shot text classification with multiple metrics. In: Walker MA, Ji H, Stent A (eds) Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1–6, 2018, vol 1 (long papers). Association for Computational Linguistics, pp 1206–1215. https://doi.org/10.18653/v1/n18-1109
57.
Zhang M, Zhang Y, Fu G (2019) Cross-lingual dependency parsing using code-mixed treebank. In: Inui K, Jiang J, Ng V, et al (eds) Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3–7, 2019. Association for Computational Linguistics, pp 997–1006. https://doi.org/10.18653/v1/D19-1092
58.
Zhao M, Zhu Y, Shareghi E, et al (2021) A closer look at few-shot crosslingual transfer: the choice of shots matters. In: Zong C, Xia F, Li W, et al (eds) Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, ACL/IJCNLP 2021, (vol 1: long papers), virtual event, August 1–6, 2021. Association for Computational Linguistics, pp 5751–5767. https://doi.org/10.18653/v1/2021.acl-long.447
59.
Zhao W, Eger S, Bjerva J, et al (2021) Inducing language-agnostic multilingual representations. In: Nastase V, Vulic I (eds) Proceedings of *SEM 2021: the tenth joint conference on lexical and computational semantics, *SEM 2021, Online, August 5–6, 2021. Association for Computational Linguistics, pp 229–240. https://doi.org/10.18653/v1/2021.starsem-1.22
60.
Zheng B, Dong L, Huang S, et al (2021) Allocating large vocabulary capacity for cross-lingual language model pre-training. In: Moens M, Huang X, Specia L, et al (eds) Proceedings of the 2021 conference on empirical methods in natural language processing, EMNLP 2021, virtual event/Punta Cana, Dominican Republic, 7–11 November, 2021. Association for Computational Linguistics, pp 3203–3215. https://doi.org/10.18653/v1/2021.emnlp-main.257
61.
Zheng S, Song Y, Leung T, et al (2016) Improving the robustness of deep neural networks via stability training. In: 2016 IEEE conference on computer vision and pattern recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016. IEEE Computer Society, pp 4480–4488. https://doi.org/10.1109/CVPR.2016.485
62.
Zhu C, Cheng Y, Gan Z, et al (2020) FreeLB: enhanced adversarial training for natural language understanding. In: 8th international conference on learning representations, ICLR 2020, Addis Ababa, Ethiopia, April 26–30, 2020. OpenReview.net. https://openreview.net/forum?id=BygzbyHFvB
Metadata
Title
Improving cross-lingual language understanding with consistency regularization-based fine-tuning
Authors
Bo Zheng
Wanxiang Che
Publication date
27.05.2023
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 10/2023
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-023-01854-1
