Published in: Neural Processing Letters 6/2021

08.10.2021

ParsBERT: Transformer-based Model for Persian Language Understanding

Authors: Mehrdad Farahani, Mohammad Gharachorloo, Marzieh Farahani, Mohammad Manthouri


Abstract

The surge of pre-trained language models has ushered in a new era in Natural Language Processing (NLP) by enabling powerful language models. Among these, Transformer-based architectures such as BERT have become increasingly popular due to their state-of-the-art performance. However, such models are usually trained on English, leaving other languages to multilingual models with limited resources. This paper proposes a monolingual BERT for the Persian language (ParsBERT) that achieves state-of-the-art performance compared to other architectures and to multilingual models. Moreover, since the amount of data available for Persian NLP tasks is very limited, a massive dataset is compiled both for pre-training the model and for a range of downstream NLP tasks. ParsBERT obtains higher scores on all datasets, both existing and newly gathered ones, outperforming multilingual BERT and prior work on Sentiment Analysis, Text Classification, and Named Entity Recognition.
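For readers who want to experiment with the model, the sketch below loads a ParsBERT checkpoint with the Hugging Face transformers library and extracts contextual embeddings for a Persian sentence. This is a minimal illustration under stated assumptions, not the authors' training code: the model identifier HooshvareLab/bert-base-parsbert-uncased is an assumption about where the released weights are hosted on the Hugging Face Hub, and the example sentence is invented.

    # Minimal sketch: feature extraction with a ParsBERT checkpoint via
    # Hugging Face transformers. MODEL_ID is an assumed Hub identifier;
    # adjust it if the released weights are hosted under a different name.
    import torch
    from transformers import AutoTokenizer, AutoModel

    MODEL_ID = "HooshvareLab/bert-base-parsbert-uncased"  # assumed Hub ID
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID)

    # Tokenize a Persian sentence and run it through the encoder.
    text = "این یک جمله فارسی است."  # "This is a Persian sentence."
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # last_hidden_state has shape (batch, sequence_length, hidden_size):
    # one contextual embedding per subword token.
    print(outputs.last_hidden_state.shape)

For the downstream tasks evaluated in the paper, the same checkpoint would typically be fine-tuned with a task head, e.g. AutoModelForSequenceClassification for sentiment analysis and text classification, or AutoModelForTokenClassification for named entity recognition.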


Metadata
Title
ParsBERT: Transformer-based Model for Persian Language Understanding
Authors
Mehrdad Farahani
Mohammad Gharachorloo
Marzieh Farahani
Mohammad Manthouri
Publication date
08.10.2021
Publisher
Springer US
Published in
Neural Processing Letters / Issue 6/2021
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-021-10528-4
