Published in: Neural Processing Letters 6/2021

08.10.2021

ParsBERT: Transformer-based Model for Persian Language Understanding

Authors: Mehrdad Farahani, Mohammad Gharachorloo, Marzieh Farahani, Mohammad Manthouri


Abstract

The surge of pre-trained language models has ushered in a new era in Natural Language Processing (NLP) by enabling powerful language models. Among these, Transformer-based architectures such as BERT have become increasingly popular due to their state-of-the-art performance. However, such models are usually trained on English, leaving other languages to multilingual models with limited resources. This paper proposes a monolingual BERT for the Persian language (ParsBERT) that achieves state-of-the-art performance compared to other architectures and to multilingual models. Moreover, since the amount of data available for Persian NLP tasks is very limited, a massive dataset is compiled both for pre-training the model and for a range of downstream NLP tasks. ParsBERT obtains higher scores on all datasets, both existing and newly gathered ones, outperforming multilingual BERT and prior work on Sentiment Analysis, Text Classification, and Named Entity Recognition.
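For readers who want to experiment with the model, the sketch below loads a ParsBERT checkpoint with the Hugging Face transformers library and extracts contextual embeddings for a Persian sentence. This is a minimal illustration under stated assumptions, not the authors' training code: the model identifier HooshvareLab/bert-base-parsbert-uncased is an assumption about where the released weights are hosted on the Hugging Face Hub, and the example sentence is invented.

    # Minimal sketch: feature extraction with a ParsBERT checkpoint via
    # Hugging Face transformers. MODEL_ID is an assumed Hub identifier;
    # adjust it if the released weights are hosted under a different name.
    import torch
    from transformers import AutoTokenizer, AutoModel

    MODEL_ID = "HooshvareLab/bert-base-parsbert-uncased"  # assumed Hub ID
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID)

    # Tokenize a Persian sentence and run it through the encoder.
    text = "این یک جمله فارسی است."  # "This is a Persian sentence."
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # last_hidden_state has shape (batch, sequence_length, hidden_size):
    # one contextual embedding per subword token.
    print(outputs.last_hidden_state.shape)

For the downstream tasks evaluated in the paper, the same checkpoint would typically be fine-tuned with a task head, e.g. AutoModelForSequenceClassification for sentiment analysis and text classification, or AutoModelForTokenClassification for named entity recognition.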


Metadata
Title
ParsBERT: Transformer-based Model for Persian Language Understanding
Authors
Mehrdad Farahani
Mohammad Gharachorloo
Marzieh Farahani
Mohammad Manthouri
Publication date
08.10.2021
Publisher
Springer US
Published in
Neural Processing Letters / Issue 6/2021
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-021-10528-4
