2020 | OriginalPaper | Book chapter

Pseudo Transfer Learning by Exploiting Monolingual Corpus: An Experiment on Roman Urdu Transliteration

Written by: Muhammad Yaseen Khan, Tafseer Ahmed

Published in: Intelligent Technologies and Applications

Publisher: Springer Singapore

Abstract

This paper makes two contributions: an efficient experiment in “pseudo” transfer learning that uses a large monolingual (mono-script) dataset in a sequence-to-sequence LSTM model, and an application of the proposed methodology to improve Roman Urdu transliteration. The approach echoes the monolingual dataset: in the pre-training phase, the input and output sequences are identical, so that the model learns the target language. This gives the LSTM model weights initialized on the target language before the network is trained on the original parallel data. The method reduces the need for additional training data or computational resources, which are often unavailable to many research groups. The experiment addresses character-based transliteration from Romanized Urdu script to standard (i.e., modified Perso-Arabic) Urdu script. First, a sequence-to-sequence encoder-decoder model is trained (echoed) for 100 epochs on 306.9K distinct words in standard Urdu script. Then the trained model (with the weights tuned by echoing) is used to learn transliteration; at this stage, the parallel corpus comprises 127K pairs of Roman Urdu and standard Urdu tokens. The proposed methodology reaches a BLEU accuracy of 80.1% after 100 epochs of training on the parallel data (preceded by 100 epochs of echoing the mono-script data), whereas the baseline model trained solely on the parallel corpus yields ≈76% BLEU accuracy in 200 epochs.
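The two-phase data setup described in the abstract can be illustrated with a minimal sketch. It covers only data preparation, not the LSTM itself, and rests on assumptions that are not taken from the paper: the helper names (`make_echo_pairs`, `build_vocab`, `encode`), the tab and newline start/end markers, the single character vocabulary shared by both phases, and the toy word list are all hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed start/end-of-sequence markers; the paper's choice is not stated here.
START, END = "\t", "\n"


def make_echo_pairs(words: List[str]) -> List[Tuple[str, str]]:
    """Phase 1 data: pair each distinct target-script word with itself."""
    distinct = dict.fromkeys(words)  # de-duplicate, keeping first-seen order
    return [(w, w) for w in distinct]


def build_vocab(pairs: List[Tuple[str, str]]) -> Dict[str, int]:
    """Build one character vocabulary covering both sides of all pairs."""
    chars = {START, END}
    for src, tgt in pairs:
        chars.update(src)
        chars.update(tgt)
    # Index 0 is left free for padding.
    return {ch: i + 1 for i, ch in enumerate(sorted(chars))}


def encode(pairs: List[Tuple[str, str]], vocab: Dict[str, int]):
    """Map string pairs to (encoder input, decoder input, decoder target) id lists."""
    examples = []
    for src, tgt in pairs:
        enc_in = [vocab[c] for c in src]
        dec_in = [vocab[c] for c in START + tgt]   # decoder sees START + word
        dec_out = [vocab[c] for c in tgt + END]    # and must predict word + END
        examples.append((enc_in, dec_in, dec_out))
    return examples


# Toy data, purely illustrative (not from the paper's corpora).
urdu_words = ["کتاب", "پانی", "کتاب"]
parallel_pairs = [("kitab", "کتاب"), ("pani", "پانی")]

echo_pairs = make_echo_pairs(urdu_words)
# Assumption: one vocabulary for both phases, so the same weights can be reused.
vocab = build_vocab(echo_pairs + parallel_pairs)

phase1 = encode(echo_pairs, vocab)      # pre-training: input == output
phase2 = encode(parallel_pairs, vocab)  # fine-tuning: Roman Urdu -> Urdu script
```

Under these assumptions, a model would first be fitted on `phase1` and then, without re-initializing its weights, on `phase2`. Sharing one vocabulary keeps the input and output dimensions identical across both phases, which is what makes reusing the weights possible in this sketch.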

Title: Pseudo Transfer Learning by Exploiting Monolingual Corpus: An Experiment on Roman Urdu Transliteration
Written by: Muhammad Yaseen Khan, Tafseer Ahmed
Publisher: Springer Singapore
Book: Intelligent Technologies and Applications
Print ISBN: 978-981-15-5231-1
Electronic ISBN: 978-981-15-5232-8
Copyright year: 2020
DOI: https://doi.org/10.1007/978-981-15-5232-8_36
