2018 | OriginalPaper | Book chapter

Statistical vs. Rule-Based Machine Translation: A Comparative Study on Indian Languages

Written by: S. Sreelekha, Pushpak Bhattacharyya, D. Malathi

Published in: International Conference on Intelligent Computing and Applications

Publisher: Springer Singapore

Abstract

In this paper, we present a case study comparing statistical machine translation (SMT) and rule-based machine translation (RBMT) systems from English–Indian language and Indian to Indian language perspectives. The main objective of our study is a five-way performance comparison: (a) SMT and RBMT; (b) SMT on English–Indian language; (c) RBMT on English–Indian language; (d) SMT on Indian to Indian language; (e) RBMT on Indian to Indian language. Through a detailed analysis, we describe the development of the rule-based and statistical machine translation systems and their evaluation. Further, with a detailed error analysis, we point out the relative strengths and weaknesses of both systems. The observations from our study are: (a) SMT systems outperform RBMT; (b) in the case of SMT, English to Indian language MT systems perform better than Indian to English language MT systems; (c) in the case of RBMT, English to Indian language MT systems perform better than Indian to English language MT systems; (d) SMT systems perform better than RBMT for Indian to Indian language MT. Overall, we show that even with a small training corpus, an SMT system has many advantages over its rule-based counterpart for high-quality domain-specific machine translation.

Metadata
Title
Statistical vs. Rule-Based Machine Translation: A Comparative Study on Indian Languages
Written by
S. Sreelekha
Pushpak Bhattacharyya
D. Malathi
Copyright year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-5520-1_59