2020 | Original Paper | Book Chapter

Tamil Paraphrase Detection Using Encoder-Decoder Neural Networks

Authors: B. Senthil Kumar, D. Thenmozhi, S. Kayalvizhi

Published in: Computational Intelligence in Data Science

Publisher: Springer International Publishing

Abstract

Detecting paraphrases in Indian languages requires critical analysis of lexical, syntactic and semantic features. Since the structure of Indian languages differs from that of languages such as English, the usage of lexico-syntactic features varies between the Indian languages and plays a critical role in determining the performance of the system. Instead of using various lexico-syntactic similarity features, we aim to apply a complete end-to-end system using deep learning networks with no lexico-syntactic features. In this paper we exploit the encoder-decoder model of deep neural networks to analyze and classify paraphrase sentences in the Tamil language. In this encoder-decoder model, LSTM units, GRU units and gNMT are used as layers along with an attention mechanism. With this end-to-end model, the f1-measure for subtask-1 increases by 0.5% compared to the state-of-the-art systems. The system was trained and evaluated on the DPIL@FIRE2016 Shared Task dataset. To our knowledge, ours is the first deep learning model that validates the training instances of both the subtask-1 and subtask-2 datasets of the DPIL shared task.
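The abstract names an attention mechanism on top of the encoder-decoder layers but does not say which variant is used. As a rough illustration only, the following sketch assumes simple dot-product scoring; the function names `softmax` and `dot_attention` and the toy vectors are hypothetical and do not come from the paper. It shows how a decoder state can weight the encoder outputs to form a context vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(decoder_state, encoder_states):
    """Dot-product attention (an assumed variant, not confirmed by the paper).

    Scores each encoder state by its dot product with the decoder state,
    normalizes the scores with softmax, and returns the weights together
    with the weighted sum of encoder states (the context vector).
    """
    scores = [
        sum(d * e for d, e in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy encoder outputs for a three-token sentence (made-up numbers).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]

weights, context = dot_attention(decoder_state, encoder_states)
print("attention weights:", weights)
print("context vector:", context)
```

In a classifier like the one the abstract describes, a context vector of this kind would typically feed the layer that predicts the paraphrase label. The real system uses learned LSTM or GRU states rather than hand-written vectors.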

Metadata
Title
Tamil Paraphrase Detection Using Encoder-Decoder Neural Networks
Authors
B. Senthil Kumar
D. Thenmozhi
S. Kayalvizhi
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-63467-4_3