2021 | OriginalPaper | Chapter

A Deep Network Model for Paraphrase Detection in Punjabi

Authors : Arwinder Singh, Gurpreet Singh Josan

Published in: Recent Innovations in Computing

Publisher: Springer Singapore

Abstract

A paraphrase is text that conveys the same meaning in different words. Paraphrasing matters in NLP because it underpins many applications, such as information retrieval, information extraction, machine translation, query expansion, question answering, summarization and plagiarism detection. Paraphrase detection is the task of deciding whether two given texts are semantically similar. Although the literature on paraphrase detection is extensive, there is no established algorithm for paraphrase detection in the Punjabi language. This paper develops a new paraphrase detection model for Punjabi. We use two deep learning methods to map sentences to vectors, and these vectors are then used to detect paraphrases. Unlike other implementations of paraphrase detection, our model is simple and efficient. Qualitative and quantitative evaluations demonstrate the efficiency of the model, which can be applied to various NLP applications. The proposed model is trained on Quora’s question pair dataset, which opens new directions for paraphrasing in Indian languages.
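The abstract does not say how the sentence vectors are compared, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes that each sentence has already been mapped to a fixed-length vector by some encoder, and it treats two sentences as paraphrases when the cosine similarity of their vectors reaches a threshold. The function names and the threshold value of 0.8 are hypothetical choices made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)


def is_paraphrase(u, v, threshold=0.8):
    """Hypothetical decision rule: similar enough vectors count as paraphrases.

    The threshold is an assumption for illustration, not a value from the paper.
    """
    return cosine_similarity(u, v) >= threshold
```

In practice the threshold would be tuned on labelled sentence pairs, and a trained classifier over the two vectors could replace the fixed rule.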

Title: A Deep Network Model for Paraphrase Detection in Punjabi
Authors: Arwinder Singh, Gurpreet Singh Josan
Publisher: Springer Singapore
Book: Recent Innovations in Computing
Print ISBN: 978-981-15-8296-7

Electronic ISBN: 978-981-15-8297-4

Copyright Year: 2021
DOI: https://doi.org/10.1007/978-981-15-8297-4_15
