2021 | OriginalPaper | Chapter

A Single Document Assamese Text Summarization Using a Combination of Statistical Features and Assamese WordNet

Authors: Nomi Baruah, Shikhar Kr. Sarma, Surajit Borkotokey

Published in: Progress in Advanced Computing and Intelligent Engineering

Publisher: Springer Singapore

Abstract

In this paper, an extractive text summarization approach using the Assamese WordNet is proposed, and the difficulties faced while extracting summaries from Assamese documents are discussed. Assamese is a low-resource language. Synsets are drawn from the Assamese WordNet. Several features are considered for identifying the most salient sentences and generating an effective summary: TF-IDF, sentence length, sentence position and numerical identification. Automatic text summarization for Assamese is still at an early stage, and no approach has yet been established specifically for the language. The proposed approach is therefore compared with approaches developed for Bengali (Bangla), whose script is very similar to that of Assamese, with slight variations in certain letters. The effectiveness of the proposed approach is demonstrated through a set of experiments evaluated with the ROUGE measure, and the results are reported in terms of Precision, Recall and F1-score.
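
The abstract names four sentence features and a ROUGE-based evaluation without giving formulas. The sketch below is only an illustration of how such features might be combined and scored, not the authors' implementation. Everything specific in it is an assumption: whitespace tokenization, treating each sentence as a "document" for TF-IDF, equal feature weights, the helper names `score_sentences`, `summarize` and `rouge_1`, and the use of unigram ROUGE. The WordNet synset step is omitted because the abstract does not describe how it is applied.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: plain whitespace tokenization, not the paper's tokenizer.
    return sentence.split()


def score_sentences(sentences):
    """Return one score per sentence from four equally weighted features."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Document frequency: number of sentences that contain each term.
    df = Counter(t for toks in tokens for t in set(toks))
    max_len = max((len(toks) for toks in tokens), default=0)
    scores = []
    for i, toks in enumerate(tokens):
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        # TF-IDF summed over the sentence's terms (sentence = "document").
        tfidf = sum(c / len(toks) * math.log(n / df[t]) for t, c in tf.items())
        length = len(toks) / max_len          # longer sentences score higher
        position = 1.0 - i / n                # earlier sentences score higher
        # str.isdigit() is also true for Assamese/Bengali digits such as "৩".
        numeric = 1.0 if any(ch.isdigit() for ch in sentences[i]) else 0.0
        scores.append(tfidf + length + position + numeric)
    return scores


def summarize(sentences, k):
    """Pick the k highest-scoring sentences, kept in document order."""
    scores = score_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def rouge_1(candidate, reference):
    """Unigram-overlap precision, recall and F1 with clipped counts."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1
```

With equal weights the numeric and position features can dominate short documents; in practice the weights would need to be tuned against reference summaries.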

Metadata
Copyright Year
2021
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-15-6353-9_12