Published in: Soft Computing 7/2022

5 January 2022 | Optimization

A developed framework for multi-document summarization using softmax regression and spider monkey optimization methods

Authors: Praveen K. Wilson, J. R. Jeba



Abstract

Multi-document summarization (MDS) methods are attracting considerable attention in many fields, especially for online documents, because they convey information to users by generating a succinct and comprehensive summary. The summarized document contains the summary of several documents on the same topic. Here, the Spider Monkey Optimization (SMO) algorithm is introduced for summary generation. First, the multiple documents are compressed into a single document, and different pre-processing methods are applied to remove unwanted words from it. Then, semantic and syntactic features are extracted from the document using different methods. The extracted features are passed to the softmax regression (SR) technique for further processing. Finally, the SMO algorithm is used to generate the summary of the whole document. The proposed text summarization process is implemented in Python using the BBC news dataset and the DUC (Document Understanding Conference) 2002, 2006, and 2007 datasets. During pre-processing, tokenization is performed with the Natural Language Toolkit (NLTK) and lemmatization with the WordNet lemmatizer. Recall, F-measure, and precision are used for performance evaluation, and the accuracy of this method is found to be better than that of other existing MDS techniques.
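
Two pieces of the pipeline named in the abstract can be illustrated with a short Python sketch: the softmax step that softmax regression uses to turn per-class scores into probabilities, and precision, recall, and F-measure computed from unigram overlap between a candidate summary and a reference. This is only an illustration under stated assumptions. The abstract does not give the features, weights, class labels, or exact evaluation protocol, so the function names, the two-feature sentence, and the weight values below are all hypothetical and are not the authors' implementation.

```python
import math
from collections import Counter


def softmax(scores):
    """Map raw class scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sentence_class_probs(features, weights, biases):
    """Softmax regression forward pass: one linear score per class, then softmax."""
    scores = [
        sum(w * x for w, x in zip(class_weights, features)) + b
        for class_weights, b in zip(weights, biases)
    ]
    return softmax(scores)


def precision_recall_f1(candidate_tokens, reference_tokens):
    """Unigram-overlap precision, recall and F-measure using clipped counts."""
    cand, ref = Counter(candidate_tokens), Counter(reference_tokens)
    overlap = sum((cand & ref).values())  # shared tokens, clipped per word
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical sentence with two features and two classes (exclude, include).
probs = sentence_class_probs(
    [0.8, 0.3],                   # assumed feature values
    [[-1.0, 0.5], [1.0, -0.5]],   # assumed per-class weights
    [0.0, 0.0],                   # assumed per-class biases
)

# Toy candidate and reference summaries, tokenized by whitespace.
p, r, f = precision_recall_f1(
    "the cat sat on the mat".split(),
    "the cat lay on the mat".split(),
)
```

In this sketch `probs` holds one probability per class, and a sentence would be favored for inclusion when the second entry is the larger one. The whitespace split stands in for the NLTK tokenization and WordNet lemmatization mentioned in the abstract, which are left out to keep the example free of external dependencies.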


Metadata
Title
A developed framework for multi-document summarization using softmax regression and spider monkey optimization methods
Authors
Praveen K. Wilson
J. R. Jeba
Publication date
5 January 2022
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 7/2022
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-021-06694-1
