Enhancing multi-document summarization using concepts

Rao, Pattabhi R K; Lalitha Devi, S

doi:10.1007/s12046-018-0789-y

Published: 10 March 2018

Sādhanā, volume 43, article number 27 (2018)

Abstract

In this paper we propose a methodology to mine concepts from documents and use these concepts to generate an objective summary of all relevant documents. We use the conceptual graph (CG) formalism proposed by Sowa to represent the concepts and their relationships in the documents. In the present work we modify and extend Sowa's definition of a concept; the modified definition is discussed in detail in section 2 of this paper. The CG of a set of relevant documents can be considered a semantic network, which is generated by automatically extracting a CG for each document and merging them into one. We discuss (i) generation of the semantic network using CGs and (ii) generation of the multi-document summary. We use restricted Boltzmann machines, a deep learning technique, to extract CGs automatically. We have tested our methodology on the MultiLing 2015 corpus and obtained encouraging results, comparable to those of state-of-the-art systems.
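
The merging step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, that each per-document CG is already available as a set of (concept, relation, concept) triples, and that two concepts are the same node when their normalized labels match. The names `normalize` and `merge_graphs`, the relation labels, and the sample documents are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy representation: one conceptual graph (CG) per document,
# stored as a set of (concept, relation, concept) triples.
# This is an assumption for illustration, not the paper's data structure.
Triple = tuple[str, str, str]


def normalize(label: str) -> str:
    """Lower-case and collapse whitespace so matching concepts share one node."""
    return " ".join(label.lower().split())


def merge_graphs(graphs: list[set[Triple]]) -> dict[Triple, int]:
    """Merge per-document CGs into one network.

    The value stored for each edge is the number of documents containing it.
    """
    support: dict[Triple, int] = defaultdict(int)
    for graph in graphs:
        # Deduplicate inside a document so each document counts once per edge.
        edges = {(normalize(src), rel, normalize(dst)) for src, rel, dst in graph}
        for edge in edges:
            support[edge] += 1
    return dict(support)


# Made-up sample graphs for two documents.
doc1 = {("Flood", "located_in", "Chennai"), ("Flood", "caused_by", "Heavy  rain")}
doc2 = {("flood", "located_in", "chennai"), ("Army", "performs", "Rescue")}

network = merge_graphs([doc1, doc2])
for edge, count in sorted(network.items()):
    print(edge, count)
```

Edges supported by several documents are natural candidates for a summary, although how the paper actually selects content from the merged network is not stated in the abstract.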

References

Mani I 2001 Summarization evaluation: an overview. In: Proceedings of NTCIR
Luhn H P 1958 The automatic creation of literature abstracts. IBM J. Res. Dev. 2(2): 159–165
Lin C Y and Hovy E H 2002 From single to multi-document summarization: a prototype system and its evaluation. In: Proceedings of ACL-2002, pp. 457–464
Radev D, Jing H, Stys M and Tam D 2004 Centroid-based summarization of multiple documents. Inf. Process. Manage. 40: 919–938
Kleinberg J M 1999 Authoritative sources in a hyperlinked environment. J. ACM 46(5): 604–632
Brin S and Page L 1998 The anatomy of a large-scale hypertextual Web search engine. Comput. Netw. ISDN Syst. 30(1–7): 107–117
Erkan G and Radev D 2004 Lexpagerank: prestige in multi-document text summarization. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain, July
Mihalcea R 2004 Graph-based ranking algorithms for sentence extraction, applied to text summarization. In: Proceedings of ACL 2004 on Interactive Poster and Demonstration Sessions (ACLdemo 2004), Barcelona, Spain
Mihalcea R and Tarau P 2004 TextRank – bringing order into texts. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), Barcelona, Spain
Mihalcea R, Tarau P and Figa E 2004 PageRank on semantic networks, with application to word sense disambiguation. In: Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland
McKeown K and Radev D 1995 Generating summaries of multiple news articles. In: Proceedings of the 18th Annual International ACM, Seattle, WA, pp. 74–82
Virendra G and Tanveer J S 2012 Multi-document summarization using sentence clustering. In: IEEE Proceedings of the 4th International Conference on Intelligent Human–Computer Interaction, Kharagpur, India, pp. 314–318
Sowa J F 1984 Conceptual structures, information processing in mind and machine. Addison Wesley, Boston, MA, USA
Smith E E and Medin D L 1981 Categories and concepts. Harvard University Press, Cambridge, MA, USA
Sowa J F 1976 Conceptual graphs for a data base interface. IBM J. Res. Dev. 20(4): 336–357
Sag I A, Baldwin T, Bond F, Copestake A and Flickinger D 2002 Multiword expressions: a pain in the neck for NLP. In: Proceedings of the 3rd International Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2002), Mexico City, Mexico, pp. 1–15
Brill E 1994 Some advances in transformation based part of speech tagging. In: Proceedings of the Twelfth International Conference on Artificial Intelligence (AAAI-94), Seattle, WA, pp. 722–727
Ngai G and Florian R 2001 Transformation-based learning in the fast lane. In: Proceedings of NAACL’2001, Pittsburgh, PA, pp. 40–47
Lafferty J, McCallum A and Pereira F 2001 Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the 18th International Conference on Machine Learning (ICML-2001), pp. 282–289
Hinton G and Salakhutdinov R 2006 Reducing the dimensionality of data with neural networks. Science 313(5786): 504–507
Srivastava N, Salakhutdinov R R and Hinton G E 2013 Modeling documents with a deep Boltzmann machine. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI)
Rao P R K and Lalitha Devi S 2015 Automatic identification of conceptual structures using deep Boltzmann machines. In: Proceedings of the Forum for Information Retrieval and Evaluation, ACM DL, Gandhinagar, India, pp. 16–80
Mikolov T, Chen K, Corrado G and Dean J 2013 Efficient estimation of word representations in vector space. In: Proceedings of the Workshop at ICLR
Blum N 2001 A simplified realization of the Hopcroft–Karp approach to maximum matching in general graphs. Tech. Rep. 895549-CS, Computer Science Department, University of Bonn
Hopcroft J E and Karp R M 1973 An n^{5/2} algorithm for maximum matchings in bipartite graphs. SIAM J. Comput. 2(4): 225–231, https://doi.org/10.1137/0202019
Giannakopoulos G, Kubina J, Conroy J M, Steinberger J, Favre B, Kabadjov M, Kruschwitz U and Poesio M 2015 MultiLing 2015: multilingual summarization of single and multi-documents, on-line fora, and call-center conversations. In: Proceedings of SIGDIAL, Prague, pp. 270–274
Yang S Y and Soo V W 2012 Extract conceptual graphs from plain texts in patent claims. J. Eng. Appl. Artif. Intell. 25(4): 874–887
Rao P R K, Lalitha Devi S and Rosso P 2013 Automatic identification of concepts and conceptual relations from patents using machine learning methods. In: Proceedings of the 10th International Conference on Natural Language Processing (ICON 2013), Noida, India
Lin C Y 2004 ROUGE: a package for automatic evaluation of summaries. In: Proceedings of the Workshop on Text Summarization Branches Out, Barcelona, Spain
Lin C Y and Hovy E 2003 Automatic evaluation of summaries using n-gram co-occurrence. In: Proceedings of the 2003 Language Technology Conference (HLT-NAACL 2003), Edmonton, Canada, pp. 71–78

Author information

Authors and Affiliations

AU-KBC Research Centre, MIT Campus of Anna University, Chennai, 600044, India
Pattabhi R K Rao & S Lalitha Devi

Corresponding authors

Correspondence to Pattabhi R K Rao or S Lalitha Devi.

About this article

Cite this article

Rao, P.R.K., Lalitha Devi, S. Enhancing multi-document summarization using concepts. Sādhanā 43, 27 (2018). https://doi.org/10.1007/s12046-018-0789-y

Received: 21 November 2016
Revised: 12 May 2017
Accepted: 19 June 2017
Published: 10 March 2018
DOI: https://doi.org/10.1007/s12046-018-0789-y
