ABSTRACT
Query-based multi-document summarization aims to create a short summary from a collection of documents and a query. Most existing methods treat the query as a single sentence and rank the sentences in the documents by their similarity to that sentence. However, these methods lack an in-depth analysis of the given query, which typically consists of several topic aspects. In this paper, we propose a topic aspect extraction method to discover the aspect words and sentences contained in the query narrative texts and the input documents, and we then incorporate these aspect words and sentences into a cross-propagation model based on the sentence-term bipartite graph for document summarization. Experiments on DUC benchmark data show the effectiveness of our proposed approach on the topic-driven document summarization task.
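To make the idea of propagating scores across a sentence-term bipartite graph more concrete, here is a minimal sketch in Python. It is not the model from the paper: the update rule, the `alpha` blending weight, the whitespace tokenizer, and the function name `cross_propagate` are all assumptions chosen for illustration. Aspect words seed the term side of the graph, and scores then flow back and forth between terms and the sentences that contain them.

```python
from collections import Counter


def cross_propagate(sentences, aspect_words, alpha=0.5, iterations=20):
    """Rank sentences by propagating scores over a sentence-term bipartite graph.

    Illustrative only: the update rule, `alpha`, and the seeding scheme are
    assumptions, not the model described in the paper.
    """
    # Naive whitespace tokenization; a real system would stem and drop stop words.
    bags = [Counter(s.lower().split()) for s in sentences]
    terms = sorted({t for bag in bags for t in bag})

    # Seed scores: aspect words start at 1, every other term at 0.
    aspect = {w.lower() for w in aspect_words}
    term_seed = {t: 1.0 if t in aspect else 0.0 for t in terms}
    term_score = dict(term_seed)
    sent_score = [0.0] * len(sentences)

    def normalize(values):
        total = sum(values)
        return [v / total if total else 0.0 for v in values]

    for _ in range(iterations):
        # Terms -> sentences: a sentence gains score from the terms it contains.
        sent_score = normalize(
            [sum(count * term_score[t] for t, count in bag.items()) for bag in bags]
        )
        # Sentences -> terms: a term gains score from the sentences containing it,
        # blended with its seed so the query aspects keep steering the ranking.
        propagated = normalize(
            [sum(bag[t] * sent_score[i] for i, bag in enumerate(bags)) for t in terms]
        )
        term_score = {
            t: alpha * p + (1 - alpha) * term_seed[t]
            for t, p in zip(terms, propagated)
        }

    return sent_score


# Hypothetical toy input, not taken from the DUC data.
sentences = [
    "solar power cuts energy costs",
    "the weather was sunny today",
    "wind and solar energy are renewable",
]
scores = cross_propagate(sentences, aspect_words=["solar", "energy"])
```

In this toy input the second sentence shares no terms with the aspect words or with the other sentences, so no score ever reaches it, while the two sentences that mention the aspect words split the total score between them. A summarizer built on this idea would select the top-scoring sentences, typically with an additional redundancy check.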
Index Terms
- Topic aspect analysis for multi-document summarization