ABSTRACT
We consider the problem of question-focused sentence retrieval from complex news articles describing multi-event stories published over time. Annotators generated a list of questions central to understanding each story in our corpus. Because of the dynamic nature of the stories, many questions are time-sensitive (e.g., "How many victims have been found?"). Judges found sentences providing an answer to each question. To address the sentence retrieval problem, we apply a stochastic, graph-based method for comparing the relative importance of textual units, which was previously used successfully for generic summarization. Here, we present a topic-sensitive version of our method and hypothesize that it can outperform a competitive baseline, which compares the similarity of each sentence to the input question via IDF-weighted word overlap. In our experiments, the method achieves a TRDR score that is significantly higher than that of the baseline.
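The abstract does not give formulas, so the following is only a minimal sketch of how a topic-sensitive random walk over a sentence similarity graph might be combined with an IDF-weighted overlap score. Everything specific here is an assumption made for illustration rather than the paper's implementation: the tokenizer, the smoothed IDF, the log-scaled overlap in `relevance`, the cosine edge weights, the hypothetical `bias_weight` parameter and its default value, and the fixed iteration count.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on non-alphanumeric characters."""
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def idf_table(sentences):
    """Smoothed IDF, treating each sentence as one document (assumed form)."""
    n = len(sentences)
    df = Counter(w for s in sentences for w in set(tokenize(s)))
    return {w: math.log((n + 1) / (0.5 + c)) for w, c in df.items()}


def relevance(sentence, question, idf):
    """Baseline-style score: IDF-weighted overlap of sentence and question words."""
    s_tf, q_tf = Counter(tokenize(sentence)), Counter(tokenize(question))
    return sum(
        math.log(s_tf[w] + 1) * math.log(q_tf[w] + 1) * idf.get(w, 0.0)
        for w in q_tf
    )


def cosine(a, b, idf):
    """IDF-weighted cosine similarity between two sentences."""
    a_tf, b_tf = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(a_tf[w] * b_tf[w] * idf[w] ** 2 for w in a_tf if w in b_tf)
    norm_a = math.sqrt(sum((a_tf[w] * idf[w]) ** 2 for w in a_tf))
    norm_b = math.sqrt(sum((b_tf[w] * idf[w]) ** 2 for w in b_tf))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def biased_walk(sentences, question, bias_weight=0.85, iters=100):
    """Score sentences with a random walk that restarts at question-relevant ones.

    bias_weight (hypothetical name and default) is the probability of jumping
    to a sentence in proportion to its relevance to the question; otherwise
    the walk follows similarity edges between sentences.
    """
    n = len(sentences)
    idf = idf_table(sentences)
    rel = [relevance(s, question, idf) for s in sentences]
    total = sum(rel)
    # Fall back to a uniform restart distribution if nothing overlaps.
    bias = [r / total for r in rel] if total else [1.0 / n] * n
    sim = [[cosine(a, b, idf) for b in sentences] for a in sentences]
    col_sums = [sum(sim[i][j] for i in range(n)) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iters):  # power iteration with a fixed number of steps
        scores = [
            bias_weight * bias[i]
            + (1 - bias_weight)
            * sum(sim[i][j] / col_sums[j] * scores[j] for j in range(n) if col_sums[j])
            for i in range(n)
        ]
    return scores


if __name__ == "__main__":
    story = [
        "Rescue teams reached the site on Monday.",
        "Officials said twelve victims have been found so far.",
        "The storm also damaged roads in the region.",
        "More victims may be found as the search continues.",
    ]
    ranked = sorted(
        zip(biased_walk(story, "How many victims have been found?"), story),
        reverse=True,
    )
    for score, sentence in ranked:
        print(f"{score:.3f}  {sentence}")
```

In this sketch, ranking by `relevance` alone plays the role of the overlap baseline, while `biased_walk` also lets sentences gain score from similar, question-relevant neighbours. Setting `bias_weight` to 1.0 reduces the walk to the normalized baseline scores.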
- Using random walks for question-focused sentence retrieval