Summarizing the differences in multilingual news

Authors:
Xiaojun Wan, Peking University, Beijing, China
Houping Jia, Peking University, Beijing, China
Shanshan Huang, Peking University, Beijing, China
Jianguo Xiao, Peking University, Beijing, China

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, July 2011, Pages 735–744
https://doi.org/10.1145/2009916.2010015

Published: 24 July 2011

ABSTRACT

A hot news event is usually covered by many news articles written in different languages, and articles in different languages are written in different ways to reflect different standpoints. For example, Chinese and Western news agencies have published many articles reporting the same news of "Liu Xiaobo's Nobel Prize" in Chinese and English, respectively. The Chinese and English news articles share some facts about the event, but they focus on different aspects in order to reflect different standpoints. In this paper, we investigate the task of multilingual news summarization, which aims to find and summarize the major differences between Chinese and English news articles about the same event. We propose a novel constrained co-ranking (C-CoRank) method for this task. C-CoRank adds constraints between the difference score and the common score of each sentence to the co-ranking process. Evaluation results on a manually labeled test set of 15 news topics show the effectiveness of the proposed method: constrained co-ranking outperforms a few baselines and the typical co-ranking method.
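
The abstract does not give the update rules or the exact form of the constraint, so the following is only a toy illustration of the general idea, not the authors' C-CoRank algorithm. It assumes that "common" evidence propagates across languages, that "difference" evidence propagates within a language, and that the constraint takes the hypothetical form diff[i] + common[i] = 1 for every sentence. All function names, matrices, and values are invented for illustration.

```python
def scale_to_unit(values):
    """Divide by the largest value so that scores lie in [0, 1]."""
    peak = max(values)
    return [v / peak if peak > 0 else 0.0 for v in values]


def apply_constraint(diff_raw, common_raw):
    """ASSUMED constraint: rescale so that diff[i] + common[i] == 1."""
    diff, common = [], []
    for d, c in zip(diff_raw, common_raw):
        total = d + c
        if total > 0:
            diff.append(d / total)
            common.append(c / total)
        else:
            # No evidence either way: split the unit score evenly.
            diff.append(0.5)
            common.append(0.5)
    return diff, common


def toy_constrained_corank(within_a, within_b, cross, iterations=20):
    """Rank sentences of two languages (A and B) by difference and common scores.

    within_a[i][k]: similarity between sentences i and k of language A.
    within_b[j][k]: similarity between sentences j and k of language B.
    cross[i][j]:    similarity between sentence i of A and sentence j of B.
    Returns (diff_a, common_a, diff_b, common_b).
    """
    n_a, n_b = len(within_a), len(within_b)
    diff_a, common_a = [0.5] * n_a, [0.5] * n_a
    diff_b, common_b = [0.5] * n_b, [0.5] * n_b
    for _ in range(iterations):
        # Common evidence flows across languages (assumption of this sketch).
        raw_common_a = [sum(cross[i][j] * common_b[j] for j in range(n_b))
                        for i in range(n_a)]
        raw_common_b = [sum(cross[i][j] * common_a[i] for i in range(n_a))
                        for j in range(n_b)]
        # Difference evidence flows within a language (assumption of this sketch).
        raw_diff_a = [sum(within_a[i][k] * diff_a[k] for k in range(n_a))
                      for i in range(n_a)]
        raw_diff_b = [sum(within_b[j][k] * diff_b[k] for k in range(n_b))
                      for j in range(n_b)]
        # Tie the two scores of each sentence together after every update.
        diff_a, common_a = apply_constraint(scale_to_unit(raw_diff_a),
                                            scale_to_unit(raw_common_a))
        diff_b, common_b = apply_constraint(scale_to_unit(raw_diff_b),
                                            scale_to_unit(raw_common_b))
    return diff_a, common_a, diff_b, common_b


# Invented data: sentence 0 of each language has a close counterpart in the
# other language, while sentence 1 of each language does not.
within = [[0.0, 0.5], [0.5, 0.0]]
cross = [[0.9, 0.1], [0.1, 0.1]]
diff_a, common_a, diff_b, common_b = toy_constrained_corank(within, within, cross)
print(diff_a, common_a)
```

In this toy setting, the sentence without a cross-language counterpart ends up with the higher difference score, which is the kind of sentence a difference-oriented summary would favor. The actual method may couple the scores differently.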

Published in
SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
July 2011
1374 pages
ISBN: 9781450307574
DOI: 10.1145/2009916
General Chairs: Wei-Ying Ma (Microsoft Research Asia, China), Jian-Yun Nie (University of Montreal, Canada)
Program Chairs: Ricardo Baeza-Yates (Yahoo! Research, Spain), Tat-Seng Chua (National University of Singapore), W. Bruce Croft (University of Massachusetts, Amherst, USA)
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
constrained co-ranking
multi-document summarization
multilingual summarization
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%