Published in: Knowledge and Information Systems 2/2019

17.01.2018 | Short Paper

Cross-language document summarization via extraction and ranking of multiple summaries

Written by: Xiaojun Wan, Fuli Luo, Xue Sun, Songfang Huang, Jin-ge Yao


Abstract

The task of cross-language document summarization aims to produce a summary in a target language (e.g., Chinese) for a given document set in a different source language (e.g., English). Previous studies focus on ranking and selection of translated sentences in the target language. In this paper, we propose a new framework for addressing the task by extraction and ranking of multiple summaries in the target language. First, we extract multiple candidate summaries by proposing several schemes for improving the upper-bound quality of the summaries. Then, we propose a new ensemble ranking method for ranking the candidate summaries by making use of bilingual features. Extensive experiments have been conducted on a benchmark dataset and the results verify the effectiveness of our proposed framework, which outperforms a variety of baselines, including supervised baselines.
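The two-stage idea in the abstract (generate several candidate summaries, then rank them with an ensemble of signals) can be illustrated with a minimal sketch. The abstract does not specify the paper's ensemble ranking method or its bilingual features, so everything here is an assumption made for illustration: the feature names, the scores, and the use of a simple Borda-style average of ranks as the combination rule.

```python
from typing import Dict, List


def rank_positions(scores: List[float]) -> List[int]:
    """Return the 0-based rank of each item (0 = highest score)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, index in enumerate(order):
        ranks[index] = position
    return ranks


def ensemble_rank(candidates: List[str],
                  feature_scores: Dict[str, List[float]]) -> List[str]:
    """Order candidates by summed rank across all scorers (Borda-style).

    This is a stand-in combination rule, not the method from the paper.
    """
    total = [0] * len(candidates)
    for scores in feature_scores.values():
        for i, rank in enumerate(rank_positions(scores)):
            total[i] += rank
    # A lower summed rank means the candidate was preferred more consistently.
    order = sorted(range(len(candidates)), key=lambda i: total[i])
    return [candidates[i] for i in order]


# Hypothetical candidate summaries in the target language.
candidates = ["summary_a", "summary_b", "summary_c"]

# Hypothetical bilingual signals: one score per candidate, per scorer.
feature_scores = {
    "source_side_salience": [0.2, 0.9, 0.5],
    "target_side_salience": [0.3, 0.7, 0.8],
    "translation_quality": [0.4, 0.8, 0.6],
}

ranked = ensemble_rank(candidates, feature_scores)
print(ranked[0])
```

Averaging ranks rather than raw scores avoids having to put scorers with different scales on a common footing; a learned ranker would replace this rule in a supervised setting.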


Metadata
Title
Cross-language document summarization via extraction and ranking of multiple summaries
Written by
Xiaojun Wan
Fuli Luo
Xue Sun
Songfang Huang
Jin-ge Yao
Publication date
17.01.2018
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2019
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-018-1152-7
