2018 | Original Paper | Book Chapter

Citation Based Collaborative Summarization of Scientific Publications by a New Sentence Similarity Measure

Authors: Chengzhe Yuan, Dingding Li, Jia Zhu, Yong Tang, Shahbaz Wasti, Chaobo He, Hai Liu, Ronghua Lin

Published in: Collaborative Computing: Networking, Applications and Worksharing

Publisher: Springer International Publishing

Abstract

Next-generation networks offer researchers unrestricted access to all kinds of scientific publications, and collaborative summarization systems are now being considered as a service that helps researchers absorb information as they read scientific articles. One way to improve the quality of such a system is to measure the semantic similarity between sentences. In this paper, we introduce a new sentence similarity measure for summarizing scientific articles with citation context. Our work builds on a recent document distance metric called the word mover’s distance (WMD). Compared with traditional similarity measures, the WMD-based sentence similarity measure performs better because it captures the semantic relation between two sentences. Experiments on the 2016 version of the ACL Anthology Reference Corpus show that our approach outperforms several baselines on ROUGE metrics.
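
To make the idea of a WMD-based sentence similarity concrete, here is a minimal sketch in Python. It is not the authors' implementation. It computes the relaxed word mover's distance, a cheaper lower bound on WMD in which each word sends all of its weight to the nearest word of the other sentence, rather than solving the full transport problem. The toy two-dimensional vectors, the function names, and the mapping from distance to similarity are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical 2-D word vectors for illustration only; a real system would
# load pretrained embeddings (for example word2vec) instead.
TOY_VECTORS = {
    "method": (1.0, 0.0),
    "approach": (0.9, 0.1),
    "improves": (0.0, 1.0),
    "boosts": (0.1, 0.9),
    "summaries": (1.0, 1.0),
    "summarization": (0.9, 1.1),
}


def nbow(tokens):
    """Normalized bag-of-words weights over tokens that have a vector."""
    counts = Counter(t for t in tokens if t in TOY_VECTORS)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def one_sided_cost(src, dst):
    """Move each source word's full weight to its nearest destination word."""
    return sum(
        weight * min(math.dist(TOY_VECTORS[w], TOY_VECTORS[v]) for v in dst)
        for w, weight in src.items()
    )


def relaxed_wmd(tokens_a, tokens_b):
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound on WMD)."""
    a, b = nbow(tokens_a), nbow(tokens_b)
    if not a or not b:
        raise ValueError("each sentence needs at least one known word")
    return max(one_sided_cost(a, b), one_sided_cost(b, a))


def similarity(tokens_a, tokens_b):
    """Map distance into (0, 1]; this mapping is an assumption, not the paper's."""
    return 1.0 / (1.0 + relaxed_wmd(tokens_a, tokens_b))


if __name__ == "__main__":
    s1 = ["method", "improves", "summaries"]
    s2 = ["approach", "boosts", "summarization"]
    print(relaxed_wmd(s1, s2), similarity(s1, s2))
```

The two example sentences share no words, so a purely lexical overlap measure would score them as unrelated, while the embedding-based distance stays small because each word has a close neighbor in the other sentence.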

Metadata
Title
Citation Based Collaborative Summarization of Scientific Publications by a New Sentence Similarity Measure
Authors
Chengzhe Yuan
Dingding Li
Jia Zhu
Yong Tang
Shahbaz Wasti
Chaobo He
Hai Liu
Ronghua Lin
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-00916-8_62
