Published in: International Journal on Digital Libraries, Issue 2-3/2018

9 May 2017

Scientific document summarization via citation contextualization and scientific discourse

Authors: Arman Cohan, Nazli Goharian

Abstract

The rapid growth of scientific literature has made it difficult for researchers to quickly learn about developments in their respective fields. Scientific summarization addresses this challenge by providing summaries of the important contributions of scientific papers. We present a framework for scientific summarization that takes advantage of citations and the scientific discourse structure. Citation texts often lack the evidence and context to support the content of the cited paper and are sometimes even inaccurate. We first address the problem of inaccuracy of the citation texts by finding the relevant context from the cited paper. We propose three approaches for contextualizing citations, based on query reformulation, word embeddings, and supervised learning. We then train a model to identify the discourse facets for each citation. Finally, we propose a method for summarizing scientific papers by leveraging the faceted citations and their corresponding contexts. We evaluate our proposed method on two scientific summarization datasets in the biomedical and computational linguistics domains. Extensive evaluation results show that our methods can improve over the state of the art by large margins.
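
As a rough illustration of what "finding the relevant context from the cited paper" can look like, the sketch below ranks spans of consecutive sentences from a cited paper by their similarity to a citation text. It is not one of the three approaches named in the abstract: plain TF-IDF cosine similarity is swapped in as a simple stand-in, and every name here (`tokenize`, `tfidf_vectors`, `cosine`, `contextualize`, `max_span`, `top_k`) and the sample sentences are assumptions made for illustration. The default of three consecutive sentences per span mirrors the indexing limit mentioned in the footnotes.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF: terms that occur in every document keep a small weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def contextualize(citation_text, cited_sentences, max_span=3, top_k=1):
    """Rank spans of 1..max_span consecutive sentences against the citation.

    Returns up to top_k (score, (start, end)) pairs, where the span covers
    cited_sentences[start:end]. TF-IDF cosine is an assumed stand-in scorer.
    """
    spans = [
        (start, start + length)
        for start in range(len(cited_sentences))
        for length in range(1, max_span + 1)
        if start + length <= len(cited_sentences)
    ]
    span_tokens = [tokenize(" ".join(cited_sentences[s:e])) for s, e in spans]
    # The citation text is appended last so it shares the same IDF statistics.
    vectors = tfidf_vectors(span_tokens + [tokenize(citation_text)])
    query = vectors[-1]
    scored = [(cosine(query, vec), span) for vec, span in zip(vectors, spans)]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical sentences standing in for a cited paper.
    cited = [
        "We collected tissue samples from forty patients.",
        "The protein p53 suppresses tumor growth in mice.",
        "Funding was provided by a national grant.",
    ]
    citation = "Prior work showed that p53 suppresses tumor growth."
    for score, (start, end) in contextualize(citation, cited, top_k=3):
        print(round(score, 3), cited[start:end])
```

Scoring multi-sentence spans rather than single sentences lets a citation match context that is spread over neighbouring sentences, at the cost of longer spans being penalized by their larger vector norm.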

Footnotes
3
Term Frequency-Inverse Document Frequency.
 
4
We indexed up to 3 consecutive sentences in our experiments.
 
5
We empirically set this threshold to 1.9 and 2.2 for the TAC and CL-SciSum datasets, respectively.
 
7
Medical Subject Headings.
 
10
National Institute of Standards and Technology.
 
13
We do not report results of the supervised model on the TAC dataset because the TAC data do not have separate train and test sets.
 
14
The cut-off point has a similar effect on all the models.
 
Metadata
Title
Scientific document summarization via citation contextualization and scientific discourse
Authors
Arman Cohan
Nazli Goharian
Publication date
9 May 2017
Publisher
Springer Berlin Heidelberg
Published in
International Journal on Digital Libraries / Issue 2-3/2018
Print ISSN: 1432-5012
Electronic ISSN: 1432-1300
DOI
https://doi.org/10.1007/s00799-017-0216-8
