Published in: Cluster Computing 1/2018

02-06-2017

Extracting reference text from citation contexts

Authors: Afsheen Khalid, Fakhri Alam, Imran Ahmed

Abstract

Information from the textual context of citations in scientific articles has been studied and used in many applications by the research community, including topic modeling, sentiment analysis, scientific paper summarization and information retrieval. However, these applications struggle to identify the correct citation context window and instead use the text in a fixed-size window around the citation mention. As a result, a citation context may contain terms or other text that does not describe the citation and should not be included in it. Identifying such non-reference text in the citation context is a non-trivial yet significant task. In this paper, we attempt to identify and remove non-reference text from citation contexts by developing a heuristic algorithm based on pruning the transition-based dependency parse tree. Evaluated on a testing dataset of 88 research articles with varying numbers of citations, our algorithm achieved 77% macro-precision, 83% macro-recall and 80% macro-F. Additionally, we find that for many of the cited articles in our testing dataset, objective citation contexts outnumber subjective ones.
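The macro-averaged scores reported in the abstract can be illustrated with a minimal sketch. It assumes that precision and recall are computed per item (for example, per article) over sets of extracted versus gold reference-text tokens, then averaged, and that macro-F is the harmonic mean of the two averages. The abstract does not specify either choice, so both are assumptions, and the function names and toy data below are hypothetical.

```python
from typing import List, Set, Tuple


def precision_recall(predicted: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Token-level precision and recall for one item (assumed unit: one article)."""
    overlap = len(predicted & gold)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(gold) if gold else 0.0
    return precision, recall


def harmonic_mean(p: float, r: float) -> float:
    """F-score as the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def macro_scores(pairs: List[Tuple[Set[str], Set[str]]]) -> Tuple[float, float, float]:
    """Average per-item precision and recall, then combine them into macro-F.

    Assumption: macro-F is taken from the macro-averaged precision and recall;
    averaging per-item F-scores is another common convention.
    """
    per_item = [precision_recall(pred, gold) for pred, gold in pairs]
    macro_p = sum(p for p, _ in per_item) / len(per_item)
    macro_r = sum(r for _, r in per_item) / len(per_item)
    return macro_p, macro_r, harmonic_mean(macro_p, macro_r)


# Hypothetical toy data: (tokens kept by the extractor, gold reference-text tokens)
toy_pairs = [
    ({"a", "b", "c", "d"}, {"a", "b", "c"}),  # precision 0.75, recall 1.0
    ({"x", "y"}, {"x", "y", "z", "w"}),       # precision 1.0, recall 0.5
]
macro_p, macro_r, macro_f = macro_scores(toy_pairs)
print(macro_p, macro_r, round(macro_f, 3))
```

Under this convention, the harmonic mean of the reported 77% macro-precision and 83% macro-recall is about 0.80, which matches the reported macro-F. That agreement does not by itself confirm which definition the authors used.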

Metadata

Title: Extracting reference text from citation contexts
Authors: Afsheen Khalid, Fakhri Alam, Imran Ahmed
Publication date: 02-06-2017
Publisher: Springer US
Published in: Cluster Computing | Issue 1/2018
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI: https://doi.org/10.1007/s10586-017-0954-9
