DOI: 10.1145/3383583.3398534
research-article

HybridCite: A Hybrid Model for Context-Aware Citation Recommendation

Published: 01 August 2020

ABSTRACT

Citation recommendation systems aim to recommend citations for either a complete paper or a small portion of text called a citation context. The process of recommending citations for citation contexts is called local citation recommendation and is the focus of this paper. We first develop citation recommendation approaches based on embeddings, topic modeling, and information retrieval techniques. We then combine, for the first time to the best of our knowledge, the best-performing algorithms into a semi-genetic hybrid recommender system for citation recommendation. We evaluate the individual approaches and the hybrid approach offline on several data sets, such as the Microsoft Academic Graph (MAG) and the MAG in combination with arXiv and ACL. We further conduct a user study to evaluate our approaches online. Our evaluation results show that a hybrid model containing embedding-based and information retrieval-based components outperforms its individual components and further algorithms by a large margin.
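
To make the idea of a hybrid recommender concrete, the sketch below merges the ranked candidate lists of two component recommenders for a single citation context. It is only an illustration under stated assumptions: the abstract does not describe how the semi-genetic hybrid combines its components, so the round-robin interleaving used here is a swapped-in, deliberately simple fusion strategy, and the function name `interleave_rankings`, the paper IDs, and the two input lists are all hypothetical.

```python
from itertools import zip_longest
from typing import List, Sequence


def interleave_rankings(rankings: Sequence[Sequence[str]], k: int = 10) -> List[str]:
    """Merge ranked candidate lists round-robin, skipping duplicates.

    NOTE: this is an illustrative fusion strategy, not the paper's
    semi-genetic hybridization, which the abstract does not specify.
    """
    if k <= 0:
        return []
    merged: List[str] = []
    seen = set()
    # zip_longest walks rank position by rank position across all components;
    # shorter lists are padded with None.
    for tier in zip_longest(*rankings):
        for paper_id in tier:
            if paper_id is None or paper_id in seen:
                continue
            seen.add(paper_id)
            merged.append(paper_id)
            if len(merged) == k:
                return merged
    return merged


# Hypothetical outputs of two component recommenders for one citation context
# (e.g. one embedding-based, one information retrieval-based).
embedding_ranked = ["p3", "p1", "p7", "p2"]
retrieval_ranked = ["p1", "p5", "p3", "p9"]

hybrid = interleave_rankings([embedding_ranked, retrieval_ranked], k=5)
print(hybrid)
```

Interleaving needs no score calibration between components, which is why it is a convenient baseline for a sketch; a real system would likely weight or select components based on evaluation results.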



Published in

JCDL '20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020
August 2020
611 pages
ISBN: 9781450375856
DOI: 10.1145/3383583

Copyright © 2020 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 415 of 1,482 submissions, 28%
