Published in: International Journal of Machine Learning and Cybernetics 6/2021

14-01-2021 | Original Article

Joint learning of author and citation contexts for computing drift in scholarly documents

Authors: J. Vijayarani, T. V. Geetha

Abstract

Scholarly documents are sources of information on research topics written by academic experts. Topic drift in such documents is usually linked to contextual variation in the title, the abstract, or the entire document over time. However, the topic distribution over words is non-uniform across the components of a document because authors and citations have varying impact, and their contributions to drift must be handled accordingly. This paper builds a model that distinguishes the context of a research document by author and by citation, incorporating the relations among topic, author, citation, word, and time in the form of an author context vector and a citation context vector. To infer posterior probabilities, a parallel author cited_author topic model is presented. A continuous-time bivariate Brownian motion model is employed to deduce the evolving bivariate topic parameters specific to the author and the citation. The (word, topic) pairs from the author and citation context vectors are jointly learned to yield topical word embeddings over time, conditioned on the author and citation contexts. When evaluated on the NIPS and business journals datasets, the proposed model identifies topical variations over time more precisely than other methods. It is found that topic broadening arises from the author context, whereas topic deviation is caused mainly by the citation context.
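To make the continuous-time bivariate Brownian motion idea concrete, the following is a minimal sketch of the underlying stochastic process only, not the authors' model or inference procedure. It samples a correlated two-component path at irregular time stamps, with increment variance proportional to the elapsed time. The function name `simulate_bivariate_brownian`, the parameters `sigma` and `rho`, and the reading of component 0 as an author-specific parameter and component 1 as a citation-specific parameter are all assumptions made for illustration.

```python
import math
import random


def simulate_bivariate_brownian(times, start=(0.0, 0.0), sigma=(1.0, 1.0),
                                rho=0.0, seed=0):
    """Sample a correlated 2-D Brownian path at increasing time stamps.

    Component 0 stands in for an author-specific topic parameter and
    component 1 for a citation-specific one (illustrative labels only).
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    rng = random.Random(seed)
    path = [tuple(start)]
    for t_prev, t_next in zip(times, times[1:]):
        dt = t_next - t_prev
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Two independent standard normal draws.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Mix them with the Cholesky factor of the 2x2 correlation matrix.
        e1 = z1
        e2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        # Standard deviation of each increment scales with sqrt(dt),
        # so longer gaps between documents allow larger parameter moves.
        x, y = path[-1]
        path.append((x + sigma[0] * math.sqrt(dt) * e1,
                     y + sigma[1] * math.sqrt(dt) * e2))
    return path


# Hypothetical publication times (in years) with uneven gaps.
example_times = [2000.0, 2001.5, 2002.0, 2005.0]
example_path = simulate_bivariate_brownian(example_times, rho=0.5, seed=42)
```

Because the increments depend on the actual gap between time stamps, no fixed discretization of time is needed, which is the usual motivation for a continuous-time formulation. With `rho=0` the two components evolve independently; a nonzero `rho` couples the illustrative author and citation components.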

Metadata
Title
Joint learning of author and citation contexts for computing drift in scholarly documents
Authors
J. Vijayarani
T. V. Geetha
Publication date
14-01-2021
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 6/2021
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-020-01265-6
