Using machine learning techniques for rising star prediction in co-author network

Daud, Ali; Ahmad, Muhammad; Malik, M. S. I.; Che, Dunren

doi:10.1007/s11192-014-1455-8

Published: 15 October 2014

Scientometrics, volume 102, pages 1687–1711 (2015)


Abstract

Online bibliographic databases are powerful resources for research in data mining and social network analysis, especially for the study of co-author networks. Predicting future rising stars means finding brilliant scholars and researchers in co-author networks. In this paper, we propose a solution for rising star prediction that applies machine learning techniques. For the classification task, discriminative and generative modeling techniques are considered, and two algorithms are chosen from each category. Author-, co-authorship- and venue-based information is incorporated, resulting in eleven features with their mathematical formulations. Extensive experiments analyze the impact on classification accuracy of individual features, of feature categories, and of their combination. Two ranking lists of the top 30 scholars are then presented from the predicted rising stars. In addition, the concept is demonstrated by predicting rising stars in the database domain. Data from the DBLP and Arnetminer databases (1996–2000, covering a wide range of disciplines) are used for the experimental analysis of the algorithms.
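As a rough illustration of the three information sources named in the abstract (author, co-authorship and venue), the following minimal sketch derives one simple count per source from toy publication records. The records, the function name `extract_features` and the three counts are assumptions made for illustration only; they are not the paper's eleven features or their formulations, which are not given in this preview.

```python
from collections import defaultdict

# Hypothetical toy records: (list of authors, venue). Not data from the paper.
PAPERS = [
    (["A", "B"], "VLDB"),
    (["A", "C"], "SIGMOD"),
    (["B", "C", "D"], "VLDB"),
    (["A"], "ICDE"),
]


def extract_features(papers):
    """Return one illustrative count per information source for each author.

    papers    -> author-based: number of papers written
    coauthors -> co-authorship-based: number of distinct co-authors
    venues    -> venue-based: number of distinct venues published in
    """
    paper_count = defaultdict(int)
    coauthors = defaultdict(set)
    venues = defaultdict(set)

    for authors, venue in papers:
        for author in authors:
            paper_count[author] += 1
            venues[author].add(venue)
            # Everyone else on the same paper is a co-author.
            coauthors[author].update(a for a in authors if a != author)

    return {
        author: {
            "papers": paper_count[author],
            "coauthors": len(coauthors[author]),
            "venues": len(venues[author]),
        }
        for author in paper_count
    }


features = extract_features(PAPERS)
for author in sorted(features):
    print(author, features[author])
```

Feature vectors of this kind could then be passed to any classifier, generative or discriminative; which algorithms the authors actually use is not stated in the abstract.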


Notes

http://academic.research.microsoft.com/.
Co-author Path and Graph in Microsoft Academic Search.


Author information

Authors and Affiliations

Department of Computer Science and Software Engineering, International Islamic University, Islamabad, Pakistan
Ali Daud & M. S. I. Malik
Department of Computer Science, Allama Iqbal Open University, Islamabad, Pakistan
Muhammad Ahmad
Department of Computer Science, Southern Illinois University, Carbondale, IL, 62901, USA
Dunren Che

Corresponding author

Correspondence to Ali Daud.

About this article

Cite this article

Daud, A., Ahmad, M., Malik, M.S.I. et al. Using machine learning techniques for rising star prediction in co-author network. Scientometrics 102, 1687–1711 (2015). https://doi.org/10.1007/s11192-014-1455-8

Received: 26 May 2014
Published: 15 October 2014
Issue Date: February 2015
DOI: https://doi.org/10.1007/s11192-014-1455-8
