2016 | OriginalPaper | Book chapter

AdaWIRL: A Novel Bayesian Ranking Approach for Personal Big-Hit Paper Prediction

Authors: Chuxu Zhang, Lu Yu, Jie Lu, Tao Zhou, Zi-Ke Zhang

Published in: Web-Age Information Management

Publisher: Springer International Publishing

Abstract

Predicting the most impactful (big-hit) paper among a researcher’s publications, so that it can be well disseminated in advance, not only has a large impact on individual academic success but also provides useful guidance to the research community. In this work, we tackle the following problem: given the corpus of a researcher’s publications from the previous few years, how can we effectively predict which paper will become the big-hit in the future? We explore a series of features that can drive a paper to become the big-hit, and design a novel Bayesian ranking algorithm, AdaWIRL (Adaptive Weighted Impact Ranking Learning), that leverages a weighted training schema and an adaptive, timely false-correction strategy to predict big-hit papers. Experimental results on the large ArnetMiner dataset, with over 1.7 million authors and 2 million papers, demonstrate the effectiveness of AdaWIRL. Specifically, it correctly predicts over 78.3% of all researchers’ big-hit papers and outperforms the compared regression and ranking algorithms by an average of 5.8% and 2.9%, respectively. Further analysis shows that temporal features are the best indicators of personal big-hit papers, while authorship and social features are less relevant. We also demonstrate a high correlation between the impact of a researcher’s future works and their similarity to the predicted big-hit paper.
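
To make the idea of a weighted Bayesian ranking objective concrete, the following is a minimal sketch and not the authors' AdaWIRL implementation. It assumes a linear scoring function and a generic pairwise criterion in the style of Bayesian personalized ranking: for each researcher, the big-hit paper should score higher than each of that researcher's other papers, and each pair carries a weight. The feature layout (early citations, number of co-authors), the pair weights, and all function names are hypothetical, and the adaptive false-correction strategy mentioned in the abstract is not modeled.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(w, x):
    # Linear score of a feature vector x under parameters w.
    return sum(wi * xi for wi, xi in zip(w, x))

def train_pairwise(pairs, dim, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Stochastic gradient ascent on
    sum(weight * ln sigmoid(w . (x_hit - x_other))) - (reg / 2) * ||w||^2.
    pairs: list of (x_hit, x_other, weight) tuples (hypothetical format)."""
    rng = random.Random(seed)
    w = [0.0] * dim
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for x_hit, x_other, weight in pairs:
            diff = [a - b for a, b in zip(x_hit, x_other)]
            # d/dw of weight * ln sigmoid(w . diff) = weight * (1 - sigmoid(w . diff)) * diff
            g = weight * (1.0 - sigmoid(score(w, diff)))
            w = [wi + lr * (g * di - reg * wi) for wi, di in zip(w, diff)]
    return w

def predict_big_hit(w, papers):
    # Index of the highest-scoring paper in one researcher's corpus.
    return max(range(len(papers)), key=lambda i: score(w, papers[i]))

# Toy data (invented for illustration): features are
# [early citations, number of co-authors]; pair weights are all 1.0.
training_pairs = [
    ([5.0, 1.0], [1.0, 3.0], 1.0),  # researcher A: big-hit vs. other paper
    ([5.0, 1.0], [0.0, 2.0], 1.0),  # researcher A: big-hit vs. other paper
    ([6.0, 2.0], [2.0, 4.0], 1.0),  # researcher B: big-hit vs. other paper
]
w = train_pairwise(training_pairs, dim=2)

# Rank the papers of an unseen researcher and pick the predicted big-hit.
new_papers = [[1.0, 5.0], [4.0, 1.0], [2.0, 2.0]]
best = predict_big_hit(w, new_papers)
```

In this sketch the pair weight simply scales each pair's gradient, so pairs with larger weights pull the parameters harder; how AdaWIRL actually sets and adapts its weights is described in the chapter itself, not here.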

Metadata
Title
AdaWIRL: A Novel Bayesian Ranking Approach for Personal Big-Hit Paper Prediction
Authors
Chuxu Zhang
Lu Yu
Jie Lu
Tao Zhou
Zi-Ke Zhang
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-39958-4_27