Published in: World Wide Web 6/2017

19-01-2017

Wikiometrics: a Wikipedia based ranking system

Authors: Gilad Katz, Lior Rokach


Abstract

We present a new concept, Wikiometrics: the derivation of metrics and indicators from Wikipedia. Wikipedia provides an accurate representation of the real world due to its size, structure, editing policy and popularity. We demonstrate an innovative "mining" methodology, in which different elements of Wikipedia (content, structure, editorial actions and reader reviews) are used to rank items in a manner that is by no means inferior to rankings produced by experts or other methods. We test our proposed method by applying it to two real-world ranking problems: top world universities and academic journals. Our proposed ranking methods were compared to leading and widely accepted benchmarks and were found to be highly correlated with them, with the added advantage that the underlying data are publicly available.
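As a rough illustration of how a Wikipedia-derived ranking might be compared against a benchmark ranking, the sketch below computes a rank correlation between two score lists. The abstract does not state which correlation measure was used; Spearman's rank correlation is assumed here as one common choice. The function names and all scores are hypothetical and are not taken from the paper.

```python
def ranks(scores):
    """Return 1-based ranks (highest score gets rank 1); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j over the run of items tied with the item at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(scores_a, scores_b):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(ranks(scores_a), ranks(scores_b))


# Hypothetical scores for five items (illustrative only, not data from the paper).
wiki_scores = [120, 95, 80, 60, 40]       # assumed Wikipedia-derived indicator
benchmark_scores = [98, 90, 92, 70, 65]   # assumed expert benchmark score

rho = spearman(wiki_scores, benchmark_scores)
print(f"Spearman rho = {rho:.2f}")
```

A value of rho close to 1 would indicate that the two rankings order the items almost identically. Working on ranks rather than raw scores makes the comparison insensitive to the different scales of the two sources.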


Metadata
Title
Wikiometrics: a Wikipedia based ranking system
Authors
Gilad Katz
Lior Rokach
Publication date
19-01-2017
Publisher
Springer US
Published in
World Wide Web / Issue 6/2017
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-016-0427-8
