Published in: World Wide Web 6/2017

19 January 2017

Wikiometrics: a Wikipedia based ranking system

Authors: Gilad Katz, Lior Rokach


Abstract

We present a new concept, Wikiometrics: the derivation of metrics and indicators from Wikipedia. Wikipedia provides an accurate representation of the real world due to its size, structure, editing policy and popularity. We demonstrate an innovative “mining” methodology in which different elements of Wikipedia (content, structure, editorial actions and reader reviews) are used to rank items in a manner that is by no means inferior to rankings produced by experts or other methods. We test the proposed method by applying it to two real-world ranking problems: top world universities and academic journals. The resulting rankings were compared to leading and widely accepted benchmarks and were found to be highly correlated with them, with the added advantage that the underlying data are publicly available.
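
The comparison against benchmarks described above can be illustrated with a small sketch. The abstract does not name the correlation measure, so Spearman's rank correlation is assumed here; the function name `spearman_rho`, the item labels and all rank values are hypothetical and are not taken from the paper.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for two rankings of the same items.

    Each argument maps an item to its rank (1 = best). This closed form
    assumes there are no tied ranks.
    """
    items = list(rank_a)
    if set(items) != set(rank_b):
        raise ValueError("both rankings must cover the same items")
    n = len(items)
    if n < 2:
        raise ValueError("need at least two items")
    # Sum of squared rank differences over all items.
    d_squared = sum((rank_a[item] - rank_b[item]) ** 2 for item in items)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks: one list derived from Wikipedia signals, one benchmark.
wiki_rank = {"U1": 1, "U2": 2, "U3": 3, "U4": 4, "U5": 5}
benchmark_rank = {"U1": 2, "U2": 1, "U3": 3, "U4": 5, "U5": 4}

rho = spearman_rho(wiki_rank, benchmark_rank)
print(round(rho, 2))  # prints 0.8
```

A value near 1 means the two rankings order the items almost identically; a value near -1 means they order them in reverse. For rankings with ties, a library routine that handles tied ranks would be the safer choice.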


Metadata
Title
Wikiometrics: a Wikipedia based ranking system
Authors
Gilad Katz
Lior Rokach
Publication date
19 January 2017
Publisher
Springer US
Published in
World Wide Web / Issue 6/2017
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-016-0427-8
