ABSTRACT
Wikipedia has grown to be the world's largest and busiest free encyclopedia, in which articles are collaboratively written and maintained by volunteers online. Despite its success as a means of knowledge sharing and collaboration, the public has never stopped criticizing the quality of Wikipedia articles edited by non-experts and inexperienced contributors. In this paper, we investigate the problem of assessing the quality of articles in collaborative authoring of Wikipedia. We propose three article quality measurement models that make use of the interaction data between articles and their contributors derived from the article edit history. Our Basic model is designed based on the mutual dependency between article quality and author authority. The PeerReview model introduces review behavior into measuring article quality. Finally, our ProbReview models extend PeerReview with partial reviewership of contributors as they edit various portions of the articles. We conduct experiments on a set of well-labeled Wikipedia articles to evaluate how closely our quality measurement models resemble human judgement.
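The mutual dependency behind the Basic model can be illustrated with a small sketch. The abstract does not state the update rules, so everything below is an assumption made for illustration: contributions are given as word counts per (article, author) pair, an article's quality is the word-count-weighted sum of its authors' authority, an author's authority is the word-count-weighted sum of the quality of the articles they wrote, and scores are L2-normalized after each pass in the style of HITS. The names `basic_quality`, `normalize`, and `contrib` are hypothetical.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit L2 norm so repeated updates stay bounded."""
    norm = sqrt(sum(v * v for v in scores.values()))
    return {k: (v / norm if norm else 0.0) for k, v in scores.items()}


def basic_quality(contrib, iterations=50):
    """Iterate a mutual-reinforcement update (an assumed form, not the paper's).

    contrib maps (article, author) -> number of words the author contributed.
    Returns (quality, authority) dictionaries.
    """
    articles = sorted({a for a, _ in contrib})
    authors = sorted({u for _, u in contrib})
    quality = {a: 1.0 for a in articles}
    authority = {u: 1.0 for u in authors}

    for _ in range(iterations):
        # Article quality accumulates the authority of its contributors.
        new_quality = {a: 0.0 for a in articles}
        for (a, u), words in contrib.items():
            new_quality[a] += words * authority[u]
        quality = normalize(new_quality)

        # Author authority accumulates the quality of the articles they wrote.
        new_authority = {u: 0.0 for u in authors}
        for (a, u), words in contrib.items():
            new_authority[u] += words * quality[a]
        authority = normalize(new_authority)

    return quality, authority


# Toy edit history (made-up numbers, for illustration only).
contrib = {("A", "alice"): 100, ("A", "bob"): 50, ("B", "bob"): 10}
quality, authority = basic_quality(contrib)
```

In this toy input, article A receives more words from both authors than article B does, so it ends up with the higher score. The review-based models named in the abstract would need additional input about which contributors reviewed which text, and this sketch does not attempt that.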
Index Terms
- Measuring article quality in wikipedia: models and evaluation