ABSTRACT
Wikipedia's rich category structure has helped make it one of the largest semantic taxonomies in existence, a property that has been central to much recent research. However, Wikipedia's category representation is simplistic: an article contains a single list of categories, with no data about their relative importance. We investigate the ordering of category lists to determine how a category's position in the list correlates with its relevance to the article and overall significance. We identify a number of interesting connections between a category's position and its persistence within the article, age, popularity, size, and descriptiveness.
Index Terms
- Examining the "leftness" property of Wikipedia categories