ABSTRACT
We designed and implemented TAGME, a system that efficiently and judiciously augments plain text with pertinent hyperlinks to Wikipedia pages. What distinguishes TAGME from known systems [5,8] is that it can annotate texts that are short and poorly composed, such as search-engine result snippets, tweets, and news items. This annotation is highly informative, so any task currently addressed with the bag-of-words paradigm could benefit from using it to draw upon the millions of Wikipedia pages and their inter-relations.
REFERENCES
[1] C. Carpineto, S. Osinski, G. Romano, and D. Weiss. A survey of web clustering engines. ACM Comput. Surv., 41(3):1–38, 2009.
[2] S. Cucerzan. Large-scale named entity disambiguation based on Wikipedia data. In Proc. of Empirical Methods in NLP, 2007.
[3] P. Ferragina and A. Gulli. A personalized search engine based on web-snippet hierarchical clustering. Softw. Pract. & Exper., 38(2):189–225, 2008.
[4] P. Ferragina and U. Scaiella. TAGME: On-the-fly annotation of short text fragments (by Wikipedia entities). Available at http://arxiv.org/abs/1006.3498.
[5] S. Kulkarni, A. Singh, G. Ramakrishnan, and S. Chakrabarti. Collective annotation of Wikipedia entities in web text. In Proc. ACM KDD, 457–466, 2009.
[6] R. Mihalcea and A. Csomai. Wikify!: linking documents to encyclopedic knowledge. In Proc. ACM CIKM, 233–242, 2007.
[7] D. Milne and I. H. Witten. An effective, low-cost measure of semantic relatedness obtained from Wikipedia links. In Proc. AAAI Workshop on Wikipedia and Artificial Intelligence, 2008.
[8] D. Milne and I. H. Witten. Learning to link with Wikipedia. In Proc. ACM CIKM, 509–518, 2008.