ABSTRACT
Named Entity Recognition (NER) is a subtask of information extraction that aims to identify atomic entities in text that fall into predefined categories such as person, location, and organization. Recent efforts in NER try to extract entities and link them to linked data entities. Linked data is a term for data resources, such as DBpedia, that are created using semantic web standards. A number of online tools try to identify named entities in text and link them to linked data resources. Although these tools can be used via their APIs and web interfaces, they rely on different data resources and different techniques to identify named entities, and not all of them reveal this information. One of the major tasks in NER is disambiguation, that is, identifying the right entity among a number of entities with the same name; for example, "apple" can stand for both "Apple, Inc." the company and the fruit. We developed a similar tool called NERSO, short for Named Entity Recognition Using Semantic Open Data, to automatically extract named entities, disambiguate them, and link them to DBpedia entities. Our disambiguation method is based on constructing a graph of linked data entities and scoring them using a graph-based centrality algorithm. We evaluate our system by comparing its performance with two publicly available NER tools. The results show that NERSO performs better than both.
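The graph-based disambiguation idea can be illustrated with a minimal sketch. The abstract does not say which centrality measure NERSO uses or how its graph edges are derived, so the code below assumes plain degree centrality over an undirected graph. The function name `disambiguate`, the entity labels, and the links are hypothetical toy data, not results of DBpedia queries.

```python
from collections import defaultdict


def disambiguate(candidates, links):
    """Pick one candidate entity per mention using degree centrality.

    candidates: dict mapping a mention (surface form) to its candidate entity IDs.
    links: iterable of (entity, entity) pairs assumed to be related in the data source.
    Assumption: degree centrality stands in for the unspecified centrality algorithm.
    """
    # Record which mention each candidate entity competes for.
    owner = {e: m for m, ents in candidates.items() for e in ents}

    # Build an undirected graph, ignoring links between rival candidates
    # of the same mention so they cannot vote for each other.
    neighbours = defaultdict(set)
    for a, b in links:
        if a in owner and b in owner and owner[a] != owner[b]:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Degree centrality: the number of distinct neighbours of each candidate.
    score = {e: len(neighbours[e]) for e in owner}

    # Keep the highest-scoring candidate per mention (ties go to the first listed).
    return {m: max(ents, key=lambda e: score[e]) for m, ents in candidates.items()}


# Hypothetical toy input: "apple" is ambiguous, the other mentions are not.
candidates = {
    "apple": ["Apple_(fruit)", "Apple_Inc."],
    "iphone": ["IPhone"],
    "steve jobs": ["Steve_Jobs"],
}
links = [
    ("Apple_Inc.", "IPhone"),
    ("Apple_Inc.", "Steve_Jobs"),
    ("Steve_Jobs", "IPhone"),
]

result = disambiguate(candidates, links)
print(result["apple"])
```

In this toy input the company candidate is connected to the other entities mentioned in the text while the fruit candidate is isolated, so the company receives the higher score. A different centrality measure, such as PageRank or closeness, could be substituted in the scoring step without changing the rest of the sketch.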
Index Terms
- Named entity recognition and disambiguation using linked data and graph-based centrality scoring