ABSTRACT
Automatically populating an ontology with named entities extracted from unstructured text has become a key issue for the Semantic Web and knowledge management techniques. The task naturally consists of two subtasks: (1) when the entity that a mention refers to does not exist in the ontology, attach the mention to the right category in the ontology (i.e., fine-grained named entity classification); and (2) when the entity that a mention refers to is contained in the ontology, link the mention to that real-world entity in the ontology (i.e., entity linking). Previous studies focus on only one of the two subtasks and therefore cannot solve the task of populating an ontology with named entities as a whole. This paper proposes APOLLO, a grAph-based aPproach for pOpuLating ontoLOgy with named entities. APOLLO leverages the rich semantic knowledge embedded in Wikipedia to address this task via random walks on graphs. APOLLO can also be applied directly to either of the two subtasks with minimal revision. We have conducted a thorough experimental study to evaluate the performance of APOLLO. The experimental results show that APOLLO achieves a significant accuracy improvement on the task of ontology population with named entities and outperforms the baseline methods on both subtasks.
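To illustrate the general idea of scoring candidates by random walks on a graph, the following is a minimal Python sketch of a generic random walk with restart. It is not APOLLO's actual algorithm: the abstract does not specify how the graph is built or how the walk is parameterized, so the toy graph, the node names (`m:` mention, `w:` context word, `e:` entity, `c:` category), the unit edge weights, and the restart probability are all assumptions made for illustration.

```python
def build_undirected_graph(edges):
    """Build an adjacency map {node: {neighbor: weight}} from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def random_walk_with_restart(graph, start, restart=0.15, iterations=50):
    """Approximate visit probabilities of a walk that jumps back to `start`
    with probability `restart` at every step (hypothetical parameters)."""
    scores = {node: 0.0 for node in graph}
    scores[start] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            neighbors = graph[node]
            total = sum(neighbors.values())
            if total == 0:
                # Node without outgoing weight: return all of its mass to the start.
                nxt[start] += mass
                continue
            # Spread the non-restart share over neighbors, proportional to edge weight.
            for nb, w in neighbors.items():
                nxt[nb] += (1 - restart) * mass * w / total
            nxt[start] += restart * mass
        scores = nxt
    return scores


# Toy graph (all names and weights are made up for illustration).
edges = [
    ("m:Jordan", "w:basketball", 1.0),
    ("m:Jordan", "w:NBA", 1.0),
    ("m:Jordan", "w:river", 1.0),
    ("e:Michael_Jordan", "w:basketball", 1.0),
    ("e:Michael_Jordan", "w:NBA", 1.0),
    ("e:Jordan_(country)", "w:river", 1.0),
    ("e:Michael_Jordan", "c:Athlete", 1.0),
    ("e:Jordan_(country)", "c:Country", 1.0),
]
graph = build_undirected_graph(edges)
scores = random_walk_with_restart(graph, start="m:Jordan")

# Rank candidate entities and candidate categories by their walk scores.
entities = sorted((n for n in graph if n.startswith("e:")), key=scores.get, reverse=True)
categories = sorted((n for n in graph if n.startswith("c:")), key=scores.get, reverse=True)
best_entity, best_category = entities[0], categories[0]
```

In this sketch the same scores serve both subtasks: the top-ranked `e:` node would be the linking candidate, while the top-ranked `c:` node would be the category for a mention with no matching entity. How APOLLO itself makes these decisions is not stated in the abstract.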
Index Terms
- A graph-based approach for ontology population with named entities