DOI: 10.1145/2396761.2396807
research-article

A graph-based approach for ontology population with named entities

Published: 29 October 2012

ABSTRACT

Automatically populating an ontology with named entities extracted from unstructured text has become a key issue for Semantic Web and knowledge management techniques. The task naturally consists of two subtasks: (1) for an entity mention whose mapping entity does not exist in the ontology, attach it to the right category in the ontology (i.e., fine-grained named entity classification), and (2) for an entity mention whose mapping entity is contained in the ontology, link it with its mapping real-world entity in the ontology (i.e., entity linking). Previous studies focus on only one of the two subtasks and cannot solve the task of populating an ontology with named entities as a whole. This paper proposes APOLLO, a grAph-based aPproach for pOpuLating ontoLOgy with named entities. APOLLO leverages the rich semantic knowledge embedded in Wikipedia to resolve this task via random walks on graphs. Moreover, APOLLO can be directly applied to either of the two subtasks with minimal revision. We have conducted a thorough experimental study to evaluate the performance of APOLLO. The experimental results show that APOLLO achieves significant accuracy improvement for the task of ontology population with named entities, and outperforms the baseline methods for both subtasks.
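
The abstract does not spell out how the random walks are constructed, so the following is only a generic illustration of the idea and not APOLLO's actual algorithm. It sketches a random walk with restart on a small weighted graph and uses the resulting visiting probabilities to rank candidate labels for one entity mention. The graph, the node names (`mention:Jordan`, `ctx:NBA`, `label:BasketballPlayer`, and so on), the edge weights, and the restart probability are all hypothetical values chosen for the example.

```python
from collections import defaultdict


def random_walk_with_restart(edges, start, restart=0.15, iterations=100):
    """Approximate visiting probabilities of a walk that restarts at `start`.

    Generic illustration only; the graph construction is an assumption,
    not the method described in the paper.
    """
    # Build an undirected, weighted adjacency map.
    neighbors = defaultdict(dict)
    for u, v, w in edges:
        neighbors[u][v] = neighbors[u].get(v, 0.0) + w
        neighbors[v][u] = neighbors[v].get(u, 0.0) + w
    if start not in neighbors:
        raise KeyError(f"start node {start!r} is not in the graph")

    nodes = list(neighbors)
    prob = {n: 0.0 for n in nodes}
    prob[start] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            total = sum(neighbors[u].values())
            for v, w in neighbors[u].items():
                # Follow an edge with probability proportional to its weight.
                nxt[v] += (1.0 - restart) * prob[u] * w / total
        # With probability `restart`, jump back to the start node.
        nxt[start] += restart
        prob = nxt
    return prob


# Hypothetical toy graph: one mention, two context nodes, two candidate labels.
edges = [
    ("mention:Jordan", "ctx:NBA", 1.0),
    ("mention:Jordan", "ctx:Bulls", 1.0),
    ("ctx:NBA", "label:BasketballPlayer", 1.0),
    ("ctx:Bulls", "label:BasketballPlayer", 1.0),
    ("ctx:NBA", "label:Country", 0.1),
]
scores = random_walk_with_restart(edges, "mention:Jordan")
candidates = ["label:BasketballPlayer", "label:Country"]
best = max(candidates, key=scores.get)
print(best)
```

In this toy setting the same scoring loop could rank either category nodes (classification) or entity nodes (linking), which is one way to picture how a single graph-based procedure might serve both subtasks.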

      Published in

      CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management
      October 2012
      2840 pages
      ISBN: 9781450311564
      DOI: 10.1145/2396761

      Copyright © 2012 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
