Using Knowledge Base Semantics in Context-Aware Entity Linking

Authors:
Cheikh Brahim El Vaigh (Inria, IRISA, Rennes, France)
François Goasdoué (Univ. Rennes 1, IRISA, Lannion, France)
Guillaume Gravier (CNRS, IRISA, Rennes, France)
Pascale Sébillot (INSA Rennes, IRISA, Rennes, France)

DocEng '19: Proceedings of the ACM Symposium on Document Engineering 2019, September 2019, Article No. 8, Pages 1–10. https://doi.org/10.1145/3342558.3345393

Published: 23 September 2019

ABSTRACT

Entity linking is a core task in textual document processing: it consists of identifying the entities of a knowledge base (KB) that are mentioned in a text. Approaches in the literature either link each mention independently or link all mentions collectively. Regardless of this distinction, most approaches rely on the Wikipedia encyclopedic KB to improve linking quality, by exploiting its entity descriptions (web pages) or its entity interconnections (the hyperlink graph of web pages). In this paper, we devise a novel collective linking technique that departs from most approaches in the literature by relying on a structured RDF KB. This makes it possible to exploit, at disambiguation time, the semantics of the interrelationships that candidate entities may have, rather than relying on a raw structural approximation based on Wikipedia's hyperlink graph. The few approaches that also use an RDF KB simply rely on the existence of a relation between the candidate entities to which mentions may be linked. Instead, we weight such relations based on the RDF KB structure and propose an efficient decoding strategy for collective linking. Experiments on standard benchmarks show significant improvement over the state of the art.
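
To make the idea of weighting relations between candidate entities more concrete, here is a minimal Python sketch. It is not the measure or the decoding strategy of the paper, neither of which is given in the abstract. The toy triples, the inverse-frequency predicate weight, the local scores and the function names are all assumptions chosen for illustration, and the exhaustive search stands in for the efficient decoding that the paper proposes.

```python
from collections import Counter
from itertools import combinations, product

# Hypothetical toy KB as (subject, predicate, object) triples.
TRIPLES = [
    ("Paris", "capitalOf", "France"),
    ("Paris", "locatedIn", "France"),
    ("Lyon", "locatedIn", "France"),
    ("Paris_Texas", "locatedIn", "USA"),
    ("Paris_Hilton", "bornIn", "USA"),
]

def predicate_weights(triples):
    """Assumed weighting: rarer predicates count more (1 / frequency)."""
    counts = Counter(p for _, p, _ in triples)
    return {p: 1.0 / n for p, n in counts.items()}

def relatedness(e1, e2, triples, weights):
    """Sum the weights of all triples that directly connect e1 and e2."""
    return sum(weights[p] for s, p, o in triples if {s, o} == {e1, e2})

def link_collectively(candidates, triples):
    """Pick one entity per mention, maximising local score plus pairwise
    relatedness. Brute force: only viable for a handful of mentions."""
    weights = predicate_weights(triples)
    mentions = list(candidates)
    best, best_score = None, float("-inf")
    for choice in product(*(candidates[m] for m in mentions)):
        entities = [entity for entity, _ in choice]
        local = sum(score for _, score in choice)
        coherence = sum(
            relatedness(a, b, triples, weights)
            for a, b in combinations(entities, 2)
        )
        if local + coherence > best_score:
            best, best_score = dict(zip(mentions, entities)), local + coherence
    return best

# Hypothetical candidates: mention -> [(entity, local score)].
CANDIDATES = {
    "Paris": [("Paris_Hilton", 0.5), ("Paris", 0.4), ("Paris_Texas", 0.1)],
    "France": [("France", 1.0)],
}

if __name__ == "__main__":
    print(link_collectively(CANDIDATES, TRIPLES))
```

In this toy setting the mention "Paris" has a higher local score for Paris_Hilton, yet the collective choice is the city, because two weighted relations connect it to France. A mere relation-exists test would also favour the city here; the weights matter when several candidates are all related to the context, but through predicates of different specificity.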

Index Terms

Using Knowledge Base Semantics in Context-Aware Entity Linking
1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
    2. Natural language processing
2. Information systems
  1. World Wide Web
    1. Web data description languages
      1. Semantic web description languages
        Resource Description Framework (RDF)

Published in
DocEng '19: Proceedings of the ACM Symposium on Document Engineering 2019
September 2019
254 pages
ISBN:9781450368872
DOI:10.1145/3342558
General Chairs:
Uwe Borghoff,
Sonja Schimmler
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Entity linking
RDF
collective entity linking
entity relatedness measure
knowledge base semantics
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
DocEng '19 paper acceptance rate: 30 of 77 submissions, 39%. Overall acceptance rate: 178 of 537 submissions, 33%.