2013 | OriginalPaper | Chapter
Topic-Oriented Words as Features for Named Entity Recognition
Authors : Ziqi Zhang, Trevor Cohn, Fabio Ciravegna
Published in: Computational Linguistics and Intelligent Text Processing
Publisher: Springer Berlin Heidelberg
Activate our intelligent search to find suitable subject content or patents.
Select sections of text to find matching patents with Artificial Intelligence. powered by
Select sections of text to find additional relevant content using AI-assisted search. powered by
Research has shown that topic-oriented words are often related to named entities and can be used for Named Entity Recognition. Many have proposed to measure topicality of words in terms of ‘informativeness’ based on global distributional characteristics of words in a corpus. However, this study shows that there can be large discrepancy between informativeness and topicality; empirically, informativeness based features can damage learning accuracy of NER. This paper proposes to measure words’ topicality based on local distributional features specific to individual documents, and proposes methods to transform topicality into gazetteer-like features for NER by binning. Evaluated using five datasets from three domains, the methods have shown consistent improvement over a baseline by between 0.9 and 4.0 in F-measure, and always outperformed methods that use informativeness measures.