ABSTRACT
This paper describes a method for discovering Named Entities from the distribution of words in news articles. Named Entity recognition is an important task for today's natural language applications, but it still suffers from data sparseness. We rely on the observation that a Named Entity is likely to appear synchronously in several news articles, whereas a common noun is less likely to do so. Exploiting this characteristic, we obtained rare Named Entities with 90% accuracy simply by comparing the time series distributions of a word in two newspapers. Although the recall achieved is not yet sufficient, we believe that this method can be used to strengthen the lexical knowledge of a Named Entity tagger.
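The idea of comparing a word's time series across two newspapers can be illustrated with a small sketch. The abstract does not say how the two distributions are compared, so the code below assumes Pearson correlation over per-day occurrence counts as one plausible similarity measure. The function names (`daily_counts`, `pearson`, `synchronicity`), the toy articles, and the made-up word "zorvia" are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def daily_counts(articles, word, num_days):
    """Count occurrences of `word` per day.

    `articles` is a list of (day_index, tokens) pairs (an assumed input format).
    """
    counts = [0] * num_days
    for day, tokens in articles:
        counts[day] += tokens.count(word)
    return counts


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def synchronicity(paper_a, paper_b, word, num_days):
    """Similarity of a word's daily frequency series in two newspapers.

    Assumption: Pearson correlation stands in for whatever measure the paper uses.
    """
    return pearson(
        daily_counts(paper_a, word, num_days),
        daily_counts(paper_b, word, num_days),
    )


# Toy data: four days of tokenized articles from two hypothetical newspapers.
paper_a = [
    (0, ["the", "budget", "the", "vote"]),
    (1, ["zorvia", "storm", "hits", "the", "coast"]),
    (2, ["zorvia", "aftermath", "zorvia", "relief"]),
    (3, ["the", "market", "rises"]),
]
paper_b = [
    (0, ["market", "news", "the"]),
    (1, ["the", "zorvia", "storm", "the", "damage"]),
    (2, ["zorvia", "victims", "the", "zorvia"]),
    (3, ["the", "the", "election"]),
]

score_name = synchronicity(paper_a, paper_b, "zorvia", 4)
score_common = synchronicity(paper_a, paper_b, "the", 4)
print(score_name, score_common)
```

In this toy setting the name-like word rises and falls on the same days in both papers, while the common word does not, so a threshold on the score would separate them. Any real threshold, and the choice of correlation itself, would need to be tuned against the data.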
- Named entity discovery using comparable news articles