ABSTRACT
This paper presents a new bootstrapping approach to named entity (NE) classification. The approach requires only a few common noun or pronoun seeds that correspond to the concept for the target NE type, e.g. he/she/man/woman for the PERSON NE type. The entire bootstrapping procedure is implemented as the training of two successive learners: (i) a decision list is used to learn parsing-based, high-precision NE rules; (ii) a Hidden Markov Model is then trained to learn string-sequence-based NE patterns. The second learner is trained on the corpus automatically tagged by the first learner. The resulting NE system approaches supervised NE performance for some NE types. The system also demonstrates intuitive support for tagging user-defined NE types. The differences between this approach and co-training-based NE bootstrapping are also discussed.
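The two-stage idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy parsed pairs, the LOCATION seeds, the precision threshold, and all function names are assumptions made for illustration. For brevity, the second stage uses a simple token-vote model as a stand-in for the Hidden Markov Model named in the abstract.

```python
from collections import Counter, defaultdict

# Concept-based seeds. PERSON seeds come from the abstract;
# the LOCATION seeds are assumed for illustration.
SEEDS = {"PERSON": {"he", "she", "man", "woman"},
         "LOCATION": {"city", "town"}}

# Hypothetical (parsing context, noun phrase) pairs, as a parser might emit.
PARSED = [
    ("subj_of:said", "he"), ("subj_of:said", "she"),
    ("subj_of:said", "John Smith"),
    ("subj_of:resigned", "woman"), ("subj_of:resigned", "Mary Jones"),
    ("obj_of:visited", "city"), ("obj_of:visited", "town"),
    ("obj_of:visited", "Buffalo"),
]

def learn_decision_list(pairs, seeds, min_precision=0.9):
    """Stage 1: keep contexts that co-occur almost only with one seed type."""
    seed_tag = {w: t for t, ws in seeds.items() for w in ws}
    counts = defaultdict(Counter)
    for ctx, phrase in pairs:
        tag = seed_tag.get(phrase.lower())
        if tag:
            counts[ctx][tag] += 1
    rules = []
    for ctx, c in counts.items():
        tag, n = c.most_common(1)[0]
        precision = n / sum(c.values())
        if precision >= min_precision:
            rules.append((precision, n, ctx, tag))
    # Decision list order: most precise, then most frequent, rules first.
    rules.sort(key=lambda r: (-r[0], -r[1]))
    return [(ctx, tag) for _, _, ctx, tag in rules]

def tag_with_rules(pairs, rules, seeds):
    """Apply the learned rules to non-seed phrases to auto-tag the corpus."""
    all_seeds = {w for ws in seeds.values() for w in ws}
    rule_map = dict(rules)
    return [(phrase, rule_map[ctx]) for ctx, phrase in pairs
            if phrase.lower() not in all_seeds and ctx in rule_map]

def train_token_model(tagged):
    """Stage 2 stand-in (not an HMM): count which tags each name token received."""
    model = defaultdict(Counter)
    for name, tag in tagged:
        for tok in name.split():
            model[tok][tag] += 1
    return model

def classify(name, model):
    """Tag a name from its own tokens, with no parsing context needed."""
    votes = Counter()
    for tok in name.split():
        votes.update(model.get(tok, {}))
    return votes.most_common(1)[0][0] if votes else None

rules = learn_decision_list(PARSED, SEEDS)
auto_tagged = tag_with_rules(PARSED, rules, SEEDS)
model = train_token_model(auto_tagged)
print(classify("Mary Smith", model))
```

The point of the sketch is the data flow: the second learner never sees the seeds or the parse contexts, only the names that the first learner tagged automatically.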
- A bootstrapping approach to named entity classification using successive learners