DOI: 10.3115/1075096.1075139

A bootstrapping approach to named entity classification using successive learners

Published: 7 July 2003

ABSTRACT

This paper presents a new bootstrapping approach to named entity (NE) classification. The approach requires only a few common noun/pronoun seeds that correspond to the concept for the target NE type, e.g. he/she/man/woman for the PERSON NE type. The entire bootstrapping procedure is implemented as the training of two successive learners: (i) a decision list is used to learn parsing-based, high-precision NE rules; (ii) a Hidden Markov Model is then trained to learn string sequence-based NE patterns. The second learner uses the training corpus automatically tagged by the first learner. The resulting NE system approaches supervised NE performance for some NE types. The system also demonstrates intuitive support for tagging user-defined NE types. The differences between this approach and co-training-based NE bootstrapping are also discussed.
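
The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy `MENTIONS` data, the LOCATION seeds, the thresholds, and all function names are assumptions made for illustration, and the second learner here is a simple smoothed token-count model standing in for the Hidden Markov Model.

```python
from collections import Counter, defaultdict
import math

SEEDS = {
    "PERSON": {"he", "she", "man", "woman"},   # seeds named in the abstract
    "LOCATION": {"city", "town", "country"},   # assumed seeds for a second type
}

# Toy stand-in for parser output: (mention, syntactic context) pairs (assumed format).
MENTIONS = [
    ("he", "subj_of:said"), ("she", "subj_of:said"), ("woman", "subj_of:said"),
    ("man", "subj_of:married"), ("she", "subj_of:married"),
    ("city", "obj_of:visited"), ("town", "obj_of:visited"),
    ("country", "obj_of:visited"),
    ("John Smith", "subj_of:said"), ("Mary Smith", "subj_of:married"),
    ("New York", "obj_of:visited"), ("New Delhi", "obj_of:visited"),
]


def learn_decision_list(mentions, seeds, min_precision=0.7, alpha=0.1):
    """Learner 1: rank context rules by smoothed precision on seed mentions."""
    counts = defaultdict(Counter)
    for word, context in mentions:
        for ne_type, words in seeds.items():
            if word in words:
                counts[context][ne_type] += 1
    rules = []
    for context, by_type in counts.items():
        total = sum(by_type.values())
        for ne_type, n in by_type.items():
            precision = (n + alpha) / (total + alpha * len(seeds))
            if precision >= min_precision:
                rules.append((precision, context, ne_type))
    return sorted(rules, reverse=True)  # most precise rules first


def tag_with_rules(mentions, rules, seeds):
    """Tag each non-seed mention with the first (most precise) matching rule."""
    seed_words = set().union(*seeds.values())
    tagged = []
    for word, context in mentions:
        if word in seed_words:
            continue
        for _, rule_context, ne_type in rules:
            if rule_context == context:
                tagged.append((word, ne_type))
                break
    return tagged


def train_string_model(tagged):
    """Learner 2 stand-in: per-type token counts from the auto-tagged names."""
    model = defaultdict(Counter)
    for name, ne_type in tagged:
        model[ne_type].update(name.split())
    return model


def classify(name, model, alpha=0.1):
    """Pick the type with the highest add-alpha smoothed token log-likelihood."""
    vocab = {tok for counts in model.values() for tok in counts}

    def score(ne_type):
        counts = model[ne_type]
        denom = sum(counts.values()) + alpha * (len(vocab) + 1)
        return sum(math.log((counts[tok] + alpha) / denom) for tok in name.split())

    return max(model, key=score)


rules = learn_decision_list(MENTIONS, SEEDS)
auto_tagged = tag_with_rules(MENTIONS, rules, SEEDS)
string_model = train_string_model(auto_tagged)
print(classify("Mary Jones", string_model))
print(classify("New Jersey", string_model))
```

The point of the sketch is the data flow: the first learner sees only seed words in parsed contexts, and the second learner never sees hand-labeled data, only the names that the first learner tagged.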

Published in: ACL '03: Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1, July 2003, 571 pages

Publisher: Association for Computational Linguistics, United States

Overall acceptance rate: 85 of 443 submissions, 19%
