Unsupervised word sense disambiguation rivaling supervised methods

Author:
David Yarowsky

University of Pennsylvania, Philadelphia, PA

ACL '95: Proceedings of the 33rd annual meeting on Association for Computational Linguistics, June 1995, Pages 189–196, https://doi.org/10.3115/981658.981684

Published: 26 June 1995

ABSTRACT

This paper presents an unsupervised learning algorithm for sense disambiguation that, when trained on unannotated English text, rivals the performance of supervised techniques that require time-consuming hand annotations. The algorithm is based on two powerful constraints (that words tend to have one sense per discourse and one sense per collocation), exploited in an iterative bootstrapping procedure. Tested accuracy exceeds 96%.
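
The iterative bootstrapping idea in the abstract can be illustrated with a small Python sketch. This is a simplified illustration under stated assumptions, not the paper's exact procedure: the function name `bootstrap`, the two sense labels "A" and "B", the smoothing constant, the score threshold, and the toy data are all hypothetical. The sketch starts from a few seed collocations, scores every collocate with a smoothed log-likelihood ratio (one sense per collocation), relabels each context by its strongest collocate, and then lets the majority sense in a document override the rest of that document (one sense per discourse).

```python
import math
from collections import Counter, defaultdict

def bootstrap(contexts, seeds, threshold=1.0, smoothing=0.1, max_iters=10):
    """contexts: list of (doc_id, set_of_context_words) for one ambiguous word.
    seeds: dict mapping a seed collocate to a sense label, "A" or "B".
    Returns labels ("A", "B", or None) aligned with contexts.
    All parameter names and defaults are illustrative assumptions."""
    # Step 1: label only the contexts that contain an unambiguous seed collocate.
    labels = [None] * len(contexts)
    for i, (_, words) in enumerate(contexts):
        hits = {seeds[w] for w in words if w in seeds}
        if len(hits) == 1:
            labels[i] = hits.pop()

    for _ in range(max_iters):
        # Step 2: count collocates per sense among the currently labeled contexts.
        counts = {"A": Counter(), "B": Counter()}
        for (_, words), lab in zip(contexts, labels):
            if lab is not None:
                counts[lab].update(words)
        # Smoothed log-likelihood ratio: positive favors "A", negative favors "B".
        vocab = set(counts["A"]) | set(counts["B"])
        score = {w: math.log((counts["A"][w] + smoothing) /
                             (counts["B"][w] + smoothing)) for w in vocab}

        # Step 3: one sense per collocation. Use only the single strongest
        # collocate in each context, and abstain if it is below the threshold.
        new_labels = []
        for _, words in contexts:
            best = max((score[w] for w in words if w in score), key=abs, default=0.0)
            new_labels.append("A" if best > threshold
                              else "B" if best < -threshold else None)

        # Step 4: one sense per discourse. A strict majority sense within a
        # document is propagated to every context in that document.
        by_doc = defaultdict(Counter)
        for (doc, _), lab in zip(contexts, new_labels):
            if lab is not None:
                by_doc[doc][lab] += 1
        for i, (doc, _) in enumerate(contexts):
            votes = by_doc[doc]
            if votes and votes["A"] != votes["B"]:
                new_labels[i] = "A" if votes["A"] > votes["B"] else "B"

        # Stop once an iteration changes nothing.
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Toy data for an ambiguous word: sense "A" (living thing) vs. "B" (factory).
contexts = [
    ("d1", {"plant", "life", "animal"}),
    ("d1", {"animal", "species"}),
    ("d2", {"manufacturing", "equipment"}),
    ("d2", {"equipment", "workers"}),
    ("d3", {"species", "growth"}),
]
seeds = {"life": "A", "manufacturing": "B"}
result = bootstrap(contexts, seeds)
print(result)  # ['A', 'A', 'B', 'B', 'A']
```

In this toy run the seeds label only two contexts directly. The last context is reached on the second pass, after "species" has been learned as a collocate of sense "A" from the first document.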

Published in
ACL '95 proceedings, June 1995, 354 pages
Program Chair: Hans Uszkoreit (Saarbrücken, Germany)
Publisher: Association for Computational Linguistics, United States
Overall Acceptance Rate: 85 of 443 submissions, 19%