Phrase clustering for discriminative learning

Authors:
Dekang Lin
Google, Inc., Amphitheater Parkway, Mountain View, CA

Xiaoyun Wu
Google, Inc., Amphitheater Parkway, Mountain View, CA

ACL '09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2, August 2009, pages 1030–1038

Published: 02 August 2009

ABSTRACT

We present a simple and scalable algorithm for clustering tens of millions of phrases and use the resulting clusters as features in discriminative classifiers. To demonstrate the power and generality of this approach, we apply the method to two very different applications: named entity recognition and query classification. Our results show that phrase clusters offer significant improvements over word clusters. Our NER system achieves the best current result on the widely used CoNLL benchmark. Our query classifier is on par with the best system in KDDCUP 2005 without resorting to labor-intensive knowledge engineering efforts.
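
As a rough illustration of what "clusters as features" can mean in practice, the sketch below looks up every n-gram of a query in a phrase-to-cluster table and emits one feature per matching cluster ID. It is not the authors' implementation: the function name `cluster_features`, the `max_len` parameter, the feature string format, and the toy cluster table are all assumptions made for this example, and the clustering step itself is not shown.

```python
from typing import Dict, List


def cluster_features(
    tokens: List[str],
    phrase_to_cluster: Dict[str, int],
    max_len: int = 3,
) -> List[str]:
    """Return sorted, de-duplicated cluster-ID features for all n-grams
    (1..max_len tokens) of `tokens` that appear in `phrase_to_cluster`."""
    features = set()
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(tokens):
                break
            phrase = " ".join(tokens[start:end])
            cluster_id = phrase_to_cluster.get(phrase)
            if cluster_id is not None:
                # Hypothetical feature naming; any stable string works.
                features.add(f"cluster={cluster_id}")
    return sorted(features)


# Hypothetical toy table: in a real system this would come from
# clustering a very large phrase inventory.
toy_clusters = {"new york": 7, "san francisco": 7, "hotels": 12, "cheap": 3}

features = cluster_features("cheap hotels new york".split(), toy_clusters)
print(features)
```

In this toy table "new york" and "san francisco" share a cluster, so a classifier trained on queries mentioning one can generalize to the other through the shared feature. Individual words such as "new" or "york" contribute nothing here because only the full phrase has an entry.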

Published in
ACL '09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2
August 2009
595 pages
ISBN: 9781932432466
General Chair: Keh-Yih Su, Behavior Design Corp., Taiwan
Publisher: Association for Computational Linguistics, United States
Overall Acceptance Rate: 85 of 443 submissions, 19%