ABSTRACT
We describe a novel, simple, and highly scalable semi-supervised method called Word-Class Distribution Learning (WCDL), and apply it to the task of information extraction (IE) by utilizing unlabeled sentences to improve supervised classification methods. WCDL iteratively builds a class label distribution for each word in the dictionary by averaging predicted labels over all occurrences in the unlabeled corpus, and re-trains a base classifier with these distributions added as word features. In contrast, traditional self-training and co-training methods add self-labeled examples (rather than features), which can degrade performance due to an incestuous learning bias. WCDL exhibits robust behavior and has no difficult parameters to tune. We applied our method to German and English named entity recognition (NER) tasks. WCDL shows improvements over self-training, multi-task semi-supervision, or supervision alone, in particular yielding a state-of-the-art 75.72 F1 score on the German NER task.
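The WCDL loop described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the authors' implementation): the base classifier is stood in for by a toy per-word majority-label lookup, and all function names (`train_base`, `wcdl`) are invented for this sketch. The essential structure is what the abstract describes: in each iteration, predict labels over every word occurrence in the unlabeled corpus, average those predictions into per-word class distributions, and re-train the base classifier with the distributions available as features.

```python
from collections import Counter, defaultdict

def train_base(labeled, word_dists, classes):
    """Toy stand-in for the base classifier (hypothetical).

    Memorizes the majority label per word from the labeled data and
    falls back to the word-class distribution feature for unseen words.
    A real system would use a feature-based sequence classifier.
    """
    lookup = defaultdict(Counter)
    for word, label in labeled:
        lookup[word][label] += 1

    def predict(word):
        if word in lookup:
            return lookup[word].most_common(1)[0][0]
        if word in word_dists:
            # Use the distribution built from the unlabeled corpus.
            return max(word_dists[word], key=word_dists[word].get)
        return classes[0]  # default class, e.g. "O"
    return predict

def wcdl(labeled, unlabeled_sentences, classes, iterations=3):
    """Iteratively build word-class distributions and re-train."""
    word_dists = {}
    for _ in range(iterations):
        predict = train_base(labeled, word_dists, classes)
        # Average predicted labels over all occurrences in the
        # unlabeled corpus to form a distribution per word.
        counts = defaultdict(Counter)
        for sentence in unlabeled_sentences:
            for word in sentence:
                counts[word][predict(word)] += 1
        word_dists = {
            w: {c: n / sum(cnt.values()) for c, n in cnt.items()}
            for w, cnt in counts.items()
        }
    return train_base(labeled, word_dists, classes), word_dists
```

Note that, unlike self-training, no self-labeled *examples* are ever added to the training set; only the aggregated label distributions re-enter as features, which is what limits the feedback of the classifier's own errors.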