Two-view feature generation model for semi-supervised learning

Authors:
Rie Kubota Ando, IBM T. J. Watson Research Center, Hawthorne, New York
Tong Zhang, Yahoo Inc., New York, New York

ICML '07: Proceedings of the 24th international conference on Machine learning, June 2007, Pages 25–32
https://doi.org/10.1145/1273496.1273500

Published: 20 June 2007

ABSTRACT

We consider a setting for discriminative semi-supervised learning where unlabeled data are used with a generative model to learn effective feature representations for discriminative training. Within this framework, we revisit the two-view feature generation model of co-training and prove that the optimum predictor can be expressed as a linear combination of a few features constructed from unlabeled data. From this analysis, we derive methods that employ two views but are very different from co-training. Experiments show that our approach is more robust than co-training and EM, under various data generation conditions.
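
The abstract does not spell out the generative model. As a minimal sketch, assume the conditional-independence condition usually associated with co-training: the two views z1 and z2 are independent given the label y. Under that assumption, P(z2 | z1) = sum over y of P(z2 | y) P(y | z1), so a quantity that can be estimated from unlabeled data alone is a linear function of the class posteriors and has only as many degrees of freedom as there are classes. The toy distributions and every name below (P_Y, P_Z1_GIVEN_Y, P_Z2_GIVEN_Y, max_factorization_gap) are hypothetical, chosen only to check this identity numerically; this is an illustration, not the authors' algorithm.

```python
from itertools import product

# Hypothetical toy model (not from the paper): 2 classes,
# view 1 takes 3 discrete values, view 2 takes 4 discrete values.
N_Y, N_Z1, N_Z2 = 2, 3, 4
P_Y = [0.4, 0.6]
P_Z1_GIVEN_Y = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
P_Z2_GIVEN_Y = [[0.5, 0.3, 0.1, 0.1], [0.1, 0.1, 0.3, 0.5]]


def joint(y, z1, z2):
    """P(y, z1, z2) when the two views are independent given y."""
    return P_Y[y] * P_Z1_GIVEN_Y[y][z1] * P_Z2_GIVEN_Y[y][z2]


def p_z1(z1):
    """Marginal P(z1)."""
    return sum(joint(y, z1, z2) for y in range(N_Y) for z2 in range(N_Z2))


def p_z2_given_z1(z2, z1):
    """P(z2 | z1): involves no labels, so unlabeled data can estimate it."""
    return sum(joint(y, z1, z2) for y in range(N_Y)) / p_z1(z1)


def p_y_given_z1(y, z1):
    """Class posterior P(y | z1), the target of discriminative training."""
    return sum(joint(y, z1, z2) for z2 in range(N_Z2)) / p_z1(z1)


def max_factorization_gap():
    """Largest |P(z2|z1) - sum_y P(z2|y) P(y|z1)| over all (z1, z2)."""
    gap = 0.0
    for z1, z2 in product(range(N_Z1), range(N_Z2)):
        lhs = p_z2_given_z1(z2, z1)
        rhs = sum(P_Z2_GIVEN_Y[y][z2] * p_y_given_z1(y, z1)
                  for y in range(N_Y))
        gap = max(gap, abs(lhs - rhs))
    return gap


if __name__ == "__main__":
    print("max gap:", max_factorization_gap())
```

In this toy setting the 3-by-4 table of P(z2 | z1) values is spanned by two class-conditional rows, which is one way to read the abstract's claim that a few features constructed from unlabeled data suffice for the optimum predictor.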

Published in
ICML '07: Proceedings of the 24th international conference on Machine learning
June 2007
1233 pages
ISBN: 9781595937933
DOI: 10.1145/1273496
Editor: Zoubin Ghahramani, University of Cambridge, United Kingdom
Copyright © 2007 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
Overall Acceptance Rate: 140 of 548 submissions, 26%