Published in: Neural Computing and Applications 3/2007

01.05.2007 | Original Article

Learning with partly labeled data

Written by: Abdelhamid Bouchachia

Abstract

Learning with partly labeled data aims at combining labeled and unlabeled data in order to boost the accuracy of a classifier. This paper outlines the two main classes of learning methods for dealing with partly labeled data: pre-labeling-based learning and semi-supervised learning. Concretely, we introduce and discuss three methods from each class. The first three are two-stage methods that consist of selecting the data to be labeled and then training the classifier on both the pre-labeled and the originally labeled data. The last three show how labeled and unlabeled data can be combined in a symbiotic way during training. The empirical evaluation of these methods shows that (1) pre-labeling methods tend to be better than semi-supervised learning methods, (2) both labeled and unlabeled data have a positive effect on the classification accuracy of each of the proposed methods, (3) the combination of all the methods improves the accuracy, and (4) the proposed methods compare very well with state-of-the-art methods.
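To make the two-stage pre-labeling idea from the abstract concrete, here is a minimal Python sketch. It is not one of the paper's methods: the nearest-centroid classifier, the `margin` confidence rule, and all function names are assumptions chosen only for illustration. Stage 1 labels the unlabeled points that lie clearly closer to one class centroid than to any other; stage 2 retrains on the originally labeled and the pre-labeled points together.

```python
import math


def fit_centroids(points, labels):
    """Return {label: centroid} computed from labeled points."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, p):
    """Assign p to the class with the nearest centroid."""
    return min(centroids, key=lambda y: math.dist(centroids[y], p))


def pre_label(labeled, labels, unlabeled, margin=0.5):
    """Stage 1 (illustrative rule): label only the unlabeled points whose
    nearest centroid is closer than the runner-up by at least `margin`."""
    centroids = fit_centroids(labeled, labels)
    new_points, new_labels = [], []
    for p in unlabeled:
        dists = sorted((math.dist(c, p), y) for y, c in centroids.items())
        if len(dists) == 1 or dists[1][0] - dists[0][0] >= margin:
            new_points.append(p)
            new_labels.append(dists[0][1])
    return new_points, new_labels


def train_with_pre_labeling(labeled, labels, unlabeled, margin=0.5):
    """Stage 2: retrain on originally labeled plus pre-labeled points."""
    extra_x, extra_y = pre_label(labeled, labels, unlabeled, margin)
    return fit_centroids(list(labeled) + extra_x, list(labels) + extra_y)
```

For example, with labeled points `(0.0, 0.0)` in class `"a"` and `(4.0, 0.0)` in class `"b"`, calling `train_with_pre_labeling` with unlabeled points `(1.0, 0.0)`, `(3.0, 0.0)` and `(2.0, 0.0)` pre-labels the first two and skips the ambiguous third, which is equidistant from both centroids. A semi-supervised method would instead use the unlabeled points inside the training objective itself rather than in a separate labeling pass.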

Metadata
Title
Learning with partly labeled data
Author
Abdelhamid Bouchachia
Publication date
01.05.2007
Publisher
Springer-Verlag
Published in
Neural Computing and Applications / Issue 3/2007
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-007-0091-0
