Skip to main content

2001 | OriginalPaper | Buchkapitel

Iterative Double Clustering for Unsupervised and Semi-supervised Learning

verfasst von : Ran El-Yaniv, Oren Souroujon

Erschienen in: Machine Learning: ECML 2001

Verlag: Springer Berlin Heidelberg

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

This paper studies the Iterative Double Clustering (IDC) meta-clustering algorithm, a new extension of the recent Double Clustering (DC) method of Slonim and Tishby that exhibited impressive performance on text categorization tasks [1]. Using synthetically gener ated data we empirically demonstrate that whenever the DC procedure is successful in recovering some of the structure hidden in the data, the extended IDC procedure can incrementally compute a dramatically better classification, with minor additional computational resources.We demonstrate that the IDC algorithm is especially advantageous when the data exhibits high attribute noise. Our simulation results also show the effectiveness of IDC in text categorization problems. Surprisingly, this unsupervised procedure can be competitive with a (supervised) SVM trained with a small training set. Finally, we propose a natural extension of IDC for (semi-supervised) transductive learning where we are given both labeled and unlabeled examples, and present preliminary empirical results showing the plausibility of the extended method in a semi-supervised setting.

Metadaten
Titel
Iterative Double Clustering for Unsupervised and Semi-supervised Learning
verfasst von
Ran El-Yaniv
Oren Souroujon
Copyright-Jahr
2001
Verlag
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-44795-4_11