2006 | OriginalPaper | Book chapter
The Structure of Version Space
Authors: Ralf Herbrich, Thore Graepel, Robert C. Williamson
Published in: Innovations in Machine Learning
Publisher: Springer Berlin Heidelberg
We investigate the generalisation performance of consistent classifiers, i.e. classifiers that are contained in the so-called version space, from both a theoretical and an experimental angle. In contrast to classical VC analysis, where no single classifier within version space is singled out on grounds of a generalisation error bound, the data-dependent structural risk minimisation framework suggests that there exists one particular classifier that is to be preferred because it minimises the generalisation error bound. This is usually taken to provide a theoretical justification for learning algorithms such as the well-known support vector machine. A reinterpretation of a recent PAC-Bayesian result, however, reveals that, given a suitably chosen hypothesis space, there exists a large fraction of classifiers with small generalisation error, although we cannot identify them for a specific learning task. In the particular case of linear classifiers we show that classifiers found by the classical perceptron algorithm have guarantees bounded by the size of version space. These results are complemented with an empirical study for kernel classifiers on the task of handwritten digit recognition, which demonstrates that even classifiers with a small margin may exhibit excellent generalisation. In order to perform this analysis we introduce the kernel Gibbs sampler, an algorithm which can be used to sample consistent kernel classifiers.