2002 | OriginalPaper | Book chapter

Iterative Data Squashing for Boosting Based on a Distribution-Sensitive Distance

Written by: Yuta Choki, Einoshin Suzuki

Published in: Principles of Data Mining and Knowledge Discovery

Publisher: Springer Berlin Heidelberg

This paper proposes a novel method for boosting that prevents the loss of accuracy inherent in data squashing methods. Boosting, which constructs a highly accurate classification model by combining multiple classification models, requires long computation time. Data squashing, which speeds up a learning method by abstracting the training data set into a smaller data set, typically lowers accuracy. Our SB (Squashing-Boosting) loop, based on a distribution-sensitive distance, alternates data squashing and boosting and iteratively refines an SF (Squashed-Feature) tree, which provides an appropriately squashed data set. Experimental evaluation with artificial data sets and the KDD Cup 1999 data set clearly shows that our method is superior to conventional methods. We have also empirically evaluated our distance measure and our SF tree, and found them superior to the alternatives.
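
To make the general idea of boosting on squashed data concrete, here is a minimal sketch in Python. It is not the method of the paper: the grid-based squash function is a simple stand-in for the SF tree and the distribution-sensitive distance, and the code runs a single squash-then-boost pass rather than the iterative SB loop. All function names, the cell parameter, and the use of decision stumps as weak learners are assumptions made for illustration. Labels are assumed to be -1 or +1.

```python
import math
from collections import defaultdict

def squash(points, labels, cell):
    """Stand-in squashing (assumption, not the paper's SF tree): replace all
    points that share a grid cell and a label by their mean and a count."""
    buckets = defaultdict(list)
    for x, y in zip(points, labels):
        key = (tuple(math.floor(v / cell) for v in x), y)
        buckets[key].append(x)
    squashed = []
    for (_, y), members in buckets.items():
        n = len(members)
        centroid = tuple(sum(col) / n for col in zip(*members))
        squashed.append((centroid, y, n))
    return squashed  # list of (centroid, label, count)

def stump_predict(x, feature, threshold, sign):
    """Decision stump: predicts `sign` on one side of the threshold, -sign on the other."""
    return sign if x[feature] <= threshold else -sign

def best_stump(data, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for f in range(len(data[0][0])):
        for thr in sorted({x[f] for x, _, _ in data}):
            for sign in (1, -1):
                err = sum(w for (x, y, _), w in zip(data, weights)
                          if stump_predict(x, f, thr, sign) != y)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best

def adaboost(data, rounds):
    """AdaBoost on squashed examples; each example starts with a weight
    proportional to the number of original points it represents."""
    total = sum(n for _, _, n in data)
    weights = [n / total for _, _, n in data]
    model = []
    for _ in range(rounds):
        err, f, thr, sign = best_stump(data, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, sign))
        # Increase the weight of misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(x, f, thr, sign))
                   for (x, y, _), w in zip(data, weights)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return model

def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, f, t, s) for a, f, t, s in model)
    return 1 if score >= 0 else -1

points = [(0.1,), (0.2,), (0.3,), (2.1,), (2.2,), (2.4,)]
labels = [-1, -1, -1, 1, 1, 1]
squashed = squash(points, labels, cell=1.0)
model = adaboost(squashed, rounds=3)
```

In this toy run the six training points collapse into two weighted examples, so each boosting round scans two examples instead of six. A coarser grid would speed up boosting further but blur the class boundary, which is the accuracy loss the abstract says the SB loop is meant to counter.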

Metadata
Title
Iterative Data Squashing for Boosting Based on a Distribution-Sensitive Distance
Written by
Yuta Choki
Einoshin Suzuki
Copyright year
2002
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/3-540-45681-3_8
