2006 | OriginalPaper | Buchkapitel
Less Biased Measurement of Feature Selection Benefits
verfasst von : Juha Reunanen
Erschienen in: Subspace, Latent Structure and Feature Selection
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
In feature selection, classification accuracy typically needs to be estimated in order to guide the search towards the useful subsets. It has earlier been shown [1] that such estimates should not be used directly to determine the optimal subset size, or the benefits due to choosing the optimal set. The reason is a phenomenon called overfitting, thanks to which these estimates tend to be biased. Previously, an outer loop of cross-validation has been suggested for fighting this problem. However, this paper points out that a straightforward implementation of such an approach still gives biased estimates for the increase in accuracy that could be obtained by selecting the best-performing subset. In addition, two methods are suggested that are able to circumvent this problem and give virtually unbiased results without adding almost any computational overhead.