
2004 | Original Paper | Book Chapter

The Treatment of Missing Values and its Effect on Classifier Accuracy

Written by: Edgar Acuña, Caroline Rodriguez

Published in: Classification, Clustering, and Data Mining Applications

Publisher: Springer Berlin Heidelberg


The presence of missing values in a dataset can affect the performance of a classifier constructed using that dataset as a training sample. Several methods have been proposed to treat missing data; the most frequently used one deletes every instance that has a missing value in at least one feature. In this paper we carry out experiments with twelve datasets to evaluate the effect on the misclassification error rate of four methods for dealing with missing values: case deletion, mean imputation, median imputation, and KNN imputation. The classifiers considered were Linear Discriminant Analysis (LDA) and the KNN classifier. The former is a parametric classifier, whereas the latter is nonparametric.
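To make the four treatments concrete, here is a minimal sketch in plain Python on a toy dataset. It is not the authors' implementation: the abstract does not specify the details of their KNN imputation procedure, so the distance used here (Euclidean over features observed in both rows) and the choice to average the k nearest donors are assumptions, as are all function names and the toy data.

```python
import math
from statistics import mean, median

MISSING = None  # marker for a missing value in this sketch (an assumption)


def case_deletion(rows):
    """Drop every row that contains at least one missing value."""
    return [row for row in rows if MISSING not in row]


def column_impute(rows, stat):
    """Replace each missing value with `stat` (e.g. mean or median)
    computed over the observed values of the same column."""
    n_cols = len(rows[0])
    fill = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not MISSING]
        fill.append(stat(observed))
    return [[fill[j] if v is MISSING else v for j, v in enumerate(row)]
            for row in rows]


def _distance(a, b):
    """Euclidean distance over the features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b)
              if x is not MISSING and y is not MISSING]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared))


def knn_impute(rows, k=2):
    """Fill each missing value with the mean of that feature over the
    k nearest rows in which the feature is observed (assumed variant)."""
    result = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, v in enumerate(row):
            if v is MISSING:
                donors = [other for m, other in enumerate(rows)
                          if m != i and other[j] is not MISSING]
                donors.sort(key=lambda other: _distance(row, other))
                new_row[j] = mean(d[j] for d in donors[:k])
        result.append(new_row)
    return result


# Hypothetical toy data: two features, one missing value in the second row.
data = [
    [1.0, 2.0],
    [2.0, None],
    [3.0, 6.0],
    [10.0, 20.0],
]

deleted = case_deletion(data)
mean_filled = column_impute(data, mean)
median_filled = column_impute(data, median)
knn_filled = knn_impute(data, k=2)
```

On this toy data the outlying row pulls the column mean well above the median, while the KNN variant fills the gap from the two rows closest to the incomplete one. Case deletion simply shrinks the training sample, which is the trade-off the paper's experiments evaluate.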

Metadata
Title
The Treatment of Missing Values and its Effect on Classifier Accuracy
Written by
Edgar Acuña
Caroline Rodriguez
Copyright Year
2004
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-642-17103-1_60