Feature selection is usually motivated by lower computational complexity, economy, and better problem understanding, but in many cases it can also improve classification accuracy. In this paper we investigate the relationship between the optimal number of features and the training set size. We present a new and simple analysis of the well-studied two-Gaussian setting. We explicitly find the optimal number of features as a function of the training set size for a few special cases and show that accuracy declines dramatically when too many features are added. We then show empirically that the Support Vector Machine (SVM), which was designed to work in the presence of a large number of features, produces the same qualitative result for these examples. This suggests that good feature selection is still an important component of accurate classification.