Abstract
Machine learning techniques are increasingly applied in real-world scenarios where learning signals arrive as a stream of data points and models need to be adapted online according to the current information. A severe problem in such settings is that the underlying data distribution might change over time, so that concept drift or changes of the feature characteristics have to be dealt with. In addition, data are often imbalanced because training signals for rare classes are particularly sparse. In recent years, a number of learning technologies have been proposed which can reliably learn in the presence of drift, whereby non-parametric approaches such as the recent model SAM-kNN [10] can deal particularly well with heterogeneous or a priori unknown types of drift. Yet these methods share the deficiencies of the underlying vanilla kNN classifier when dealing with imbalanced classes. In this contribution, we propose intuitive extensions of SAM-kNN which incorporate successful balancing techniques for kNN, namely SMOTE-sampling [1] and kENN [9], respectively, into the online learning scenario. Furthermore, we propose a new method, Informed Downsampling, for addressing class imbalance in non-stationary settings with underlying drift, and we demonstrate its superiority in a number of benchmarks.