2015 | OriginalPaper | Buchkapitel
Tackling Class Imbalance Problem in Binary Classification using Augmented Neighborhood Cleaning Algorithm
verfasst von : Nadyah Obaid Al Abdouli, Zeyar Aung, Wei Lee Woon, Davor Svetinovic
Erschienen in: Information Science and Applications
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
Many natural processes generate some observations more frequently than others. These processes result in an imbalanced distributions which cause classifiers to bias toward the majority class because most classifiers assume a normal distribution. In order to address the problem of class imbalance, a number of data preprocessing techniques, which can be generally categorized into over-sampling and under-sampling methods, have been proposed throughout the years. The Neighborhood cleaning rule (NCL) method proposed by Laurikkala is among the most popular under-sampling methods. In this paper, we augment the original NCL algorithm by cleaning the unwanted samples using CHC evolutionary algorithm instead of a simple nearest neighbor-based cleaning as in NCL. We name our augmented algorithm as NCL+. The performance of NCL+ is compared to that of NCL on 9 imbalanced datasets using 11 different classifiers. Experimental results show noticeable accuracy improvements by NCL+ over NCL. Moreover, NCL+ is also compared to another popular over-sampling method called Synthetic minority over-sampling technique (SMOTE), and is found to offer better results as well.