About this book

This book presents recent developments and research trends in the field of feature selection for data and pattern recognition, highlighting a number of the latest advances.
The field of feature selection is evolving constantly, providing numerous new algorithms, new solutions, and new applications. Some of the advances presented focus on theoretical approaches, introducing novel propositions highlighting and discussing properties of objects, and analysing the intricacies of processes and bounds on computational complexity, while others are dedicated to the specific requirements of application domains or the particularities of tasks waiting to be solved or improved.
The book is divided into four parts: nature and representation of data; ranking and exploration of features; image, shape, motion, and audio detection and recognition; and decision support systems. It is of interest to a broad range of researchers, including students, professors and practitioners.



Chapter 1. Advances in Feature Selection for Data and Pattern Recognition: An Introduction

Technological progress in an ever-evolving world is connected with the need to develop methods for extracting knowledge from available data, distinguishing relevant variables from irrelevant ones, and reducing dimensionality by selecting the most informative and important descriptors. As a result, the field of feature selection for data and pattern recognition is studied by researchers with such unceasing intensity that it is not possible to present all facets of their investigations. The aim of this chapter is to provide a brief overview of some recent advances in the domain, presented as chapters included in this monograph.
Urszula Stańczyk, Beata Zielosko, Lakhmi C. Jain

Nature and Representation of Data


Chapter 2. Attribute Selection Based on Reduction of Numerical Attributes During Discretization

Some numerical attributes may be reduced during discretization. This happens when a discretized attribute has only one interval, i.e., the entire domain of a numerical attribute is mapped into a single interval. The question is how such a reduction of data sets affects the error rate measured by the C4.5 decision tree generation system using ten-fold cross-validation. Our experiments on 15 numerical data sets show that for the Dominant Attribute discretization method the error rate is significantly larger (5% significance level, two-tailed test) for the reduced data sets. However, decision trees generated from the reduced data sets are significantly simpler than the decision trees generated from the original data sets.
Jerzy W. Grzymała-Busse, Teresa Mroczek
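
As a rough illustration of the reduction described in this abstract, the sketch below discretizes each numerical attribute with a given set of cut points and drops any attribute whose whole domain lands in a single interval. The function names, the toy table and the cut points are hypothetical, and the cut points are supplied by hand rather than computed by the Dominant Attribute method, which is not reproduced here.

```python
from bisect import bisect_right


def discretize(values, cut_points):
    """Map each numeric value to the index of the interval it falls into."""
    cuts = sorted(cut_points)
    return [bisect_right(cuts, v) for v in values]


def reduce_attributes(table, cut_points_by_attr):
    """Discretize every column and keep only those with more than one interval.

    `table` maps attribute name -> list of numeric values (hypothetical layout).
    An attribute with no usable cut points is mapped to one interval and dropped.
    """
    reduced = {}
    for name, column in table.items():
        codes = discretize(column, cut_points_by_attr.get(name, []))
        if len(set(codes)) > 1:
            reduced[name] = codes
    return reduced


# Toy data: "weight" receives no cut points, so its domain is a single interval.
table = {"height": [1.5, 1.7, 1.9], "weight": [60.0, 61.0, 62.0]}
cuts = {"height": [1.6, 1.8], "weight": []}
result = reduce_attributes(table, cuts)
```

Here only "height" survives, encoded as interval indices; "weight" is removed because all of its values share one interval.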

Chapter 3. Improving Bagging Ensembles for Class Imbalanced Data by Active Learning

Extensions of under-sampling bagging ensemble classifiers for class imbalanced data are considered. We propose a two-phase approach, called Actively Balanced Bagging, which aims to improve recognition of minority and majority classes compared with previously proposed extensions of bagging. Its key idea is to further improve an under-sampling bagging classifier (learned in the first phase) by updating, in the second phase, the bootstrap samples with a limited number of examples selected according to an active learning strategy. The results of an experimental evaluation of Actively Balanced Bagging show that this approach improves the predictions of two different baseline variants of under-sampling bagging. Further experiments demonstrate the differing influence of four active selection strategies on the final results and the role of tuning the main parameters of the ensemble.
Jerzy Błaszczyński, Jerzy Stefanowski
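
The two ingredients named in this abstract can be sketched in a few lines: a class-balanced (under-sampled) bootstrap sample, and a selection step that picks the examples on which the ensemble disagrees most. This is only a minimal sketch under stated assumptions. The vote-margin criterion is one conceivable active selection strategy and is not claimed to be any of the four strategies evaluated in the chapter; all function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def balanced_bootstrap(data, rng):
    """Draw a bootstrap sample with as many examples per class as the smallest class."""
    by_class = {}
    for x, y in data:
        by_class.setdefault(y, []).append((x, y))
    n_min = min(len(group) for group in by_class.values())
    sample = []
    for group in by_class.values():
        sample.extend(rng.choices(group, k=n_min))  # sampling with replacement
    return sample


def vote_margin(votes):
    """Gap between the two largest vote counts; a small gap means high disagreement."""
    counts = sorted(Counter(votes).values(), reverse=True)
    second = counts[1] if len(counts) > 1 else 0
    return counts[0] - second


def select_uncertain(votes_per_example, budget):
    """Return indices of the `budget` examples with the smallest vote margin (assumed strategy)."""
    order = sorted(range(len(votes_per_example)),
                   key=lambda i: vote_margin(votes_per_example[i]))
    return order[:budget]


rng = random.Random(0)
data = [((i,), "maj") for i in range(8)] + [((i,), "min") for i in range(2)]
sample = balanced_bootstrap(data, rng)
class_counts = Counter(y for _, y in sample)

# Votes of three hypothetical component classifiers on three examples.
votes = [["maj", "maj", "maj"], ["maj", "min", "maj"], ["min", "min", "min"]]
picked = select_uncertain(votes, 1)
```

In a second phase, the picked examples would be added to the bootstrap samples before the component classifiers are retrained.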

Chapter 4. Attribute-Based Decision Graphs and Their Roles in Machine Learning Related Tasks

Recently, a new supervised machine learning algorithm has been proposed that relies heavily on the construction of an attribute-based decision graph (AbDG) structure for representing, in a condensed way, the training set associated with a learning task. Such a structure has been successfully used for classification and imputation in both stationary and non-stationary environments. This chapter provides a detailed presentation of the motivations and main technicalities involved in constructing AbDGs, and stresses some of the strengths of this graph-based structure, such as robustness and the low computational costs associated with both training and memory use. Given a training set, a collection of algorithms for constructing a weighted graph (i.e., an AbDG) based on such data is presented. The chapter describes in detail the algorithms involved in creating the set of vertices and the set of edges, as well as in assigning labels to vertices and weights to edges. Ad-hoc algorithms for using AbDGs for both classification and imputation purposes are also addressed.
João Roberto Bertini Junior, Maria do Carmo Nicoletti

Chapter 5. Optimization of Decision Rules Relative to Length Based on Modified Dynamic Programming Approach

This chapter is devoted to a modification of an extension of the dynamic programming approach for optimization of decision rules relative to length. The “classical” dynamic programming approach allows one to obtain optimal rules, i.e., rules with the minimum length. This fact is important from the point of view of knowledge representation. The idea of the dynamic programming approach for optimization of decision rules is based on partitioning a decision table into subtables. The algorithm constructs a directed acyclic graph. Based on the constructed graph, sets of rules with the minimum length, attached to each row of a decision table, can be described. The proposed modification is based on the idea that only part of the graph is constructed rather than the complete graph. It allows one to obtain values of the length of decision rules close to the optimal ones, while the size of the graph is smaller than in the case of the “classical” dynamic programming approach. The chapter also contains results of experiments with decision tables from the UCI Machine Learning Repository.
Beata Zielosko, Krzysztof Żabiński
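
To make the "classical" idea in this abstract concrete, the following sketch computes the minimum rule length for each row by recursively partitioning the table into subtables and memoizing the results, so that each distinct subtable plays the role of one node of the directed acyclic graph. It is a sketch under assumptions, not the chapter's algorithm: the recurrence (length 0 when all rows of a subtable share one decision, otherwise 1 plus the best child subtable), the requirement that the table be consistent, and all names are choices made for this illustration. The chapter's modification, which builds only part of the graph, is not shown.

```python
from functools import lru_cache


def min_rule_lengths(rows, decisions):
    """Minimum length of a decision rule for every row of a consistent decision table.

    rows: list of tuples of attribute values; decisions: list of decision values.
    """
    n_attrs = len(rows[0])

    @lru_cache(maxsize=None)
    def best(subtable, r):
        # subtable is a frozenset of row indices; r is the row the rule must cover.
        if len({decisions[i] for i in subtable}) == 1:
            return 0  # all rows agree on the decision: no further condition needed
        lengths = []
        for a in range(n_attrs):
            # Only attributes that still vary in this subtable can shrink it.
            if len({rows[i][a] for i in subtable}) > 1:
                child = frozenset(i for i in subtable if rows[i][a] == rows[r][a])
                lengths.append(1 + best(child, r))
        return min(lengths)

    full = frozenset(range(len(rows)))
    return [best(full, r) for r in range(len(rows))]


# Toy table: the decision is 1 only when both attributes equal 1.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
decisions = [0, 0, 0, 1]
lengths = min_rule_lengths(rows, decisions)
```

For the toy table, a single condition suffices for the first three rows, while the last row needs both attributes.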

Ranking and Exploration of Features


Chapter 6. Generational Feature Elimination and Some Other Ranking Feature Selection Methods

Feature selection methods are effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility. However, the recent increase in the dimensionality of data poses a severe challenge to many existing feature selection methods with respect to efficiency and effectiveness. This chapter gives an overview of the reasons for using ranking feature selection methods and describes the main general classes of such algorithms. Moreover, some background on ranking methods is provided. Next, we focus on selected algorithms based on random forests and rough sets. Additionally, a newly implemented method, called Generational Feature Elimination (GFE), based on decision tree models, is introduced. This method relies on feature occurrences at given levels inside decision trees created in subsequent generations. Detailed information about its particular properties, and its performance results compared with the other presented methods, is also included. Experiments are performed on real-life data sets as well as on an artificial benchmark data set.
Wiesław Paja, Krzysztof Pancerz, Piotr Grochowalski

Chapter 7. Ranking-Based Rule Classifier Optimisation

Ranking is a strategy widely used for estimating relevance or importance of available characteristic features. Depending on the applied methodology, variables are assessed individually or as subsets, by some statistics referring to information theory, machine learning algorithms, or specialised procedures that execute systematic search through the feature space. The information about importance of attributes can be used in the pre-processing step of initial data preparation, to remove irrelevant or superfluous elements. It can also be employed in post-processing, for optimisation of already constructed classifiers. The chapter describes research on the latter approach, involving filtering inferred decision rules while exploiting ranking positions and scores of features. The optimised rule classifiers were applied in the domain of stylometric analysis of texts for the task of binary authorship attribution.
Urszula Stańczyk

Chapter 8. Attribute Selection in a Dispersed Decision-Making System

In this chapter, the use of a method for attribute selection in a dispersed decision-making system is discussed. Dispersed knowledge is understood to be the knowledge that is stored in the form of several decision tables. Different methods for solving the problem of classification based on dispersed knowledge are considered. In the first method, a static structure of the system is used. In more advanced techniques, a dynamic structure is applied. Different types of dynamic structures are analyzed: a dynamic structure with disjoint clusters, a dynamic structure with inseparable clusters and a dynamic structure with negotiations. A method for attribute selection, which is based on the rough set theory, is used in all of the methods described here. The results obtained for five data sets from the UCI Repository are compared and some conclusions are drawn.
Małgorzata Przybyła-Kasperek

Chapter 9. Feature Selection Approach for Rule-Based Knowledge Bases

The subject matter of this study is knowledge representation in rule-based knowledge bases. Two issues are discussed: feature selection as a part of mining knowledge bases from a knowledge engineer’s perspective (usually aimed at completeness analysis, consistency of the knowledge base, and detection of redundant and unusual rules), and from a domain expert’s point of view (the domain expert intends to explore the rules with regard to their optimization, improved interpretation, and improving the quality of the knowledge recorded in the rules). In this sense, exploration of rules in order to select the most important knowledge is based, to a great extent, on the analysis of similarities across the rules and their clusters. Representatives of the created clusters of rules are built by analysing the left-hand sides of these rules and then selecting the most descriptive ones. Thus we may treat this approach as a feature selection process.
Agnieszka Nowak-Brzezińska

Image, Shape, Motion, and Audio Detection and Recognition


Chapter 10. Feature Selection with a Genetic Algorithm for Classification of Brain Imaging Data

Recent advances in brain imaging technology, coupled with large-scale brain research projects, such as the BRAIN initiative in the U.S. and the European Human Brain Project, allow us to capture brain activity in unprecedented detail. In principle, the observed data is expected to substantially shape our knowledge about brain activity, which includes the development of new biomarkers of brain disorders. However, due to the high dimensionality, the analysis of the data is challenging, and selection of relevant features is one of the most important analytic tasks. In many cases, due to the complexity of the search space, evolutionary algorithms are appropriate for solving this task. In this chapter, we consider the feature selection task from the point of view of classification tasks related to functional magnetic resonance imaging (fMRI) data. Furthermore, we present an empirical comparison of conventional LASSO-based feature selection and a novel feature selection approach designed for fMRI data based on a simple genetic algorithm.
Annamária Szenkovits, Regina Meszlényi, Krisztian Buza, Noémi Gaskó, Rodica Ioana Lung, Mihai Suciu

Chapter 11. Shape Descriptions and Classes of Shapes. A Proximal Physical Geometry Approach

This chapter introduces the notion of classes of shapes that have descriptive proximity to each other in planar digital 2D image object shape detection. A finite planar shape is a planar region with a boundary (shape contour) and a nonempty interior (shape surface). The focus here is on the triangulation of image object shapes, resulting in maximal nerve complexes (MNCs) from which shape contours and shape interiors can be detected and described. An MNC is a collection of filled triangles (called 2-simplexes) that have a vertex in common. Interesting MNCs are those collections of 2-simplexes that have a shape vertex in common. The basic approach is to decompose a planar region containing an image object shape into 2-simplexes in such a way that the filled triangles cover either part or all of a shape. After that, an unknown shape can be compared with a known shape by comparing the measurable areas of a collection of 2-simplexes covering both known and unknown shapes. Each known shape with a known triangulation belongs to a class of shapes that is used to classify unknown triangulated shapes. Unlike the conventional Delaunay triangulation of spatial regions, the proposed triangulation results in simplexes that are filled triangles, derived by the intersection of half spaces, where the edge of each half space contains a line segment connected between vertices called sites (generating points). A straightforward result of this approach to image geometry is a rich source of simple descriptions of plane shapes of image objects based on the detection of nerve complexes that are maximal nerve complexes (MNCs). The end result of this work is a proximal physical geometric approach to detecting and classifying image object shapes.
James Francis Peters, Sheela Ramanna
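
The notions of a collection of filled triangles sharing a vertex and of comparing measurable areas can be illustrated with a small sketch. It assumes a triangulation is already given as a list of triangles, and it takes "maximal" to mean the shared vertex with the largest number of incident triangles; that reading, the function names and the toy triangulation are assumptions of this illustration, and the chapter's own triangulation procedure is not reproduced.

```python
from collections import defaultdict


def triangle_area(p, q, r):
    """Area of the filled triangle pqr via the cross-product (shoelace) formula."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0


def maximal_nerve(triangles):
    """Return the vertex shared by the most triangles and the triangles that contain it."""
    by_vertex = defaultdict(list)
    for tri in triangles:
        for v in tri:
            by_vertex[v].append(tri)
    nucleus = max(by_vertex, key=lambda v: len(by_vertex[v]))
    return nucleus, by_vertex[nucleus]


# Hypothetical triangulation: three triangles meet at the origin, one lies apart.
triangles = [
    ((0, 0), (1, 0), (0, 1)),
    ((0, 0), (0, 1), (-1, 0)),
    ((0, 0), (-1, 0), (0, -1)),
    ((1, 0), (2, 0), (1, 1)),
]
nucleus, nerve = maximal_nerve(triangles)
total_area = sum(triangle_area(*t) for t in nerve)
```

The summed area of such a collection is one simple measurable quantity that could be compared between a known and an unknown shape.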

Chapter 12. Comparison of Classification Methods for EEG Signals of Real and Imaginary Motion

The classification of EEG signals provides an important element of brain-computer interface (BCI) applications, underlying an efficient interaction between a human and a computer application. BCI applications can be especially useful for people with disabilities. Numerous experiments aim at recognising the intent to move the left or right hand, which is useful for locked-in-state or paralyzed subjects in controlling computer applications. The chapter presents an experimental study of several methods for real motion and motion intent classification (rest/upper/lower limbs motion, and rest/left/right hand motion). First, our approach to EEG recording segmentation and feature extraction is presented. Then, 5 classifiers (Naïve Bayes, Decision Trees, Random Forest, Nearest-Neighbors NNge, Rough Set classifier) are trained and tested using examples from an open database. Feature subsets are selected for consecutive classification experiments, reducing the number of required EEG electrodes. A comparison of the methods and the obtained results are presented, and a study of the features feeding the classifiers is provided. Differences among participating subjects and accuracies for real and imaginary motion are discussed. It is shown that although classification accuracy varies from person to person, it can exceed 80% for some classifiers.
Piotr Szczuko, Michał Lech, Andrzej Czyżewski

Chapter 13. Application of Tolerance Near Sets to Audio Signal Classification

This chapter extends our earlier work, in which the problem of classifying audio signals using a supervised tolerance class learning algorithm (TCL) based on tolerance near sets was first proposed. In the tolerance near set (TNS) method, tolerance classes are directly induced from the data set using a tolerance level and a distance function. The TNS method lends itself to applications where features are real-valued, such as image data and audio and video signal data. Extensive experimentation with different audio-video data sets was performed to provide insights into the strengths and weaknesses of the TCL algorithm compared to granular (fuzzy and rough) and classical machine learning algorithms.
Ashmeet Singh, Sheela Ramanna
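
The role of the tolerance level and the distance function can be shown with a simplified sketch. It computes, for each feature vector, its tolerance neighbourhood: all vectors within distance eps of it. This is a deliberate simplification, since tolerance classes in the strict sense are maximal sets of pairwise-close objects and the TCL algorithm itself is not reproduced; the Euclidean distance, the name `tolerance_neighbourhoods` and the toy data are assumptions of this example.

```python
import math


def tolerance_neighbourhoods(points, eps):
    """For each point, list the indices of all points within Euclidean distance eps."""
    return [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]


# One-dimensional toy features: 0 is close to 1 and 1 is close to 2,
# but 0 is not close to 2, so the tolerance relation is not transitive.
points = [(0.0,), (1.0,), (2.0,)]
neighbourhoods = tolerance_neighbourhoods(points, eps=1.0)
```

The overlapping neighbourhoods show why a tolerance relation generally yields a covering of the data rather than a partition.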

Decision Support Systems


Chapter 14. Visual Analysis of Relevant Features in Customer Loyalty Improvement Recommendation

This chapter describes a practical application of decision reducts to a real-life business problem. It presents a feature selection (attribute reduction) methodology based on decision reduct theory, supported by a purpose-built visualization system. The chapter gives an overview of the application area, Customer Loyalty Improvement Recommendation, which has become a very popular and important topic in today’s business decision problems. The chapter describes a real-world dataset, which consists of about 400,000 surveys on customer satisfaction collected in the years 2011–2016. The major machine learning techniques used to develop a knowledge-based recommender system, such as decision reducts, classification, clustering, and action rules, are described. Next, the visualization techniques used for the implemented interactive system are presented. The experimental results on the customer dataset illustrate the correlation between classification features and the decision feature called “Promoter Score” and how these help to understand changes in customer sentiment.
Katarzyna A. Tarnowska, Zbigniew W. Raś, Lynn Daniel, Doug Fowler

Chapter 15. Evolutionary and Aggressive Sampling for Pattern Revelation and Precognition in Building Energy Managing System with Nature-Based Methods for Energy Optimization

This chapter discusses an alternative approach to managing the grids found in intelligent buildings, such as central heating, heat recovery ventilation or air conditioning, with the goal of minimizing energy cost. It includes a review and explanation of the existing methodology and smart management system. A suggested matrix-like grid that includes methods for achieving the expected minimization goals is also presented. Common techniques are limited to central management using fuzzy-logic drivers, whereas the redefined model discussed here is used to achieve the best possible solution with a surplus of extra energy. Ordinary grids do not permit significant development in their present state. A modified structure enhanced with a matrix-like grid is one way to eliminate the basic faults of the ordinary grid model, but such an intricate grid can result in sub-optimal resource usage and excessive costs. Finding the expected solution is a challenge for different Ant Colony Optimization (ACO) techniques, with an evolutionary or aggressive approach taken into consideration. Different opportunities create many latent patterns to recover, evaluate and rate. A growing building structure can pass a point of complexity beyond which an optimal grid pattern can no longer be created in real time using conventional methods. It is extremely important to formulate more aggressive ways to find an approximation of the optimal pattern within an acceptable time frame.
Jarosław Utracki, Mariusz Boryczka

