Hierarchical Feature Selection for Knowledge Discovery

Application of Data Mining to the Biology of Ageing

Written by: Dr. Cen Wan

Publisher: Springer International Publishing

Book series: Advanced Information and Knowledge Processing

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

This book is the first to systematically describe the procedure of data mining and knowledge discovery on bioinformatics databases using state-of-the-art hierarchical feature selection algorithms. Its novelty is three-fold. First, it discusses hierarchical feature selection in depth, which is a novel research area in data mining and machine learning. Seven state-of-the-art hierarchical feature selection algorithms are discussed and evaluated with four types of interpretable classification algorithms (i.e. three types of Bayesian network classification algorithms and the k-nearest neighbours classification algorithm). Second, it discusses the application of those hierarchical feature selection algorithms to the well-known Gene Ontology database, whose entries (terms) are hierarchically structured. The Gene Ontology database, which unifies the representation of gene and gene product annotations, provides a resource for mining valuable knowledge about particular biological research topics, such as the Biology of Ageing. Third, it discusses the biological patterns, relevant to ageing-associated genes, that were mined by the hierarchical feature selection algorithms. Those patterns reveal potential ageing-associated factors that can inspire future directions for research on the Biology of Ageing.

Table of contents

Frontmatter

Chapter 1. Introduction

Abstract

Data mining (or machine learning) techniques have attracted considerable attention from both academia and industry, due to their significant contributions to intelligent data analysis. The importance of data mining and its applications is likely to increase even further, given that organisations keep collecting increasingly large amounts of data and more diverse types of data. Given the rapid growth of data from real-world applications, it is timely to adopt Knowledge Discovery in Databases (KDD) methods to extract knowledge or valuable information from data. Indeed, KDD has already been successfully adopted in real-world applications, both in science and in business.

Cen Wan

Chapter 2. Data Mining Tasks and Paradigms

Abstract

Data mining tasks are types of problems to be solved by a data mining or machine learning algorithm. The main types of data mining tasks can be categorised as classification, regression, clustering and association rule mining. The first two tasks (classification and regression) belong to the supervised learning paradigm, whereas clustering is categorised as unsupervised learning.

Cen Wan

Chapter 3. Feature Selection Paradigms

Abstract

Feature selection is a type of data pre-processing task that consists of removing irrelevant and redundant features in order to improve the predictive performance of classifiers. The dataset with the full set of features is input to the feature selection method, which will select a subset of features to be used for building the classifier. Then the built classifier will be evaluated, by measuring its predictive accuracy. Irrelevant features can be defined as features which are not correlated with the class variable, and so removing such features will not be harmful for the predictive performance. Redundant features can be defined as those features which are strongly correlated with other features, so that removing those redundant features should also not be harmful for the predictive performance.
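The two definitions above can be illustrated with a minimal sketch. It assumes Pearson correlation as the measure of association and uses two hypothetical thresholds (`min_relevance`, `max_redundancy`); neither the measure nor the thresholds come from the book, and the function names are illustrative only.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_features(columns, labels, min_relevance=0.3, max_redundancy=0.9):
    """Drop irrelevant features, then drop features redundant with an already kept one.

    Both thresholds are assumptions chosen for this illustration.
    """
    relevance = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    selected = []
    # Visit features from most to least relevant (ties keep insertion order).
    for name in sorted(columns, key=lambda n: -relevance[n]):
        if relevance[name] < min_relevance:
            continue  # irrelevant: barely correlated with the class variable
        if any(abs(pearson(columns[name], columns[kept])) > max_redundancy
               for kept in selected):
            continue  # redundant: strongly correlated with a kept feature
        selected.append(name)
    return selected


columns = {
    "f1": [1, 1, 0, 0, 1, 0],
    "f2": [1, 1, 0, 0, 1, 0],  # identical to f1, hence redundant
    "f3": [1, 0, 1, 0, 1, 1],  # uncorrelated with the class, hence irrelevant
}
labels = [1, 1, 0, 0, 1, 0]
selected = select_features(columns, labels)
```

Here only `f1` survives: `f3` is filtered out as irrelevant and `f2` as redundant with `f1`. The classifier would then be built on the selected subset and evaluated by its predictive accuracy.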

Cen Wan

Chapter 4. Background on Biology of Ageing and Bioinformatics

Abstract

Ageing is an ancient research topic that has attracted scientists' attention for a long time, not only for its practical implications for extending human longevity, but also due to its high complexity. With the help of modern biological science, it is possible to start to reveal the mysteries of ageing. This book focuses on research about the biology of ageing, which is the application domain for the hierarchical feature selection methods described in the next three chapters. This chapter briefly reviews basic concepts of Molecular Biology, the Biology of Ageing, and Bioinformatics.

Cen Wan

Chapter 5. Lazy Hierarchical Feature Selection

Abstract

This chapter describes three lazy hierarchical feature selection methods, namely Select Hierarchical Information-Preserving Features (HIP) (Wan and Freitas, Artificial intelligence review, [5], Wan et al., IEEE/ACM Trans Comput Biol Bioinform 12(2):262–275, [6]), Select Most Relevant Features (MR) (Wan and Freitas, Artificial intelligence review, [5], Wan et al., IEEE/ACM Trans Comput Biol Bioinform 12(2):262–275, [6]) and the hybrid Select Hierarchical Information-Preserving and Most Relevant Features (HIP–MR) (Wan and Freitas, Proceedings of IEEE international conference on bioinformatics and biomedicine (BIBM 2013), Shanghai, China, pp 373–380, [3], Wan et al., IEEE/ACM Trans Comput Biol Bioinform 12(2):262–275, [6]). All three are categorised as filter methods (discussed in Chap. 2), i.e. feature selection is conducted before the classifier is learned.
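To make the idea of lazy, per-instance selection concrete, here is a minimal sketch in the spirit of HIP. It assumes binary features arranged in an "is-a" hierarchy and assumes this rule: for the instance being classified, a feature with value 1 makes all its ancestors redundant, and a feature with value 0 makes all its descendants redundant. The rule as stated here, the data layout and the names `reachable` and `hip_select` are assumptions for illustration; the cited papers remain the authoritative description.

```python
def reachable(node, edges):
    """All nodes reachable from `node` by following `edges` (node -> list of neighbours)."""
    seen = set()
    stack = list(edges.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(edges.get(current, ()))
    return seen


def hip_select(instance, parents):
    """Select features for ONE instance (lazy selection).

    instance: dict mapping feature name -> 0 or 1
    parents:  dict mapping feature name -> list of its parent features
    """
    # Invert the parent map so that descendants can be found the same way as ancestors.
    children = {}
    for child, child_parents in parents.items():
        for parent in child_parents:
            children.setdefault(parent, []).append(child)

    removed = set()
    for feature, value in instance.items():
        if value == 1:
            removed |= reachable(feature, parents)   # ancestors are implied to be 1
        else:
            removed |= reachable(feature, children)  # descendants are implied to be 0
    return [f for f in instance if f not in removed]


# Hypothetical hierarchy: A is the root; B and D are children of A; C under B; E under C.
parents = {"B": ["A"], "C": ["B"], "D": ["A"], "E": ["C"]}
instance = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0}
selected = hip_select(instance, parents)
```

For this instance the sketch keeps `B`, `C` and `D`: `A` is dropped because `B = 1` already implies it, and `E` is dropped because `C = 0` already implies it. Since the selection depends on the feature values of the instance itself, a different instance may yield a different subset, which is what distinguishes a lazy method from an eager one.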

Cen Wan

Chapter 6. Eager Hierarchical Feature Selection

Abstract

This chapter discusses four eager hierarchical feature selection methods, i.e. Tree-based Feature Selection (TSEL) (Jeong and Myaeng, Proceedings of the international joint conference on natural language processing, Nagoya, Japan, 2013, [1]), Bottom-up Hill Climbing Feature Selection (HC) (Wang et al, Proceedings of the 26th Australasian computer science conference, Darlinghurst, Australia, 2003, [5]), Greedy Top-down Feature Selection (GTD) (Lu et al, Proceedings of the international conference on collaborative computing, Austin, USA, 2013, [2]) and Hierarchy-based Feature Selection (SHSEL) (Ristoski and Paulheim, Proceedings of the international conference on discovery science (DS 2014), 2014, [3]). All four methods are also categorised as filter methods. They aim to alleviate feature redundancy by considering both the hierarchical structure between features and the predictive power of features (e.g. information gain). Unlike the lazy hierarchical feature selection methods discussed in the previous chapter, these eager methods consider only the relevance values of features, calculated on the training dataset, together with the hierarchical information; they do not consider the actual feature values of an individual testing instance.
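Information gain, mentioned above as one measure of a feature's predictive power, can be computed from the training set alone. The following is a minimal sketch of the standard definition (class entropy minus the expected class entropy after splitting on the feature); it illustrates only the relevance measure, not any of the four methods themselves, and the function names are illustrative.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in class entropy obtained by splitting the data on a discrete feature."""
    n = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


train_labels = [1, 1, 0, 0]
perfect_feature = [1, 1, 0, 0]  # splits the classes exactly
useless_feature = [1, 0, 1, 0]  # each split still has both classes in equal share
gain_perfect = information_gain(perfect_feature, train_labels)
gain_useless = information_gain(useless_feature, train_labels)
```

An eager method would compute such relevance values once on the training data and then combine them with the hierarchy to decide which features to keep for all testing instances.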

Cen Wan

Chapter 7. Comparison of Lazy and Eager Hierarchical Feature Selection Methods and Biological Interpretation on Frequently Selected Gene Ontology Terms Relevant to the Biology of Ageing

Abstract

This chapter compares the predictive performance of the different hierarchical feature selection methods, working with different classifiers, on 28 datasets. The number of features selected by each feature selection method is also reported. Finally, the features (GO terms) selected by the best-performing hierarchical feature selection methods are interpreted to reveal potential patterns relevant to the biology of ageing.

Cen Wan

Chapter 8. Conclusions and Research Directions

Abstract

Overall, the hierarchical feature selection methods (especially the lazy learning-based ones) show the capacity to improve the predictive performance of different classifiers. Their better performance also suggests that exploiting hierarchical dependency information as a type of search constraint usually leads to a feature subset with higher predictive power. Note, however, that those hierarchical feature selection methods still have some drawbacks. For example, HIP, one of the top-performing methods, eliminates hierarchical redundancy and selects a feature subset that retains all hierarchical information, but it ignores the relevance of individual features, since it does not consider any measure of association between a feature and the class attribute. Analogously, the MR method eliminates hierarchical redundancy and selects features by considering both the hierarchical information and feature relevance, but the selected features might not retain the complete hierarchical information.

Cen Wan

Backmatter

Title: Hierarchical Feature Selection for Knowledge Discovery
Written by: Dr. Cen Wan
Publisher: Springer International Publishing
Electronic ISBN: 978-3-319-97919-9
Print ISBN: 978-3-319-97918-2
DOI: https://doi.org/10.1007/978-3-319-97919-9