Applied Soft Computing

Volume 13, Issue 1, January 2013, Pages 654-666
Meta-cognitive RBF Network and its Projection Based Learning algorithm for classification problems

https://doi.org/10.1016/j.asoc.2012.08.047

Abstract

A ‘Meta-cognitive Radial Basis Function Network’ (McRBFN) and its ‘Projection Based Learning’ (PBL) algorithm for classification problems in a sequential framework are proposed in this paper, referred to together as PBL-McRBFN. McRBFN is inspired by human meta-cognitive learning principles and has two components, namely the cognitive component and the meta-cognitive component. The cognitive component is a single hidden layer radial basis function network with an evolving architecture. In the cognitive component, the PBL algorithm computes the optimal output weights with the least computational effort by finding the analytical minima of the nonlinear energy function. The meta-cognitive component controls the learning process in the cognitive component by choosing the best learning strategy for the current sample and adapts the learning strategies by implementing self-regulation. In addition, sample overlapping conditions are considered for proper initialization of new hidden neurons, thus minimizing misclassification. The interaction of the cognitive and meta-cognitive components addresses the what-to-learn, when-to-learn and how-to-learn principles of human learning efficiently. The performance of PBL-McRBFN is evaluated using a set of benchmark classification problems from the UCI machine learning repository and two practical problems, viz., acoustic emission signal classification and mammogram classification for cancer detection. The statistical performance evaluation on these problems demonstrates the superior performance of the PBL-McRBFN classifier over results reported in the literature.

Highlights

► Meta-cognitive learning emulates human learning components such as what-to-learn, when-to-learn and how-to-learn from a sequence of training data.
► Sample learning, sample deletion and sample reserve strategies are proposed. The meta-cognitive component in PBL-McRBFN chooses the strategy based on the information present in the current sample and the existing knowledge in the RBF network.
► PBL-McRBFN evolves the network architecture automatically, and the strategies are adapted to accommodate coarse knowledge first, followed by fine tuning.
► The sequential learning algorithm uses the computationally less intensive projection based learning algorithm.
► The performance of the proposed algorithm is compared with well-known fast learning neural networks reported in the literature using UCI data sets.

Introduction

Neural networks are powerful tools that can approximate complex nonlinear input–output relationships efficiently. Hence, over the last few decades neural networks have been extensively employed to solve real world classification problems [1]. In a classification problem, the objective is to learn the decision surface that accurately maps an input feature space to an output space of class labels. Several learning algorithms for different neural network architectures have been used in various problems in science, business, industry and medicine, including handwritten character recognition [2], speech recognition [3], biomedical diagnosis [4], prediction of bankruptcy [5], text categorization [6] and information retrieval [7]. Among the various architectures reported in the literature, the Radial Basis Function (RBF) network has gained attention due to the localization property of its Gaussian functions, and is widely used in classification problems. Significant contributions to RBF learning algorithms for classification problems are broadly classified into two categories: (a) Batch learning algorithms: gradient descent based learning is used to determine the network parameters [8]. Here, the complete training data are presented multiple times, until the training error reaches a minimum. Alternatively, one can implement random input parameter selection with a least squares solution for the output weights [9], [10]. In both cases, the number of Gaussian functions required to approximate the true function is determined heuristically. (b) Sequential learning algorithms: the number of Gaussian neurons required to approximate the input–output relationship is determined automatically [11], [12], [13], [14], [15]. Here, the training samples are presented one-by-one and discarded after learning. The Resource Allocation Network (RAN) [11] was the first sequential learning algorithm introduced in the literature.
RAN evolves the network architecture required to approximate the true function using a novelty based neuron growth criterion. The Minimal Resource Allocation Network (MRAN) [12] uses a similar approach, but incorporates an error based neuron growing/pruning criterion. Hence, MRAN determines a more compact network architecture than the RAN algorithm. The Growing and Pruning Radial Basis Function network [13] grows and prunes neurons based on their significance. A sequential learning algorithm using recursive least squares was presented in [14], referred to as the On-line Sequential Extreme Learning Machine (OS-ELM). OS-ELM chooses input weights randomly with a fixed number of hidden neurons and analytically determines the output weights using the minimum norm least-squares solution. In the case of sparse and imbalanced data sets, the random selection of input weights with a fixed number of hidden neurons in OS-ELM affects the performance significantly, as shown in [16]. In the neuro-fuzzy framework, the Evolving Fuzzy Neural Network (EFuNN) [17] is a notable sequential learning algorithm. It has been shown in [15] that the aforementioned algorithms work better for function approximation problems than for classification problems. The Sequential Multi-Category Radial Basis Function network (SMC-RBF) [15] uses the within-class similarity measure, the misclassification rate and the prediction error in its neuron growing and parameter update criteria. It has also been shown that updating the parameters of the nearest neuron in the same class as the current sample improves performance more than updating the nearest neuron in any class.
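As context for how such sequential learners operate, a minimal Python sketch of a RAN-style growth step is given below. This is a simplification of the idea in [11], not the authors' code; the function name `ran_step`, the thresholds `e_min` and `eps_min`, and the overlap factor `kappa` are illustrative. A new Gaussian neuron is allocated only when the current sample is both poorly predicted and far from every existing center; otherwise the sample would be used for a parameter update (omitted here).

```python
import numpy as np

def ran_step(x, y, centers, weights, widths, e_min, eps_min, kappa=0.5):
    """One RAN-style sequential step (hypothetical minimal sketch).

    Grows a new Gaussian neuron when the sample is novel (large prediction
    error AND far from the nearest existing center); otherwise signals that
    a parameter update should be performed instead.
    """
    if len(centers) == 0:
        # First sample: allocate the first neuron directly.
        centers.append(x.copy())
        weights.append(y)
        widths.append(1.0)
        return "grow"
    dists = [np.linalg.norm(x - c) for c in centers]
    phi = [np.exp(-(d / w) ** 2) for d, w in zip(dists, widths)]
    y_hat = sum(wt * p for wt, p in zip(weights, phi))
    error = abs(y - y_hat)
    d_near = min(dists)
    if error > e_min and d_near > eps_min:
        centers.append(x.copy())
        weights.append(y - y_hat)      # new neuron absorbs the residual error
        widths.append(kappa * d_near)  # width proportional to nearest-center distance
        return "grow"
    return "update"
```

Each training sample is seen once and then discarded, which is what distinguishes this family of algorithms from batch learning.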

The aforementioned neural network algorithms use all the samples in the training data set to gain knowledge about the information contained in the samples. In other words, they possess information-processing abilities of humans, including perception, learning, remembering, judging, and problem-solving, and these abilities are cognitive in nature. However, recent studies on human learning have revealed that the learning process is effective when learners adopt self-regulation in the learning process using meta-cognition [18], [19]. Meta-cognition means cognition about cognition. In a meta-cognitive framework, human-beings think about their cognitive processes, develop new strategies to improve their cognitive skills and evaluate the information contained in their memory. If a radial basis function network analyzes its cognitive process and chooses suitable learning strategies adaptively to improve that process, then it is referred to as a ‘Meta-cognitive Radial Basis Function Network’ (McRBFN). Such a McRBFN must be capable of deciding what-to-learn, when-to-learn and how-to-learn the decision function from the stream of training data by emulating human self-regulated learning.

The Self-adaptive Resource Allocation Network (SRAN) [20] and the Complex-valued Self-regulating Resource Allocation Network (CSRAN) [21] address the what-to-learn component of meta-cognition by selecting significant samples using the misclassification error and the hinge loss error. It has been shown that selecting appropriate samples for learning and removing repetitive samples helps in improving the generalization performance. Therefore, it is evident that emulating the three components of human learning with suitable learning strategies would improve the generalization ability of a neural network. The drawbacks of these algorithms are: (a) the samples for training are selected based on a simple error criterion, which is not sufficient to capture the significance of a sample; (b) each new hidden neuron center is allocated independently and may overlap with existing neuron centers, leading to misclassification; (c) knowledge gained from past samples is not used; and (d) they use the computationally intensive extended Kalman filter for parameter updates. The Meta-cognitive Neural Network (McNN) [22] and the Meta-cognitive Neuro-Fuzzy Inference System (McFIS) [23] address the first two issues efficiently by using the three components of meta-cognition. However, McNN and McFIS use computationally intensive parameter updates and do not utilize the past knowledge stored in the network. Similar works using meta-cognition in the complex domain are reported in [24], [25]. The recently proposed Projection Based Learning in a meta-cognitive radial basis function network [26] addresses the above issues in batch mode, except for proper utilization of the past knowledge stored in the network, and has been applied to solve biomedical problems in [27], [28], [29]. In this paper, we propose a meta-cognitive radial basis function network and its fast and efficient projection based sequential learning algorithm.
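For intuition, the hinge-error based what-to-learn check used by such self-regulating learners can be sketched as follows. This is an illustrative simplification, not the exact criterion of [20], [21]; the names `hinge_error` and `select_sample` and the scalar `threshold` are hypothetical.

```python
import numpy as np

def hinge_error(coded_target, predicted_output):
    """Maximum per-class hinge loss, a common sample-significance measure.

    coded_target: vector with +1 for the true class and -1 elsewhere;
    predicted_output: real-valued network outputs for each class.
    """
    e = np.maximum(0.0, 1.0 - coded_target * predicted_output)
    return float(np.max(e))

def select_sample(coded_target, predicted_output, threshold):
    """what-to-learn: learn the sample only if its hinge error exceeds the
    self-regulated threshold; otherwise treat it as repetitive and skip it."""
    return hinge_error(coded_target, predicted_output) > threshold
```

Samples whose hinge error falls below the threshold carry little new information, so skipping them reduces over-training on repetitive data.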

Several meta-cognition models are available in human psychology, and a brief survey of these models is reported in [30]. Among them, the model proposed by Nelson and Narens in [31] is simple and clearly highlights the various actions in human meta-cognition, as shown in Fig. 1(a). The model is analogous to meta-cognition in human-beings and has two components, the cognitive component and the meta-cognitive component. The information flow from the cognitive component to the meta-cognitive component is considered monitoring, while the information flow in the reverse direction is considered control. The information flowing from the meta-cognitive component to the cognitive component either changes the state of the cognitive component or changes the cognitive component itself. Monitoring informs the meta-cognitive component about the state of the cognitive component, thus continuously updating the meta-cognitive component's model of the cognitive component, including ‘no change in state’.

McRBFN is developed based on the Nelson and Narens meta-cognition model [31], as shown in Fig. 1(b). Analogous to that model, McRBFN has two components, namely the cognitive component and the meta-cognitive component. The cognitive component is a single hidden layer radial basis function network with an evolving architecture. It learns from the training data by adding new hidden neurons and updating the output weights of the hidden neurons to approximate the true function. The input weights of the hidden neurons (centers and widths) are determined from the training data, and the output weights are estimated using the projection based sequential learning algorithm. When a neuron is added to the cognitive component, its input/hidden layer parameters are fixed based on the input of the current sample, and the output weights are estimated by minimizing an energy function given by the hinge loss error, as in [32]. The problem of finding the optimal weights is first formulated as a linear programming problem using the principles of minimization and real calculus [33], [34]. The Projection Based Learning (PBL) algorithm then converts the linear programming problem into a system of linear equations and provides the solution for the optimal weights, corresponding to the minimum energy point of the energy function. The meta-cognitive component of McRBFN contains a dynamic model of the cognitive component, knowledge measures and self-regulated thresholds. It controls the learning process of the cognitive component by choosing one of four strategies for each sample in the training data set. When a sample is presented to McRBFN, the meta-cognitive component measures the knowledge contained in the current training sample with respect to the cognitive component using its knowledge measures.
The predicted class label, the maximum hinge error and the class-wise significance are considered as knowledge measures of the meta-cognitive component. The class-wise significance is obtained from the spherical potential, which is widely used in kernel methods to determine whether all the data points are enclosed tightly by the Gaussian kernels [35]. Here, the squared distance between the current sample and its hyper-dimensional projection helps in measuring the novelty in the data. Since McRBFN addresses classification problems in this paper, we redefine the spherical potential in a class-wise framework and use it in devising the learning strategies. Using the above-mentioned measures, the meta-cognitive component constructs two sample based learning strategies and two neuron based learning strategies. One of these strategies is selected for the current training sample such that the cognitive component learns the true function accurately and achieves better generalization performance. These learning strategies are adapted by the meta-cognitive component using self-regulated thresholds. In addition, the meta-cognitive component identifies overlapping/non-overlapping conditions by measuring the distance from the nearest inter-class and intra-class neurons. The McRBFN using PBL to obtain the network parameters is referred to as the ‘Projection Based Learning algorithm for a Meta-cognitive Radial Basis Function Network’ (PBL-McRBFN).
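The projection based computation of the output weights described above can be illustrated with a small sketch: for a squared-error energy function, the minimizing weights satisfy a system of linear equations (the normal equations), which is solved directly rather than by iterative gradient descent. This is a generic least-squares sketch under that assumption, not the paper's exact PBL update; the function name and the optional ridge term `reg` (added for numerical safety) are illustrative.

```python
import numpy as np

def pbl_output_weights(Phi, Y, reg=0.0):
    """Solve for output weights at the analytical minimum of the energy
    J(W) = ||Phi W - Y||^2, i.e. the linear system (Phi^T Phi) W = Phi^T Y.

    Phi: (samples x hidden neurons) matrix of Gaussian responses;
    Y:   (samples x classes) coded target matrix.
    """
    A = Phi.T @ Phi + reg * np.eye(Phi.shape[1])  # projection (Gram) matrix
    B = Phi.T @ Y                                  # projected targets
    return np.linalg.solve(A, B)
```

Because the minimum is obtained in one linear solve, the cost is dominated by forming and factoring a small (neurons x neurons) matrix, which is what makes the approach computationally light compared with Kalman-filter style updates.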

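The class-wise spherical potential mentioned above can be sketched as the mean Gaussian-kernel similarity between the current sample and the existing neuron centers of its class. This is a hedged simplification of the kernel-methods measure in [35]; the function name and the single shared `width` parameter are illustrative assumptions.

```python
import numpy as np

def class_spherical_potential(x, class_centers, width=1.0):
    """Mean Gaussian-kernel similarity between sample x and the centers of
    its class. A value near 0 means the sample lies far from the class's
    existing Gaussians (novel); a value near 1 means it is already well
    covered by the current knowledge."""
    if len(class_centers) == 0:
        return 0.0  # no knowledge about this class yet -> maximally novel
    k = [np.exp(-np.linalg.norm(x - c) ** 2 / (2.0 * width ** 2))
         for c in class_centers]
    return float(np.mean(k))
```

A low potential for the current sample's class signals novelty and so favors the neuron-growth strategy, while a high potential favors a parameter update or sample deletion.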
The performance of the proposed PBL-McRBFN classifier is evaluated using a set of benchmark binary/multi-category classification problems from the University of California, Irvine (UCI) machine learning repository [36]. We consider five multi-category and five binary classification problems with varying values of the imbalance factor. In all these problems, the performance of PBL-McRBFN is compared against the best performing classifiers available in the literature using class-wise performance measures, such as overall/average efficiency, and a non-parametric statistical significance test [37]. The non-parametric Friedman test, based on the mean ranking of each algorithm over multiple data sets, indicates the statistical significance of the proposed PBL-McRBFN classifier. Finally, the performance of the PBL-McRBFN classifier has also been evaluated using two practical classification problems, viz., acoustic emission signal classification [38] and mammogram classification for breast cancer detection [39]. The results clearly highlight that the PBL-McRBFN classifier provides better generalization performance than the results reported in the literature.
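The Friedman test statistic used for such statistical comparisons can be computed from per-data-set rankings as in the sketch below. This is a minimal version without tie handling; `friedman_statistic` is an illustrative name, and in practice a library routine such as SciPy's `friedmanchisquare` would typically be used instead.

```python
import numpy as np

def friedman_statistic(scores):
    """Friedman test sketch. scores is an (N data sets) x (k algorithms)
    array of performance values (higher = better). Returns the mean rank
    of each algorithm and the chi-square statistic
    chi2_F = 12N/(k(k+1)) * (sum_j R_j^2 - k(k+1)^2/4).
    Ties are not averaged in this simplified version."""
    scores = np.asarray(scores, dtype=float)
    N, k = scores.shape
    # Rank within each row: the best score gets rank 1.
    ranks = (-scores).argsort(axis=1).argsort(axis=1) + 1
    mean_ranks = ranks.mean(axis=0)
    chi2 = 12.0 * N / (k * (k + 1)) * (np.sum(mean_ranks ** 2)
                                       - k * (k + 1) ** 2 / 4.0)
    return mean_ranks, chi2
```

A large statistic (relative to the chi-square distribution with k-1 degrees of freedom) rejects the null hypothesis that all algorithms perform equivalently across the data sets.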

The outline of this paper is as follows: Section 2 describes the meta-cognitive radial basis network for classification problems. Section 3 presents the performance evaluation of PBL-McRBFN classifier on a set of benchmark and practical classification problems, and compares with the best performing classifiers available in the literature. Section 4 summarizes the conclusions from this study.

Meta-cognitive radial basis function network for classification problems

In this section, we describe the meta-cognitive radial basis function network for solving classification problems. First, we define the classification problem. Next, we present the meta-cognitive radial basis function network architecture. Finally, we present the sequential learning algorithm and summarize it in pseudo-code form.

Performance evaluation of PBL-McRBFN classifier

PBL-McRBFN classifier performance is evaluated on benchmark multi-category and binary classification problems from the UCI machine learning repository. The performance is compared with the best performing sequential learning algorithm reported in the literature (SRAN) [20], the batch ELM classifier [16] and also with standard support vector machines [42]. The data sets are chosen with varying sample imbalance. The sample imbalance is measured using the Imbalance Factor (I.F.), defined as I.F. = 1 − (n/N) min_{j=1,…,n} N_j, where N
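The Imbalance Factor above can be computed directly from the class labels, as in this small sketch (the function name is illustrative); here n is the number of classes, N the total number of samples and N_j the number of samples in class j.

```python
import numpy as np

def imbalance_factor(labels):
    """I.F. = 1 - (n/N) * min_j N_j.

    Equals 0 for a perfectly balanced data set and approaches 1 as the
    smallest class shrinks relative to the others."""
    labels = np.asarray(labels)
    _, counts = np.unique(labels, return_counts=True)
    n, N = len(counts), len(labels)
    return 1.0 - (n / N) * counts.min()
```

For example, a two-class set with a 3:1 split gives I.F. = 1 - (2/4)*1 = 0.5, while an even split gives 0.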

Conclusions

In this paper, we have presented a Meta-cognitive Radial Basis Function Network (McRBFN) and its Projection Based Learning (PBL) algorithm for classification problems in sequential framework. The meta-cognitive component in McRBFN controls the learning of the cognitive component in McRBFN. The meta-cognitive component adapts the learning process appropriately by implementing self-regulation and hence it decides what-to-learn, when-to-learn and how-to-learn efficiently. In addition, the

Acknowledgements

The authors would like to thank the Nanyang Technological University-Ministry of Defence (NTU-MINDEF), Singapore, for the financial support (Grant number: MINDEF-NTU-JPP/11/02/05) to conduct this study.

Mr. Giduthuri Sateesh Babu received the B.Tech degree in electrical and electronics engineering from Jawaharlal Nehru Technological University, India, in 2007, and the M.Tech degree in electrical engineering from Indian Institute of Technology Delhi, India, in 2009. From 2009 to 2010, he worked as a senior software engineer in the Samsung R&D centre, India. He is currently a Ph.D. student with the School of Computer Engineering, Nanyang Technological University, Singapore. His research interests include machine learning, cognitive computing, neural networks, control systems, optimization and medical informatics.

References (48)

  • R. Savitha et al.

    A meta-cognitive learning algorithm for a Fully Complex-valued Relaxation Network

    Neural Networks

    (2012)
  • M.T. Cox

    Metacognition in computation: a selected research review

    Artificial Intelligence

    (2005)
  • S. Suresh et al.

    Risk-sensitive loss functions for sparse multi-category classification problems

    Information Sciences

    (2008)
  • H. Hoffmann

    Kernel PCA for novelty detection

    Pattern Recognition

    (2007)
  • S. Suresh et al.

    Lift coefficient prediction at high angle of attack using recurrent neural network

    Aerospace Science and Technology

    (2003)
  • G.P. Zhang

    Neural networks for classification: a survey

    IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews

    (2000)
  • Y. LeCun et al.

    Backpropagation applied to handwritten zip code recognition

    Neural Computation

    (1989)
  • M.E. Ruiz et al.

    Hierarchical text categorization using neural networks

    Information Retrieval

    (2002)
  • D.E. Rumelhart et al.

    Learning representations by back-propagating errors

    Nature

    (1986)
  • G.-B. Huang et al.

    Extreme learning machine: a new learning scheme of feedforward neural networks

    IEEE International Joint Conference on Neural Networks. Proceedings

    (2004)
  • J.C. Platt

    A resource allocation network for function interpolation

    Neural Computation

    (1991)
  • L. Yingwei et al.

    A sequential learning scheme for function approximation using minimal radial basis function neural networks

    Neural Computation

    (1997)
  • G.-B. Huang et al.

    An efficient sequential learning algorithm for growing and pruning RBF (GAP-RBF) networks

    IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

    (2004)
  • N.-Y. Liang et al.

    A fast and accurate online sequential learning algorithm for feedforward networks

    IEEE Transactions on Neural Networks

    (2006)

Dr. Sundaram Suresh received the B.E degree in electrical and electronics engineering from Bharathiyar University in 1999, and the M.E (2001) and Ph.D. (2005) degrees in aerospace engineering from the Indian Institute of Science, India. He was a post-doctoral researcher in the School of Electrical Engineering, Nanyang Technological University, from 2005 to 2007. From 2007 to 2008, he was at INRIA-Sophia Antipolis, France, as an ERCIM research fellow. He was at Korea University for a short period as a visiting faculty member in Industrial Engineering. From January 2009 to December 2009, he was at the Indian Institute of Technology-Delhi as an Assistant Professor in the Department of Electrical Engineering. Since 2010, he has been working as an Assistant Professor in the School of Computer Engineering, Nanyang Technological University, Singapore. He was awarded best young faculty for the year 2009 by IIT-Delhi. His research interests include flight control, unmanned aerial vehicle design, machine learning, applied game theory, optimization and computer vision.