Robust regularized extreme learning machine for regression using iteratively reweighted least squares
Introduction
The extreme learning machine (ELM) [1] was proposed for training single-hidden-layer feedforward networks (SLFNs). It directly approximates the nonlinear mapping of input data by randomly generating the hidden-node parameters without tuning, and the model has been proven to possess the universal approximation capability [2]. ELM has the following merits: (1) easy implementation, (2) extremely fast training speed, and (3) good generalization performance. Because of these merits, ELM has recently gained increasing interest in regression problems such as stock market forecasting [3], electricity price forecasting [4], wind power forecasting [5], and affective analogical reasoning [6].
The performance of ELM regression crucially relies on the given labels of the training data. The basic ELM with the ℓ2-norm loss function assumes that the training labels follow a normal error distribution. However, training samples in real tasks cannot be guaranteed to satisfy this assumption: many factors, such as instrument errors, sampling errors, and modeling errors, can corrupt the training samples with outliers. The performance of basic ELM regression then deteriorates heavily, because the ℓ2-norm loss is easily affected by the large deviations of the outliers.
To address this problem, Deng et al. [7] proposed a regularized ELM with weighted least squares to enhance robustness; their algorithm consists of two stages of reweighted ELM. Zhang et al. [8] proposed the outlier-robust ELM with the ℓ1-norm loss function and the ℓ2-norm regularization term; they used the augmented Lagrange multiplier algorithm to solve the objective function and effectively reduced the influence of outliers. Horata et al. [9] adopted the Huber function to enhance robustness, using the iteratively reweighted least squares (IRLS) algorithm to solve the Huber loss function without a regularization term. However, a model without regularization is prone to overfitting.
Moreover, the loss functions of existing robust ELM regression methods, namely the ℓ1-norm and the Huber function, can still be affected by outliers with large deviations, because both grow linearly with the deviation. In addition, existing robust ELM methods use only ℓ2-norm regularization or no regularization term at all. When the number of hidden nodes is large, ℓ2-norm regularization produces a large ELM model, because the output weights of the network remain non-zero. In short, a study that considers different loss functions and regularization terms simultaneously has been lacking.
Thus, in this work we conduct a comprehensive study of the loss function and regularization term of robust ELM regression. We propose a unified model for robust regularized ELM regression using IRLS (RELM-IRLS). Four loss functions (i.e., ℓ1-norm, Huber, Bisquare, and Welsch) are used to enhance robustness, and two types of regularization (ℓ2-norm and ℓ1-norm) are used to avoid overfitting. These loss functions, also known as M-estimation functions, have been widely used in robust statistics [10]. IRLS is used to optimize the objective function with the robust loss function and regularization term; each IRLS iteration is equivalent to solving a weighted least-squares ELM regression, so RELM-IRLS can be trained efficiently thanks to the fast training speed of ELM. Experimental results on synthetic and real data sets show that the proposed RELM-IRLS is stable and accurate at different outlier levels.
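As an illustration of how these four losses enter IRLS, each can be summarized by its weight function w(r) = ψ(r)/r applied to scaled residuals. The sketch below is ours, not code from the paper; the tuning constants are the conventional 95%-efficiency values from robust statistics and should be treated as assumptions:

```python
import numpy as np

def irls_weight(r, loss="huber"):
    """IRLS weights w(r) = psi(r)/r for the four M-estimators.
    Tuning constants are the usual 95%-efficiency values from robust
    statistics (an assumption here); residuals r are assumed pre-scaled."""
    a = np.abs(r)
    if loss == "l1":
        return 1.0 / np.maximum(a, 1e-8)          # decays as 1/|r|, never reaches 0
    if loss == "huber":
        k = 1.345
        return np.where(a <= k, 1.0, k / np.maximum(a, 1e-8))
    if loss == "bisquare":
        k = 4.685
        return np.where(a <= k, (1.0 - (r / k) ** 2) ** 2, 0.0)  # redescends to 0 beyond k
    if loss == "welsch":
        k = 2.985
        return np.exp(-(r / k) ** 2)              # smoothly redescending
    raise ValueError(loss)
```

Note the qualitative difference: ℓ1 and Huber weights decay only like 1/|r|, so gross outliers retain some influence, whereas the Bisquare and Welsch weights redescend toward zero, suppressing outliers with large deviations entirely.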
Compared to existing ELM methods for robust regression, the main contributions of this paper are highlighted as follows:
- (1) A unified model is proposed for robust regularized ELM regression, in which different kinds of robust loss functions and regularization terms can be used.
- (2) RELM-IRLS with ℓ2-norm regularization is proposed to achieve better generalization.
- (3) RELM-IRLS with ℓ1-norm regularization is proposed to achieve better generalization performance and a more compact network architecture.
The rest of this paper is organized as follows. Basic ELM and its robust variants are reviewed in Section 2. In Section 3, we present the unified model and the RELM-IRLS with ℓ2-norm regularization and ℓ1-norm regularization. Section 4 demonstrates the experimental results of our proposed algorithms and Section 5 presents our conclusion.
ELM for regression
For a given set of training samples for a regression problem, ELM is a unified SLFN whose output with L hidden nodes can be represented as f(x) = Σ_{i=1}^{L} β_i G(a_i, b_i, x), where a_i and b_i are the parameters of the ith hidden node and G(a_i, b_i, x) is the hidden-layer function between the input layer and the ith hidden node. The parameters a_i and b_i are randomly generated independently of the training data, and β_i is the output weight between the ith hidden node and the output node.
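As a minimal concrete sketch (ours, not the authors' released MATLAB implementation), basic ELM training with sigmoid hidden nodes amounts to a random feature map followed by a single least-squares solve:

```python
import numpy as np

def elm_train(X, t, L=50, seed=0):
    """Basic ELM regression: random hidden layer + least-squares output weights."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    A = rng.standard_normal((d, L))            # input weights a_i, fixed at random
    b = rng.standard_normal(L)                 # biases b_i, fixed at random
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))     # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ t               # output weights via Moore-Penrose pseudoinverse
    return A, b, beta

def elm_predict(X, A, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))
    return H @ beta
```

Only beta is learned; the random hidden layer is never updated, which is what makes ELM training a single linear solve.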
Proposed method
In this section, we explain our proposed RELM-IRLS algorithm. First, we provide a unified model for robust regularized regression. Then, we present the RELM-IRLS with ℓ2-norm and ℓ1-norm regularization respectively. Finally, we discuss the advantages and disadvantages of our proposed method.
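A plausible sketch of one IRLS iteration for the ℓ2-regularized variant: each iteration solves the weighted ridge problem β = (HᵀWH + λI)⁻¹HᵀWt on the hidden-layer output matrix H. The Huber weights and MAD-based residual scale below are standard robust-statistics choices and are our assumptions, not necessarily the paper's exact formulation:

```python
import numpy as np

def huber_weight(r, k=1.345):
    """Huber M-estimator weight: 1 inside [-k, k], k/|r| outside."""
    a = np.abs(r)
    return np.where(a <= k, 1.0, k / np.maximum(a, 1e-12))

def relm_irls_l2(H, t, lam=0.01, n_iter=20):
    """IRLS for robust ELM with l2 regularization (sketch).
    Each iteration is a weighted ridge regression on the hidden-layer output H."""
    L = H.shape[1]
    beta = np.linalg.solve(H.T @ H + lam * np.eye(L), H.T @ t)    # plain ridge start
    for _ in range(n_iter):
        r = t - H @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12  # robust scale (MAD)
        w = huber_weight(r / s)                                   # per-sample weights
        Hw = H * w[:, None]                                       # row-scaled H, i.e. WH
        beta = np.linalg.solve(H.T @ Hw + lam * np.eye(L), Hw.T @ t)
    return beta
```

Because each iteration is just a small linear solve of size L, the robust variant keeps ELM's fast training speed; swapping in another weight function changes only the `huber_weight` call.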
Experiment
MATLAB code for our algorithm is available at: https://github.com/KaenChan/robust-elm-irls.
Conclusion
In this paper, we propose a robust regularized ELM for regression problems. A unified model for robust regularized ELM regression is formulated and optimized with IRLS. Four robust loss functions (i.e., ℓ1-norm, Huber, Bisquare, and Welsch) are used in our algorithms to enhance the robustness of basic ELM. Moreover, two types of regularization terms (ℓ2-norm and ℓ1-norm) are used to realize structural risk minimization. We propose robust ELM with ℓ2-norm regularization to achieve better generalization, and robust ELM with ℓ1-norm regularization to additionally obtain a more compact network architecture.
Acknowledgement
This work is partially supported by the Natural Science Foundation of China (Grants 61125201, 61303070, and U1435219).
Kai Chen received his B.S. degree in computer science and technology from Northwestern Polytechnical University in 2010 and his M.S. degree in computer science and technology from the National University of Defense Technology in 2012. He is now a Ph.D. candidate at the National University of Defense Technology. His research interests include high performance computer architecture, machine learning, and constraint satisfaction.
References (19)
- et al., Extreme learning machine: theory and applications, Neurocomputing (2006)
- et al., Online sequential extreme learning machine with forgetting mechanism, Neurocomputing (2012)
- et al., An ELM-based model for affective analogical reasoning, Neurocomputing (2015)
- et al., Outlier-robust extreme learning machine for regression problems, Neurocomputing (2015)
- et al., Robust extreme learning machine, Neurocomputing (2013)
- et al., PR-ELM: parallel regularized extreme learning machine based on cluster, Neurocomputing (2016)
- et al., Universal approximation using incremental constructive feedforward networks with random hidden nodes, IEEE Trans. Neural Netw. (2006)
- et al., Electricity price forecasting with extreme learning machine and bootstrapping, IEEE Trans. Power Syst. (2012)
- et al., Probabilistic forecasting of wind power generation using extreme learning machine, IEEE Trans. Power Syst. (2014)
Qi Lv received his B.S. degree in computer science and technology from Tsinghua University, Beijing, in 2009 and his M.S. degree in computer science and technology from the National University of Defense Technology in 2011. He is now a Ph.D. candidate at the National University of Defense Technology. His research interests include high performance computer architecture, machine learning, and remote sensing image processing.
Yao Lu received his B.S. degree in computer science and technology from Shihezi University in 2010 and his M.S. degree in computer science and technology from the National University of Defense Technology in 2012. He is now an assistant engineer at the National University of Defense Technology. His research interests include high performance computer architecture, parallel computing, and machine learning.
Yong Dou is a professor, Ph.D. supervisor, and senior member of the China Computer Federation. He received his B.S., M.S., and Ph.D. degrees in computer science and technology from the National University of Defense Technology. His research interests include high performance computer architecture, high performance embedded microprocessors, reconfigurable computing, bioinformatics, and machine learning. He is a member of the IEEE and the ACM.