Extreme learning machines: a survey

Huang, Guang-Bin; Wang, Dian Hui; Lan, Yuan

doi:10.1007/s13042-011-0019-y

Abstract

Computational intelligence techniques have been used in a wide range of applications. Among the numerous computational intelligence techniques, neural networks and support vector machines (SVMs) have played the dominant roles. However, it is known that both neural networks and SVMs face challenging issues such as (1) slow learning speed, (2) the need for nontrivial human intervention, and/or (3) poor computational scalability. Extreme learning machine (ELM), an emergent technology that overcomes some of the challenges faced by other techniques, has recently attracted the attention of more and more researchers. ELM works for generalized single-hidden-layer feedforward networks (SLFNs). The essence of ELM is that the hidden layer of SLFNs need not be tuned. Compared with traditional computational intelligence techniques, ELM provides better generalization performance at a much faster learning speed and with the least human intervention. This paper gives a survey of ELM and its variants, especially (1) the batch learning mode of ELM, (2) fully complex ELM, (3) online sequential ELM, (4) incremental ELM, and (5) ensembles of ELM.
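
The statement that the hidden layer need not be tuned can be illustrated with a minimal sketch. It assumes a one-dimensional regression task, sigmoid hidden nodes with randomly drawn weights and biases that are never updated, and output weights obtained from a single linear least-squares solve. ELM is commonly described with a Moore-Penrose pseudoinverse for that solve; here a ridge-regularized normal-equation solve is swapped in so the example needs only the Python standard library. All names (train_elm, predict, hidden_output, solve), the parameters n_hidden, ridge and seed, and the toy sin(x) target are hypothetical choices for illustration, not taken from the paper.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hidden_output(x, weights, biases):
    # Output of every hidden node for a scalar input x.
    return [sigmoid(w * x + b) for w, b in zip(weights, biases)]


def train_elm(xs, ys, n_hidden=20, ridge=1e-6, seed=0):
    rng = random.Random(seed)
    # Hidden-node parameters are drawn once at random and never tuned.
    weights = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    biases = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    H = [hidden_output(x, weights, biases) for x in xs]  # hidden-layer output matrix
    # Only the output weights are learned: solve (H^T H + ridge * I) beta = H^T y.
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * y for h, y in zip(H, ys)) for i in range(n_hidden)]
    beta = solve(A, rhs)
    return weights, biases, beta


def predict(x, model):
    weights, biases, beta = model
    return sum(b * h for b, h in zip(beta, hidden_output(x, weights, biases)))


# Toy data (an assumption for illustration): 100 samples of sin(x) on [-3, 3].
xs = [-3.0 + 6.0 * i / 99 for i in range(100)]
ys = [math.sin(x) for x in xs]
model = train_elm(xs, ys)
mse = sum((predict(x, model) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print("training MSE:", mse)
```

Training involves no iterative weight updates: the random hidden layer fixes a feature map, and learning reduces to one linear solve for the output weights. The small ridge term is an assumption added to keep that solve well conditioned in plain floating-point arithmetic.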

Acknowledgments

This research was sponsored by a grant from the Academic Research Fund (AcRF) Tier 1 under project no. RG 22/08 (M52040128).

Author information

Authors and Affiliations

School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang Avenue, Singapore, 639798, Singapore
Guang-Bin Huang & Yuan Lan
Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, VIC, 3086, Australia
Dian Hui Wang

Corresponding author

Correspondence to Guang-Bin Huang.

About this article

Cite this article

Huang, GB., Wang, D.H. & Lan, Y. Extreme learning machines: a survey. Int. J. Mach. Learn. & Cyber. 2, 107–122 (2011). https://doi.org/10.1007/s13042-011-0019-y

Received: 01 April 2011
Accepted: 16 April 2011
Published: 25 May 2011
Issue Date: June 2011
DOI: https://doi.org/10.1007/s13042-011-0019-y
