
24-06-2019 | Original Article

Approximate empirical kernel map-based iterative extreme learning machine for clustering

Authors: Chuangquan Chen, Chi-Man Vong, Pak-Kin Wong, Keng-Iam Tai

Published in: Neural Computing and Applications | Issue 12/2020

Abstract

Maximum margin clustering (MMC) is a recent approach that applies the margin-maximization principle of supervised learning to unsupervised learning, aiming to partition the data into clusters with high discrimination. Recently, the extreme learning machine (ELM) has been applied to MMC (called iterative ELM clustering, or ELMCIter), which maximizes the data discrimination by iteratively training a weighted extreme learning machine (W-ELM). In this way, ELMCIter achieves a substantial reduction in training time and provides a unified model for both binary and multi-class clustering. However, ELMCIter has two issues: (1) the random feature mappings it adopts cannot reliably produce high-quality discriminative features for clustering, and (2) it usually requires a large model because its performance depends on the number of hidden nodes, which makes training relatively slow. In this paper, the hidden layer in ELMCIter is encoded by an approximate empirical kernel map (AEKM) rather than random feature mappings in order to resolve these two issues. The AEKM is generated from a low-rank approximation of the kernel matrix, which is derived from the input data through a kernel function. The proposed method, called iterative AEKM for clustering (AEKMCIter), makes two contributions: (1) the AEKM extracts discriminative and robust features from the kernel matrix, so AEKMCIter consistently achieves better performance, and (2) AEKMCIter requires only a very small number of hidden nodes, yielding low memory consumption and fast training. Detailed experiments verify the effectiveness and efficiency of our approach. As an illustration, on the MNIST10 dataset, AEKMCIter improves clustering accuracy over ELMCIter by up to 5%, while reducing the training time to about 1/7 and the memory consumption (i.e., the number of hidden nodes) to about 1/20 of those required by ELMCIter.
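The abstract outlines two mechanisms: building an AEKM from a low-rank (Nyström-style) approximation of the kernel matrix, and iteratively retraining a classifier on its own cluster assignments. The sketch below is not the authors' implementation; it is a minimal illustration of both ideas, assuming an RBF kernel, uniformly sampled landmark points, and a plain ridge-regression output layer as a stand-in for the paper's weighted ELM. All names and parameters here (aekm_map, iterative_clustering, n_landmarks, gamma, reg, n_iters) are hypothetical.

```python
# Minimal sketch (illustrative assumptions, not the paper's code) of:
# (1) an approximate empirical kernel map from a Nystrom-style low-rank
#     approximation of the RBF kernel matrix, and
# (2) an iterative clustering loop that alternates between fitting a
#     classifier on the current labels and re-labelling the data.
import numpy as np

def rbf_kernel(A, B, gamma):
    """Pairwise RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * sq)

def aekm_map(X, n_landmarks=50, gamma=1.0, seed=0):
    """Approximate empirical kernel map from a low-rank kernel approximation.

    Sample landmark points, eigendecompose the small landmark kernel
    matrix W, and map each sample x to K(x, landmarks) @ U @ S^{-1/2},
    so inner products in the mapped space approximate k(x, x')."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), n_landmarks, replace=False)
    L = X[idx]
    W = rbf_kernel(L, L, gamma)            # n_landmarks x n_landmarks
    s, U = np.linalg.eigh(W)
    keep = s > 1e-10                       # drop near-zero eigenvalues
    M = U[:, keep] / np.sqrt(s[keep])      # W^{-1/2} factor
    return rbf_kernel(X, L, gamma) @ M     # n_samples x rank features

def iterative_clustering(H, k=2, n_iters=20, reg=1e-2, seed=0):
    """Alternate between fitting a regularized least-squares classifier
    (a simple stand-in for the paper's W-ELM) on the current labels and
    re-assigning each sample to its highest-scoring cluster."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, k, len(H))         # random initial partition
    for _ in range(n_iters):
        T = np.eye(k)[y]                   # one-hot targets
        # Ridge solution: beta = (H^T H + reg*I)^{-1} H^T T
        beta = np.linalg.solve(H.T @ H + reg * np.eye(H.shape[1]), H.T @ T)
        y_new = np.argmax(H @ beta, axis=1)
        if np.array_equal(y_new, y):       # partition stabilized
            break
        y = y_new
    return y

# Toy usage: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
labels = iterative_clustering(aekm_map(X, n_landmarks=30, gamma=0.5), k=2)
print(np.bincount(labels))
```

Note that in the paper's formulation the weighted ELM reweights samples (W-ELM was proposed for imbalance learning), which guards against degenerate partitions where most points collapse into one cluster; the unweighted ridge stand-in above omits that safeguard for brevity.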

Metadata
Title
Approximate empirical kernel map-based iterative extreme learning machine for clustering
Authors
Chuangquan Chen
Chi-Man Vong
Pak-Kin Wong
Keng-Iam Tai
Publication date
24-06-2019
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 12/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-019-04295-6
