Published in: Data Mining and Knowledge Discovery 6/2020

02-07-2020

Gaussian bandwidth selection for manifold learning and classification

Authors: Ofir Lindenbaum, Moshe Salhov, Arie Yeredor, Amir Averbuch


Abstract

Kernel methods play a critical role in many machine learning algorithms. They are useful in manifold learning, classification, clustering and other data analysis tasks. The kernel’s scale parameter, also referred to as the kernel’s bandwidth, strongly affects performance on the task at hand. We propose setting a scale parameter tailored to one of two types of tasks: classification and manifold learning. For manifold learning, we seek the scale that best captures the manifold’s intrinsic dimension. For classification, we propose three methods for estimating the scale, each of which optimizes the classification results in a different sense. The proposed frameworks are evaluated on artificial and real datasets. The results show a high correlation between optimal classification rates and the estimated scales. Finally, we demonstrate the approach on a seismic event classification task.
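To make the role of the bandwidth concrete, the following is a minimal illustrative sketch of a Gaussian kernel evaluated at several scales. It is not the selection method proposed in the paper. The function name `gaussian_kernel`, the toy points, and the convention exp(-||x - y||^2 / (2 * bandwidth^2)) are assumptions chosen for illustration; other normalizations of the scale parameter are also common.

```python
import math


def gaussian_kernel(points, bandwidth):
    """Return K with K[i][j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2)).

    The 2 * bandwidth^2 normalization is an assumed convention.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    kernel = []
    for x in points:
        row = []
        for y in points:
            # Squared Euclidean distance between the two points.
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            row.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
        kernel.append(row)
    return kernel


# Hypothetical toy data: two tight pairs of points that lie far apart.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]

# Compare a within-pair affinity (K[0][1]) with a between-pair
# affinity (K[0][2]) at a small, a moderate and a large bandwidth.
for bandwidth in (0.01, 1.0, 100.0):
    K = gaussian_kernel(points, bandwidth)
    print(bandwidth, round(K[0][1], 4), round(K[0][2], 4))
```

With a very small bandwidth every off-diagonal affinity is close to zero, so even neighbouring points look unrelated; with a very large bandwidth all affinities approach one, so the two pairs become indistinguishable. Only an intermediate scale keeps the within-pair affinity high and the between-pair affinity low, which is why the choice of bandwidth matters for downstream tasks.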


Metadata
Title: Gaussian bandwidth selection for manifold learning and classification
Authors: Ofir Lindenbaum, Moshe Salhov, Arie Yeredor, Amir Averbuch
Publication date: 02-07-2020
Publisher: Springer US
Published in: Data Mining and Knowledge Discovery | Issue 6/2020
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI: https://doi.org/10.1007/s10618-020-00692-x
