Published in: Data Mining and Knowledge Discovery 6/2020

2 July 2020

Gaussian bandwidth selection for manifold learning and classification

Authors: Ofir Lindenbaum, Moshe Salhov, Arie Yeredor, Amir Averbuch


Abstract

Kernel methods play a critical role in many machine learning algorithms. They are useful in manifold learning, classification, clustering and other data analysis tasks. Setting the kernel’s scale parameter, also referred to as the kernel’s bandwidth, strongly affects the performance of the task at hand. We propose to set a scale parameter that is tailored to one of two types of tasks: classification and manifold learning. For manifold learning, we seek the scale that best captures the manifold’s intrinsic dimension. For classification, we propose three methods for estimating the scale, each of which optimizes the classification results in a different sense. The proposed frameworks are evaluated on artificial and real datasets. The results show a high correlation between optimal classification rates and the estimated scales. Finally, we demonstrate the approach on a seismic event classification task.
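
To make the role of the scale parameter concrete, the following minimal sketch builds a Gaussian kernel matrix on a toy dataset at three bandwidths. It is an illustration only and not the selection procedure proposed in the paper: the function name gaussian_kernel, the toy points, and the convention K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)) are all assumptions made here, since the abstract does not state the exact kernel definition.

```python
import numpy as np


def gaussian_kernel(X, sigma):
    """Gaussian kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)).

    The 2 * sigma^2 normalization is an assumed convention; other texts
    divide by sigma^2 or by a single scale parameter instead.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


# Hypothetical toy data: two well-separated pairs of points on a line.
X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])

for sigma in (0.01, 1.0, 1000.0):
    K = gaussian_kernel(X, sigma)
    print(f"sigma = {sigma}")
    print(np.round(K, 3))
```

With a very small bandwidth every point is similar only to itself and the kernel approaches the identity matrix. With a very large bandwidth all points look alike and the kernel approaches the all-ones matrix. Only the intermediate value exposes the two-group structure, which is why the choice of scale matters for downstream tasks.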


Metadata
Title
Gaussian bandwidth selection for manifold learning and classification
Authors
Ofir Lindenbaum
Moshe Salhov
Arie Yeredor
Amir Averbuch
Publication date
2 July 2020
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 6/2020
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-020-00692-x
