Published in: Data Mining and Knowledge Discovery 3/2018

14-02-2018

Kernel mixture model for probability density estimation in Bayesian classifiers

Authors: Wenyu Zhang, Zhenjiang Zhang, Han-Chieh Chao, Fan-Hsun Tseng

Abstract

Estimating reliable class-conditional probabilities is a prerequisite for implementing Bayesian classifiers, and estimating probability density functions (PDFs) is likewise a fundamental problem for other probabilistic induction algorithms. The finite mixture model (FMM) can represent arbitrarily complex PDFs as a mixture of multimodal distributions, but it assumes that each mixture component follows a given parametric distribution, an assumption that real-world data may not satisfy. This paper presents a non-parametric kernel mixture model (KMM) for probability density estimation, in which the data sample of a class is assumed to be drawn from several unknown, independent hidden subclasses. Unlike traditional FMM schemes, we simply use the k-means clustering algorithm to partition the data sample into several independent components, and the regional density diversities of the components are combined using the Bayes theorem. On the basis of the proposed kernel mixture model, we present a three-step Bayesian classifier comprising partitioning, structure learning, and PDF estimation. Experimental results show that KMM improves the quality of the PDFs estimated by the conventional kernel density estimation (KDE) method, and that KMM-based Bayesian classifiers outperform existing Gaussian, GMM, and KDE-based Bayesian classifiers.
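
As a rough illustration of the idea sketched in the abstract, the Python snippet below partitions a class sample with k-means, fits a kernel density estimate to each resulting component, and combines the component densities weighted by their empirical priors, following the Bayes theorem. This is a minimal reconstruction from the abstract, not the paper's implementation: the function names, the use of scikit-learn estimators, and the fixed bandwidth are all assumptions.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.neighbors import KernelDensity

    def fit_kmm_log_density(X, n_components=3, bandwidth=0.5):
        """Fit a kernel mixture density to a class sample X of shape (n, d).

        Returns a function mapping query points to estimated log p(x).
        Hypothetical sketch of the KMM idea, not the paper's code.
        """
        # Step 1: partition the class sample into independent components.
        labels = KMeans(n_clusters=n_components, n_init=10).fit_predict(X)

        kdes, log_priors = [], []
        for m in range(n_components):
            X_m = X[labels == m]
            # Empirical component prior P(c_m) = |component m| / |sample|.
            log_priors.append(np.log(len(X_m) / len(X)))
            # Step 2: one kernel density estimate per component, so each
            # region of the sample gets its own local density model.
            kdes.append(KernelDensity(bandwidth=bandwidth).fit(X_m))

        def log_density(X_query):
            # Combine components via p(x) = sum_m P(c_m) * p(x | c_m),
            # evaluated in log space for numerical stability.
            per_component = np.stack(
                [lp + kde.score_samples(X_query)
                 for lp, kde in zip(log_priors, kdes)]
            )
            return np.logaddexp.reduce(per_component, axis=0)

        return log_density

In a Bayesian classifier, one such density would be fit per class and a query point assigned to the class maximizing log P(y) + log p(x | y); the paper's structure-learning step and bandwidth selection are beyond this sketch.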


Metadata
Title
Kernel mixture model for probability density estimation in Bayesian classifiers
Authors
Wenyu Zhang
Zhenjiang Zhang
Han-Chieh Chao
Fan-Hsun Tseng
Publication date
14-02-2018
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 3/2018
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-018-0550-5
