Published in: Data Mining and Knowledge Discovery 3/2018

14-02-2018

Kernel mixture model for probability density estimation in Bayesian classifiers

Authors: Wenyu Zhang, Zhenjiang Zhang, Han-Chieh Chao, Fan-Hsun Tseng

Abstract

Estimating reliable class-conditional probabilities is a prerequisite for implementing Bayesian classifiers, and estimating probability density functions (PDFs) is likewise a fundamental problem for other probabilistic induction algorithms. The finite mixture model (FMM) can represent arbitrarily complex PDFs as a mixture of multimodal distributions, but it assumes that each mixture component follows a given parametric distribution, an assumption that real-world data may not satisfy. This paper presents a non-parametric kernel mixture model (KMM) for probability density estimation, in which the data sample of a class is assumed to be drawn from several unknown, independent hidden subclasses. Unlike traditional FMM schemes, we simply use the k-means clustering algorithm to partition the data sample into several independent components, and the regional density diversities of the components are combined using the Bayes theorem. On the basis of the proposed kernel mixture model, we present a three-step Bayesian classifier comprising partitioning, structure learning, and PDF estimation. Experimental results show that KMM improves the quality of the PDFs estimated by the conventional kernel density estimation (KDE) method, and that KMM-based Bayesian classifiers outperform existing Gaussian, GMM, and KDE-based Bayesian classifiers.
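
As a rough illustration of the idea sketched in the abstract, the Python snippet below partitions a class sample with k-means, fits a kernel density estimate to each resulting component, and combines the component densities weighted by their empirical priors, following the Bayes theorem. This is a minimal reconstruction from the abstract, not the paper's implementation: the function names, the use of scikit-learn estimators, and the fixed bandwidth are all assumptions.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.neighbors import KernelDensity

    def fit_kmm_log_density(X, n_components=3, bandwidth=0.5):
        """Fit a kernel mixture density to a class sample X of shape (n, d).

        Returns a function mapping query points to estimated log p(x).
        Hypothetical sketch of the KMM idea, not the paper's code.
        """
        # Step 1: partition the class sample into independent components.
        labels = KMeans(n_clusters=n_components, n_init=10).fit_predict(X)

        kdes, log_priors = [], []
        for m in range(n_components):
            X_m = X[labels == m]
            # Empirical component prior P(c_m) = |component m| / |sample|.
            log_priors.append(np.log(len(X_m) / len(X)))
            # Step 2: one kernel density estimate per component, so each
            # region of the sample gets its own local density model.
            kdes.append(KernelDensity(bandwidth=bandwidth).fit(X_m))

        def log_density(X_query):
            # Combine components via p(x) = sum_m P(c_m) * p(x | c_m),
            # evaluated in log space for numerical stability.
            per_component = np.stack(
                [lp + kde.score_samples(X_query)
                 for lp, kde in zip(log_priors, kdes)]
            )
            return np.logaddexp.reduce(per_component, axis=0)

        return log_density

In a Bayesian classifier, one such density would be fit per class and a query point assigned to the class maximizing log P(y) + log p(x | y); the paper's structure-learning step and bandwidth selection are beyond this sketch.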


Metadata
Title
Kernel mixture model for probability density estimation in Bayesian classifiers
Authors
Wenyu Zhang
Zhenjiang Zhang
Han-Chieh Chao
Fan-Hsun Tseng
Publication date
14-02-2018
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 3/2018
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-018-0550-5
