Skip to main content
Top
Published in: Knowledge and Information Systems 2/2014

01-05-2014 | Regular Paper

Sparse regression mixture modeling with the multi-kernel relevance vector machine

Authors: Konstantinos Blekas, Aristidis Likas

Published in: Knowledge and Information Systems | Issue 2/2014

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

A regression mixture model is proposed where each mixture component is a multi-kernel version of the Relevance Vector Machine (RVM). This mixture model exploits the enhanced modeling capability of RVMs, due to their embedded sparsity enforcing properties. In order to deal with the selection problem of kernel parameters, a weighted multi-kernel scheme is employed, where the weights are estimated during training. The mixture model is trained using the maximum a posteriori approach, where the Expectation Maximization (EM) algorithm is applied offering closed form update equations for the model parameters. Moreover, an incremental learning methodology is also presented that tackles the parameter initialization problem of the EM algorithm along with a BIC-based model selection methodology to estimate the proper number of mixture components. We provide comparative experimental results using various artificial and real benchmark datasets that empirically illustrate the efficiency of the proposed mixture model.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Alon J, Sclaroff S, Kollios G, Pavlovic V (2003) Discovering clusters in motion time-series data. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), pp 375–381 Alon J, Sclaroff S, Kollios G, Pavlovic V (2003) Discovering clusters in motion time-series data. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), pp 375–381
2.
go back to reference Antonini G, Thiran J (2006) Counting pedestrians in video sequences using trajectory clustering. IEEE Trans Circuits Syst Video Technol 16(8):1008–1020CrossRef Antonini G, Thiran J (2006) Counting pedestrians in video sequences using trajectory clustering. IEEE Trans Circuits Syst Video Technol 16(8):1008–1020CrossRef
3.
go back to reference Bishop C (2006) Pattern recognition and machine learning. Springer, BerlinMATH Bishop C (2006) Pattern recognition and machine learning. Springer, BerlinMATH
4.
go back to reference Blekas K, Likas A (2012) The mixture of multi-kernel relevance vector machines model. In: International Conference on Data Mining (ICDM), pp 111–120 Blekas K, Likas A (2012) The mixture of multi-kernel relevance vector machines model. In: International Conference on Data Mining (ICDM), pp 111–120
5.
go back to reference Blekas K, Nikou C, Galatsanos N, Tsekos NV (2008) A regression mixture model with spatial constraints for clustering spatiotemporal data. Int J Artif Intell Tools 17(5):1023–1041CrossRef Blekas K, Nikou C, Galatsanos N, Tsekos NV (2008) A regression mixture model with spatial constraints for clustering spatiotemporal data. Int J Artif Intell Tools 17(5):1023–1041CrossRef
6.
go back to reference Chudova D, Gaffney S, Mjolsness E, Smyth P (2003) Mixture models for translation-invariant clustering of sets of multi-dimensional curves. In: Proceedings of the Ninth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, Washington, pp 79–88 Chudova D, Gaffney S, Mjolsness E, Smyth P (2003) Mixture models for translation-invariant clustering of sets of multi-dimensional curves. In: Proceedings of the Ninth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, Washington, pp 79–88
7.
go back to reference Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J Royal Stat Soc B 39:1–38MATHMathSciNet Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J Royal Stat Soc B 39:1–38MATHMathSciNet
9.
go back to reference Ding H, Trajcevski G, Scheuermann P, Wang X, Keogh E (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. In PVLDB, pp 1542–1552 Ding H, Trajcevski G, Scheuermann P, Wang X, Keogh E (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. In PVLDB, pp 1542–1552
10.
go back to reference Fraley C, Raftery AE (1998) Bayesian regularization for normal mixture estimation and model-based clustering. Comput J 41:578–588CrossRefMATH Fraley C, Raftery AE (1998) Bayesian regularization for normal mixture estimation and model-based clustering. Comput J 41:578–588CrossRefMATH
11.
go back to reference Gaffney S, Smyth P (2003) Curve clustering with random effects regression mixtures. In: Bishop CM, Frey BJ (eds) Proceedings of the ninth international workshop on artificial intelligence and statistics Gaffney S, Smyth P (2003) Curve clustering with random effects regression mixtures. In: Bishop CM, Frey BJ (eds) Proceedings of the ninth international workshop on artificial intelligence and statistics
12.
go back to reference Girolami M, Rogers S (2005) Hierarchic bayesian models for kernel learning. In: International conference on machine learning (ICML’05), pp 241–248 Girolami M, Rogers S (2005) Hierarchic bayesian models for kernel learning. In: International conference on machine learning (ICML’05), pp 241–248
13.
go back to reference Gonen M, Alpaydin E (2011) Multiple kernel learning algorithms. J Mach Learn Res 12:2211–2268MathSciNet Gonen M, Alpaydin E (2011) Multiple kernel learning algorithms. J Mach Learn Res 12:2211–2268MathSciNet
14.
15.
go back to reference Harrell F (2001) Regression modeling strategies. With applications to linear models, logistic regression and survival analysis. Springer, New YorkMATH Harrell F (2001) Regression modeling strategies. With applications to linear models, logistic regression and survival analysis. Springer, New YorkMATH
16.
go back to reference Hu M, Chen Y, Kwok J (2009) Building sparse multiple-kernel SVM classifiers. IEEE Trans. Neural Netw 20(5):827–839CrossRef Hu M, Chen Y, Kwok J (2009) Building sparse multiple-kernel SVM classifiers. IEEE Trans. Neural Netw 20(5):827–839CrossRef
17.
go back to reference Keogh E, Lin J, Truppel W (2005) Clustering of time series subsequences is meaningless: implications for past and future research. Knowl Inf Syst KAIS 2:154–177CrossRef Keogh E, Lin J, Truppel W (2005) Clustering of time series subsequences is meaningless: implications for past and future research. Knowl Inf Syst KAIS 2:154–177CrossRef
19.
go back to reference Li J, Barron A (2000) Mixture density estimation. In: Advances in neural information processing systems, Vol 12. The MIT Press, Cambridge, pp 279–285 Li J, Barron A (2000) Mixture density estimation. In: Advances in neural information processing systems, Vol 12. The MIT Press, Cambridge, pp 279–285
20.
23.
go back to reference Pelekis N, Kopanakis I, Kotsifakos E, Frentzos E, Theodoridis Y (2011) Clustering uncertain trajectories. Knowl Inf Syst KAIS 28:117–147CrossRef Pelekis N, Kopanakis I, Kotsifakos E, Frentzos E, Theodoridis Y (2011) Clustering uncertain trajectories. Knowl Inf Syst KAIS 28:117–147CrossRef
24.
go back to reference Rakthanmanon T, Campana B et al (2013) Addressing big data time series: mining trillions of time series subsequences under dynamic time warping. ACM Trans Knowl Discov Data 7(3):1–31 Rakthanmanon T, Campana B et al (2013) Addressing big data time series: mining trillions of time series subsequences under dynamic time warping. ACM Trans Knowl Discov Data 7(3):1–31
25.
go back to reference Schmolck A, Everson R (2007) Smooth relevance vector machine: a smoothness prior extension of the RVM. Mach Learn 68(2):107–135CrossRef Schmolck A, Everson R (2007) Smooth relevance vector machine: a smoothness prior extension of the RVM. Mach Learn 68(2):107–135CrossRef
27.
go back to reference Seeger M (2008) Bayesian inference and optimal design for the sparse linear model. J Mach Learn Res 9:759–813MATHMathSciNet Seeger M (2008) Bayesian inference and optimal design for the sparse linear model. J Mach Learn Res 9:759–813MATHMathSciNet
28.
go back to reference Shi J, Wang B (2008) Curve prediction and clustering with mixtures of Gaussian process functional regression models. Stat Comput 18:267–283CrossRefMathSciNet Shi J, Wang B (2008) Curve prediction and clustering with mixtures of Gaussian process functional regression models. Stat Comput 18:267–283CrossRefMathSciNet
29.
go back to reference Smyth P (1997) Clustering sequences with hidden Markov models. In: Advances in neural information processing systems, pp 648–654 Smyth P (1997) Clustering sequences with hidden Markov models. In: Advances in neural information processing systems, pp 648–654
30.
go back to reference Tipping M (2001) Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 1:211–244MATHMathSciNet Tipping M (2001) Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 1:211–244MATHMathSciNet
31.
go back to reference Ueda N, Nakano R, Ghahramani Z, Hinton G (2000) SMEM algorithm for mixture models. Neural Comput 12(9):2109–2128CrossRef Ueda N, Nakano R, Ghahramani Z, Hinton G (2000) SMEM algorithm for mixture models. Neural Comput 12(9):2109–2128CrossRef
32.
go back to reference Vlassis N, Likas A (2001) A greedy EM algorithm for Gaussian mixture learning. Neural Process Lett 15:77–87CrossRef Vlassis N, Likas A (2001) A greedy EM algorithm for Gaussian mixture learning. Neural Process Lett 15:77–87CrossRef
34.
go back to reference Williams B, Toussaint M, Storkey A (2008) Modelling motion primitives and their timing in biologically executed movements. In: Advances in neural information processing systems, vol 15, pp 1547–1554 Williams B, Toussaint M, Storkey A (2008) Modelling motion primitives and their timing in biologically executed movements. In: Advances in neural information processing systems, vol 15, pp 1547–1554
35.
go back to reference Williams O, Blake A, Cipolla R (2005) Sparse Bayesian learning for efficient visual tracking. IEEE Trans. Pattern Anal Mach Intell 27(8):1292–1304CrossRef Williams O, Blake A, Cipolla R (2005) Sparse Bayesian learning for efficient visual tracking. IEEE Trans. Pattern Anal Mach Intell 27(8):1292–1304CrossRef
36.
go back to reference Xiong Y, Yeung D-Y (2002) Mixtures of ARMA models for model-based time series clustering. In: IEEE international conference on data mining (ICDM), pp 717–720 Xiong Y, Yeung D-Y (2002) Mixtures of ARMA models for model-based time series clustering. In: IEEE international conference on data mining (ICDM), pp 717–720
37.
go back to reference Zhong M (2006) A variational method for learning sparse Bayesian regression. Neurocomputing 69:2351–2355CrossRef Zhong M (2006) A variational method for learning sparse Bayesian regression. Neurocomputing 69:2351–2355CrossRef
Metadata
Title
Sparse regression mixture modeling with the multi-kernel relevance vector machine
Authors
Konstantinos Blekas
Aristidis Likas
Publication date
01-05-2014
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2014
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-013-0704-0

Other articles of this Issue 2/2014

Knowledge and Information Systems 2/2014 Go to the issue

Premium Partner