Published in: Pattern Analysis and Applications 1/2011

1 February 2011 | Theoretical Advances

A sparse multinomial probit model for classification

Authors: Yunfei Ding, Robert F. Harrison


Abstract

A recent development in penalized probit modelling using a hierarchical Bayesian approach has led to a sparse binomial (two-class) probit classifier that can be trained via an EM algorithm. A key advantage of the formulation is that no tuning of hyperparameters relating to the penalty is needed, which simplifies the model selection process. The resulting model demonstrates excellent classification performance and a high degree of sparsity when used as a kernel machine. It is, however, restricted to the binary classification problem and can only be used in the multinomial situation via a one-against-all or one-against-many strategy. To overcome this, we apply the idea to the multinomial probit model. This leads to a direct multi-class classification approach and is shown to give a sparse solution with accuracy and sparsity comparable with the current state of the art. Comparative numerical benchmark examples are used to demonstrate the method.


Footnotes
1
This method, adopted in recent versions of the Matlab Statistics Toolbox for dimensions above four, makes use of a degree of Monte Carlo simulation and so might be considered a hybrid approach.
 
2
Vec is the operation that stacks the columns of a matrix one upon the other from left to right.
 
3
Except in the case of Thyroid 2, for reasons of runtime, owing to its large size.
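
The Vec operation defined in footnote 2 can be illustrated with a minimal sketch. The function name `vec` and the list-of-rows matrix representation are assumptions made for this illustration and are not taken from the paper.

```python
def vec(matrix):
    """Stack the columns of a matrix one upon the other, from left to right.

    `matrix` is assumed to be a list of equal-length rows (an illustrative
    representation, not the paper's notation).
    """
    if not matrix:
        return []
    n_cols = len(matrix[0])
    if any(len(row) != n_cols for row in matrix):
        raise ValueError("all rows must have the same length")
    # Outer loop over columns, inner loop over rows: column 0 first,
    # then column 1, and so on.
    return [row[j] for j in range(n_cols) for row in matrix]


# A 2 x 3 matrix with columns (1, 4), (2, 5), (3, 6).
A = [[1, 2, 3],
     [4, 5, 6]]
stacked = vec(A)  # [1, 4, 2, 5, 3, 6]
```

With NumPy arrays, the same column-major stacking should be obtainable with `A.reshape(-1, order="F")`.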
 
Metadata
Title
A sparse multinomial probit model for classification
Authors
Yunfei Ding
Robert F. Harrison
Publication date
1 February 2011
Publisher
Springer-Verlag
Published in
Pattern Analysis and Applications / Issue 1/2011
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-010-0177-7
