Published in: Data Mining and Knowledge Discovery 4/2023

16-04-2023

Robust and sparse multinomial regression in high dimensions

Authors: Fatma Sevinç Kurnaz, Peter Filzmoser


Abstract

A robust and sparse estimator for multinomial regression is proposed for high-dimensional data. Robustness is achieved by trimming observations, and sparsity is obtained through the elastic net penalty. In contrast to multi-group classifiers based on dimension reduction, this model is appealing in terms of interpretation: coefficients are estimated individually for every group, and the sparsity of the coefficients is group-specific. Simulation studies compare its performance with the non-robust version of the multinomial regression estimator, and real data examples underline the usefulness of the robust estimator, particularly for result interpretation and model diagnostics.
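
To make the combination of trimming and elastic net penalization concrete, the following minimal sketch evaluates a trimmed, penalized multinomial negative log-likelihood for a fixed set of coefficients. It is an illustration under stated assumptions, not the authors' algorithm: the function names, the omission of intercepts, and the scaling of the penalty by the subset size h are all assumptions made here, and a real estimator would also have to search over subsets and coefficients rather than only evaluate the criterion.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of linear scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def neg_log_lik(x, y, coefs):
    """Negative log-likelihood of one observation.

    coefs holds one coefficient vector per group (no intercept; assumption).
    """
    scores = [sum(b_j * x_j for b_j, x_j in zip(b, x)) for b in coefs]
    return -math.log(softmax(scores)[y])


def elastic_net_penalty(coefs, alpha):
    """alpha * L1 norm + (1 - alpha) / 2 * squared L2 norm, over all groups."""
    l1 = sum(abs(b) for row in coefs for b in row)
    l2 = sum(b * b for row in coefs for b in row)
    return alpha * l1 + 0.5 * (1.0 - alpha) * l2


def trimmed_objective(X, y, coefs, h, lam, alpha):
    """Sum of the h smallest losses plus an elastic net penalty.

    Scaling the penalty by h is an assumed convention, not taken from the paper.
    Returns the objective value and the indices of the retained observations.
    """
    losses = [neg_log_lik(x_i, y_i, coefs) for x_i, y_i in zip(X, y)]
    kept = sorted(range(len(losses)), key=losses.__getitem__)[:h]
    value = sum(losses[i] for i in kept) + h * lam * elastic_net_penalty(coefs, alpha)
    return value, sorted(kept)


# Toy data: the last observation carries a label that contradicts the model.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [0, 1, 0, 0]
coefs = [[2.0, 0.0], [0.0, 2.0]]  # hypothetical coefficients for two groups

value, kept = trimmed_objective(X, y, coefs, h=3, lam=0.1, alpha=0.5)
print(kept, value)
```

With h = 3, the observation with the largest loss (the contradictory label) is the one left out of the sum, which is the sense in which trimming limits the influence of outlying observations.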

Metadata
Title
Robust and sparse multinomial regression in high dimensions
Authors
Fatma Sevinç Kurnaz
Peter Filzmoser
Publication date
16-04-2023
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 4/2023
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-023-00936-6
