Published in: Advances in Data Analysis and Classification 1/2017

29.04.2015 | Regular Article

A uniform framework for the combination of penalties in generalized structured models

Written by: Margret-Ruth Oelker, Gerhard Tutz

Abstract

Penalized estimation has become an established tool for regularization and model selection in regression models. A variety of penalties with specific features is available, and effective algorithms for specific penalties have been proposed. Little is available, however, for fitting models that combine different penalties. When modeling the rent data of Munich, as in our application, various types of predictors call for a combination of a Ridge, a group Lasso and a Lasso-type penalty within one model. We propose to approximate penalties that are (semi-)norms of scalar linear transformations of the coefficient vector in generalized structured models, so that penalties of various kinds can be combined in one model. The approach is very general: the Lasso, the fused Lasso, the Ridge, the smoothly clipped absolute deviation penalty, the elastic net and many more penalties are embedded. The computation is based on conventional penalized iteratively re-weighted least squares algorithms and is hence easy to implement. New penalties can be incorporated quickly. The approach is extended to penalties with vector-based arguments. There are several possibilities for choosing the penalty parameter(s). A software implementation is available. Some illustrative examples show promising results.
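To make the idea of combining penalties inside one penalized least squares loop concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian response, so that penalized iteratively re-weighted least squares reduces to repeatedly solving a penalized normal equation, and it assumes that the absolute value in a Lasso-type term is replaced by the smooth surrogate sqrt(xi^2 + c) with a small constant c, which is one common choice and is not stated in the abstract. The function name `fit_combined_penalty`, its parameters and the toy data are hypothetical.

```python
import numpy as np


def fit_combined_penalty(X, y, A_l1, lam_l1, A_l2, lam_l2,
                         c=1e-8, n_iter=200, tol=1e-10):
    """Approximately minimize
        0.5 * ||y - X b||^2
        + lam_l1 * sum_l |a_l' b|      (rows a_l of A_l1, Lasso-type)
        + lam_l2 * sum_m (a_m' b)^2    (rows a_m of A_l2, Ridge-type)
    by replacing |xi| with sqrt(xi^2 + c) (an assumption of this sketch)
    and iterating penalized least squares updates.
    """
    XtX = X.T @ X
    Xty = X.T @ y
    p = X.shape[1]
    # Start from a slightly stabilised least squares fit.
    beta = np.linalg.solve(XtX + 1e-6 * np.eye(p), Xty)
    # The Ridge-type term is already quadratic, so its matrix is fixed.
    ridge_part = 2.0 * lam_l2 * (A_l2.T @ A_l2)
    for _ in range(n_iter):
        xi = A_l1 @ beta
        # Quadratic approximation weights of the Lasso-type term.
        w = lam_l1 / np.sqrt(xi ** 2 + c)
        lasso_part = A_l1.T @ (w[:, None] * A_l1)
        beta_new = np.linalg.solve(XtX + lasso_part + ridge_part, Xty)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Toy example: identity design, Lasso-type penalty on the first two
# coefficients and a Ridge-type penalty on the third.
X = np.eye(3)
y = np.array([3.0, 0.5, -2.0])
A_l1 = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])
A_l2 = np.array([[0.0, 0.0, 1.0]])
beta_hat = fit_combined_penalty(X, y, A_l1, lam_l1=1.0, A_l2=A_l2, lam_l2=0.5)
```

With the identity design the problem separates by coefficient, so the result can be checked by hand: soft thresholding gives roughly 2 and 0 for the two Lasso-penalized coefficients, and the Ridge-penalized coefficient is y/2 = -1. Other rows in `A_l1`, such as differences of adjacent coefficients, would give a fused-Lasso-type term in the same loop.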

Metadata
Title
A uniform framework for the combination of penalties in generalized structured models
Written by
Margret-Ruth Oelker
Gerhard Tutz
Publication date
29.04.2015
Publisher
Springer Berlin Heidelberg
Published in
Advances in Data Analysis and Classification / Issue 1/2017
Print ISSN: 1862-5347
Electronic ISSN: 1862-5355
DOI
https://doi.org/10.1007/s11634-015-0205-y
