Advances in Data Analysis and Classification

13-12-2022 | Regular Article

LASSO regularization within the LocalGLMnet architecture

Authors: Ronald Richman, Mario V. Wüthrich

Published in: Advances in Data Analysis and Classification | Issue 4/2023

Abstract

Deep learning models have been very successful in machine learning applications, often outperforming classical statistical models such as linear regression models or generalized linear models. On the other hand, deep learning models are often criticized for being neither explainable nor amenable to variable selection. There are two ways of dealing with this problem: either we use post-hoc model interpretability methods, or we design specific deep learning architectures that allow for easier interpretation and explanation. This paper builds on our previous work on the LocalGLMnet architecture, which provides an interpretable deep learning architecture. In the present paper, we show how group LASSO regularization (and other regularization schemes) can be implemented within the LocalGLMnet architecture so that we obtain feature sparsity for variable selection. We benchmark our approach against the recently developed LassoNet of Lemhadri et al. (LassoNet: a neural network with feature sparsity. J Mach Learn Res 22:1–29, 2021).
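To make the idea of group-wise feature sparsity concrete, the following is a minimal Python sketch of group soft-thresholding, the proximal operator of a group LASSO penalty (proximal operators in general are treated by Parikh and Boyd 2013). It is not the authors' implementation and does not involve the LocalGLMnet itself; the function name `group_soft_threshold`, the toy coefficient vector and the grouping are assumptions made purely for illustration.

```python
import numpy as np


def group_soft_threshold(beta, groups, eta):
    """Proximal operator of the penalty eta * sum_g ||beta_g||_2.

    beta   : 1-d array of coefficients
    groups : list of index lists, one per (non-overlapping) feature group
    eta    : non-negative regularization parameter
    """
    beta = np.asarray(beta, dtype=float)
    out = np.zeros_like(beta)
    for idx in groups:
        block = beta[idx]
        norm = np.linalg.norm(block)
        # A group whose Euclidean norm is at most eta is set to zero as a whole;
        # otherwise the whole group is shrunk towards zero by a common factor.
        if norm > eta:
            out[idx] = (1.0 - eta / norm) * block
    return out


# Hypothetical example: three feature groups, the second one is weak.
beta = np.array([3.0, 4.0, 0.3, -0.4, 2.0])
groups = [[0, 1], [2, 3], [4]]
shrunk = group_soft_threshold(beta, groups, eta=1.0)
print(shrunk)
```

In this toy example the second group has Euclidean norm 0.5, which is below the threshold, so both of its components are set exactly to zero, while the other groups are only shrunk. This all-or-nothing behaviour per group is what makes a group penalty suitable for variable selection when a feature is represented by several components, for example a dummy-coded categorical variable.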

We call our proposal LASSO regularization of the LocalGLMnet. Whereas the initial proposal of the LASSO was indeed for the linear regression model, this has been extended to GLMs, see Sect. 3.4 in Hastie et al. (2015).
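For orientation, the LASSO-regularized GLM mentioned above can be written in a generic form as follows. The notation (log-likelihood contributions $\ell$, intercept $\beta_0$, regression parameter $\beta$, regularization parameter $\eta \ge 0$, sample size $n$) is assumed here for illustration and is not copied from the paper.

```latex
\[
(\hat{\beta}_0, \hat{\beta}) \in \underset{\beta_0,\, \beta}{\arg\min}\;
-\frac{1}{n} \sum_{i=1}^{n} \ell\bigl(Y_i;\, \beta_0 + \langle \beta, x_i \rangle\bigr)
+ \eta \, \lVert \beta \rVert_1 .
\]
```

For a Gaussian log-likelihood this reduces, up to constants and a rescaling of $\eta$, to the linear regression LASSO of Tibshirani (1996). Replacing the $L^1$-norm by a sum of Euclidean norms over groups of components of $\beta$ gives the group LASSO of Yuan and Lin (2006).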

The dataset is available at http://lib.stat.cmu.edu/datasets/boston, and the code for this example is available on GitHub at https://github.com/RonRichman/Regularized-LocalGLMnet.

The dataset is available at this link: http://www2.math.uconn.edu/~valdez/telematics_syn-032021.csv

Note that, due to privacy concerns, these 100,000 records were generated synthetically based on real data; see So et al. (2021) for a detailed description.

The grouped version of the model was applied in accordance with the instructions at https://github.com/lasso-net/lassonet/issues/7.

Agarwal R, Frosst N, Zhang X, Caruana R, Hinton GE (2020) Neural additive models: interpretable machine learning with neural nets. arXiv:2004.13912v1

Apley DW, Zhu J (2020) Visualizing the effects of predictor variables in black box supervised learning models. J R Stat Soc Ser B 82(4):1059–1086

Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232

Gneiting T (2011) Making and evaluating point forecasts. J Am Stat Assoc 106(494):746–762

Gneiting T, Raftery AE (2007) Strictly proper scoring rules, prediction, and estimation. J Am Stat Assoc 102(477):359–378

Harrison D, Rubinfeld DL (1978) Hedonic prices and the demand for clean air. J Environ Econ Manag 5:81–102

Hastie T, Tibshirani R, Wainwright M (2015) Statistical learning with sparsity: the Lasso and generalizations. CRC Press

Hoerl A, Kennard R (1970) Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12:55–67

Lee JD, Sun DL, Sun Y, Taylor JE (2016) Exact post-selection inference, with application to the LASSO. Ann Stat 44(3):907–927

Lemhadri I, Ruan F, Abraham L, Tibshirani R (2021) LassoNet: a neural network with feature sparsity. J Mach Learn Res 22:1–29

Lindholm M, Richman R, Tsanakas A, Wüthrich MV (2022) Discrimination-free insurance pricing. ASTIN Bull J IAA 52:55–89

Merity S, McCann B, Socher R (2017) Revisiting activation regularization for language RNNs. arXiv:1708.01009v1

Merz M, Richman R, Tsanakas A, Wüthrich MV (2022) Interpreting deep learning models with marginal attribution by conditioning on quantiles. Data Min Knowl Discov 36:1335–1370

Oelker M-R, Tutz G (2017) A uniform framework for the combination of penalties in generalized structured models. Adv Data Anal Classif 11:97–120

Parikh N, Boyd S (2013) Proximal algorithms. Found Trends Optim 1(3):123–231

Richman R (2021) Mind the gap—safely incorporating deep learning models into the actuarial toolkit. SSRN Manuscript ID 3857693

Richman R, Wüthrich MV (2022) LocalGLMnet: interpretable deep learning for tabular data. Scand Actuar J, in press

Tibshirani R (1996) Regression shrinkage and selection via the LASSO. J R Stat Soc Ser B Stat Methodol 58:267–288

Tibshirani R, Saunders M, Rosset S, Zhu J, Knight K (2005) Sparsity and smoothness via the fused LASSO. J R Stat Soc Ser B Stat Methodol 67:91–108

Tikhonov AN (1943) On the stability of inverse problems. Dokl Akad Nauk SSSR 39(5):195–198

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. arXiv:1706.03762v5

So B, Boucher JP, Valdez EA (2021) Synthetic dataset generation of driver telematics. Risks 9(4):58

Vaughan J, Sudjianto A, Brahimi E, Chen J, Nair VN (2018) Explainable neural networks based on additive index models. arXiv:1806.01933v1

Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B Stat Methodol 68:49–67

Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B Stat Methodol 67:301–320

Title: LASSO regularization within the LocalGLMnet architecture
Authors: Ronald Richman, Mario V. Wüthrich
Publication date: 13-12-2022
Publisher: Springer Berlin Heidelberg
Published in: Advances in Data Analysis and Classification / Issue 4/2023
Print ISSN: 1862-5347
Electronic ISSN: 1862-5355
DOI: https://doi.org/10.1007/s11634-022-00529-z
