Published in: European Actuarial Journal 2/2022

29 August 2021 | Original Research Paper

Loss amount prediction from textual data using a double GLM with shrinkage and selection

Written by: Scott Manski, Kaixu Yang, Gee Y. Lee, Tapabrata Maiti


Abstract

The Gamma model is widely used in a variety of fields, including actuarial science, where it has important applications in insurance loss prediction. Meanwhile, high-dimensional models and their applications have become more common in the statistics literature in recent years. The availability of such high-dimensional models has allowed the analysis of non-traditional data, including data that contain textual descriptions of the response. In the models used in such applications, the dispersion may be related to a set of covariates rather than being a single fixed value for the entire population. Following this approach, we incorporate a group Lasso type penalty in both the dispersion and the mean parameterization of a Gamma model, and illustrate its use in a predictive analytics application in actuarial science. In particular, we apply the method to an insurance claim prediction problem involving textual data analysis methods. Simulations are conducted to illustrate the variable selection and model fitting performance of our method.
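
The kind of objective the abstract describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the log links for the mean and the dispersion, the square-root group-size weights, the averaging of the likelihood over observations, and all function and parameter names (gamma_nll, group_lasso_penalty, group_soft_threshold, penalized_objective, lam_mean, lam_disp) are choices made here for illustration only. The abstract confirms only that a group Lasso type penalty is applied to both the mean and the dispersion part of a Gamma model.

```python
import math


def gamma_nll(y, mu, phi):
    """Negative log-likelihood of Gamma responses with mean mu[i] and dispersion phi[i]."""
    total = 0.0
    for yi, mi, di in zip(y, mu, phi):
        shape = 1.0 / di      # shape parameter k = 1 / phi
        scale = mi * di       # scale parameter, so that the mean k * scale equals mu
        log_density = ((shape - 1.0) * math.log(yi) - yi / scale
                       - shape * math.log(scale) - math.lgamma(shape))
        total -= log_density
    return total


def group_lasso_penalty(coef, groups, lam):
    """lam * sum over groups of sqrt(group size) * Euclidean norm of the group.

    The sqrt(group size) weight is a common convention, assumed here.
    """
    penalty = 0.0
    for g in groups:
        norm = math.sqrt(sum(coef[j] ** 2 for j in g))
        penalty += math.sqrt(len(g)) * norm
    return lam * penalty


def group_soft_threshold(v, t):
    """Proximal operator of t * ||v||_2: shrinks a whole group, or zeroes it out."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= t:
        return [0.0] * len(v)
    factor = 1.0 - t / norm
    return [factor * x for x in v]


def linear_predictor(rows, coef):
    """Row-wise dot product of a design matrix (list of rows) with a coefficient list."""
    return [sum(x * c for x, c in zip(row, coef)) for row in rows]


def penalized_objective(y, X, Z, beta, gamma, groups_mean, groups_disp,
                        lam_mean, lam_disp):
    """Average Gamma negative log-likelihood plus group penalties on both submodels.

    Assumed links (hypothetical choice): log(mu) = X beta and log(phi) = Z gamma.
    """
    mu = [math.exp(eta) for eta in linear_predictor(X, beta)]
    phi = [math.exp(eta) for eta in linear_predictor(Z, gamma)]
    return (gamma_nll(y, mu, phi) / len(y)
            + group_lasso_penalty(beta, groups_mean, lam_mean)
            + group_lasso_penalty(gamma, groups_disp, lam_disp))
```

In a textual-data setting, a group could hold all coordinates derived from one word or one embedding block, so that group_soft_threshold keeps or drops that block as a unit. The two tuning parameters lam_mean and lam_disp let the mean and dispersion submodels be shrunk to different degrees.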


Metadata
Title
Loss amount prediction from textual data using a double GLM with shrinkage and selection
Authors
Scott Manski
Kaixu Yang
Gee Y. Lee
Tapabrata Maiti
Publication date
29 August 2021
Publisher
Springer Berlin Heidelberg
Published in
European Actuarial Journal / Issue 2/2022
Print ISSN: 2190-9733
Electronic ISSN: 2190-9741
DOI
https://doi.org/10.1007/s13385-021-00294-x
