About this book

This book summarizes the state of the art in tree-based methods for insurance: regression trees, random forests and boosting methods. It also presents the tools needed to assess the predictive performance of tree-based models. Actuaries need these advanced analytical tools to turn the massive data sets now at their disposal into opportunities.

The exposition alternates between methodological aspects and numerical illustrations or case studies. All numerical illustrations are performed with the R statistical software. The technical prerequisites are kept at a reasonable level in order to reach a broad readership. In particular, master's students in actuarial sciences and actuaries wishing to update their skills in machine learning will find the book useful.

This is the second of three volumes entitled Effective Statistical Learning Methods for Actuaries. Written by actuaries for actuaries, this series offers a comprehensive overview of insurance data analytics with applications to P&C, life and health insurance.

Table of contents

Frontmatter

Chapter 1. Introduction

Abstract
Insurance companies cover risks (that is, random financial losses) by collecting premiums. Premiums are generally paid in advance (hence their name). The pure premium is the amount collected by the insurance company, to be redistributed as benefits among policyholders and third parties in execution of the contract, with neither loss nor profit. Under the conditions of validity of the law of large numbers, the pure premium is the expected amount of compensation to be paid by the insurer (sometimes discounted to policy issue in the case of long-term liabilities).
Michel Denuit, Donatien Hainaut, Julien Trufin
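
The link between the pure premium and the law of large numbers described in the abstract of Chapter 1 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not material from the book (which works in R): the claim probability, the exponential severity, the "at most one claim per policy" simplification and the function name `average_loss_per_policy` are all hypothetical choices made for illustration.

```python
import random

def average_loss_per_policy(n_policies, claim_prob, mean_severity, seed=0):
    """Simulate a portfolio and return the observed average loss per policy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_policies):
        # Assumption: each policy produces at most one claim.
        if rng.random() < claim_prob:
            # Assumption: claim amounts are exponential with the given mean.
            total += rng.expovariate(1.0 / mean_severity)
    return total / n_policies

CLAIM_PROB = 0.1        # hypothetical claim probability
MEAN_SEVERITY = 2000.0  # hypothetical mean claim amount
pure_premium = CLAIM_PROB * MEAN_SEVERITY  # expected loss per policy

for n in (100, 10_000, 200_000):
    observed = average_loss_per_policy(n, CLAIM_PROB, MEAN_SEVERITY)
    print(f"{n:>7} policies: average loss {observed:8.2f}, pure premium {pure_premium:.2f}")
```

As the portfolio grows, the observed average loss per policy should settle close to the expected loss, which is why the expectation is the natural candidate for a premium that yields neither loss nor profit.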

Chapter 2. Performance Evaluation

Abstract
In actuarial pricing, the objective is to evaluate the pure premium as accurately as possible. The target is thus the conditional expectation \(\mu (\textit{\textbf{X}})=\text {E}[Y|\textit{\textbf{X}}]\) of the response \(Y\) (for instance, a claim number or a claim amount) given the available information \(\textit{\textbf{X}}\).
Michel Denuit, Donatien Hainaut, Julien Trufin

Chapter 3. Regression Trees

Abstract
In this chapter, we present the regression trees introduced by Breiman et al. (1984). Regression trees are at the core of this second volume.
Michel Denuit, Donatien Hainaut, Julien Trufin

Chapter 4. Bagging Trees and Random Forests

Abstract
Two ensemble methods are considered in this chapter, namely bagging trees and random forests. One issue with regression trees is their high variance: the prediction \(\widehat{\mu }_\mathcal {D}(\textit{\textbf{x}})\) varies considerably over the trees trained on all possible training sets \(\mathcal {D}\). Bagging trees and random forests aim to reduce this variance without altering the bias too much.
Michel Denuit, Donatien Hainaut, Julien Trufin
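
The variance issue mentioned in the abstract of Chapter 4 can be explored with a minimal simulation. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses one-split trees (stumps) on a single feature, a hypothetical data-generating process \(y = x + \text{noise}\), and plain bootstrap averaging (bagging, without the random feature selection of random forests). All function names are made up for this example.

```python
import random
import statistics

def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimizing the squared error."""
    pairs = sorted(zip(xs, ys))
    overall = statistics.fmean(ys)
    # Fallback when no split is possible: predict the overall mean everywhere.
    best_sse, best = float("inf"), (float("inf"), overall, overall)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        m_left, m_right = statistics.fmean(left), statistics.fmean(right)
        sse = sum((y - m_left) ** 2 for y in left) + sum((y - m_right) ** 2 for y in right)
        if sse < best_sse:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_sse, best = sse, (threshold, m_left, m_right)
    return best

def predict_stump(stump, x):
    threshold, m_left, m_right = stump
    return m_left if x < threshold else m_right

def bagged_predict(xs, ys, x, n_trees, rng):
    """Average the predictions of stumps fitted on bootstrap samples."""
    n = len(xs)
    preds = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stump = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(predict_stump(stump, x))
    return statistics.fmean(preds)

def draw_training_set(n, rng):
    """Hypothetical data: y = x + Gaussian noise, x uniform on [0, 1]."""
    xs = [rng.random() for _ in range(n)]
    ys = [x + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys

rng = random.Random(42)
x0 = 0.5  # point at which predictions are compared
single, bagged = [], []
for _ in range(50):  # 50 independent training sets D
    xs, ys = draw_training_set(30, rng)
    single.append(predict_stump(fit_stump(xs, ys), x0))
    bagged.append(bagged_predict(xs, ys, x0, n_trees=15, rng=rng))

print("variance over training sets, single stump: ", statistics.variance(single))
print("variance over training sets, bagged stumps:", statistics.variance(bagged))
```

One would typically expect the bagged predictions to vary less across training sets than those of a single stump, since averaging smooths the instability of the split point; the exact figures depend on the seed and the assumed data-generating process.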

Chapter 5. Boosting Trees

Abstract
Bagging trees and random forests base their predictions on an ensemble of trees. In this chapter, we consider another training procedure based on an ensemble of trees, called boosting trees. However, the way the trees are produced and combined differs between random forests (and hence bagging trees) and boosting trees.
Michel Denuit, Donatien Hainaut, Julien Trufin
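
To make the contrast drawn in the abstract of Chapter 5 concrete: in bagging, trees are fitted independently on bootstrap samples and averaged, whereas in boosting each tree is fitted to what the previous trees have left unexplained and the trees are summed. The sketch below assumes the simplest setting (squared-error loss, one-split trees, a fixed shrinkage factor) and is not the book's algorithm; the names `boost`, `predict_boosted` and the toy data are hypothetical.

```python
import statistics

def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimizing the squared error."""
    pairs = sorted(zip(xs, ys))
    overall = statistics.fmean(ys)
    best_sse, best = float("inf"), (float("inf"), overall, overall)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        m_left, m_right = statistics.fmean(left), statistics.fmean(right)
        sse = sum((y - m_left) ** 2 for y in left) + sum((y - m_right) ** 2 for y in right)
        if sse < best_sse:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_sse, best = sse, (threshold, m_left, m_right)
    return best

def predict_stump(stump, x):
    threshold, m_left, m_right = stump
    return m_left if x < threshold else m_right

def boost(xs, ys, n_trees=50, shrinkage=0.1):
    """Sequentially fit stumps to the current residuals (squared-error loss)."""
    base = statistics.fmean(ys)  # initial constant prediction
    residuals = [y - base for y in ys]
    stumps = []
    for _ in range(n_trees):
        stump = fit_stump(xs, residuals)  # each tree depends on the previous ones
        stumps.append(stump)
        residuals = [r - shrinkage * predict_stump(stump, x)
                     for x, r in zip(xs, residuals)]
    return base, shrinkage, stumps

def predict_boosted(model, x):
    base, shrinkage, stumps = model
    # Trees are summed (after shrinkage), not averaged as in bagging.
    return base + shrinkage * sum(predict_stump(s, x) for s in stumps)

def mse(ys, preds):
    return statistics.fmean((y - p) ** 2 for y, p in zip(ys, preds))

# Hypothetical toy data: a smooth curve that a single stump cannot capture.
xs = [i / 10 for i in range(11)]
ys = [x ** 2 for x in xs]

model = boost(xs, ys)
mse_constant = mse(ys, [statistics.fmean(ys)] * len(ys))
mse_boosted = mse(ys, [predict_boosted(model, x) for x in xs])
print("training MSE, constant prediction:", mse_constant)
print("training MSE, boosted stumps:     ", mse_boosted)
```

The shrinkage factor slows down learning so that many small corrections are combined; in practice the number of trees and the shrinkage would be tuned on data not used for fitting, since the training error alone keeps decreasing.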

Chapter 6. Other Measures for Model Comparison

Abstract
Actuarial pricing models are generally calibrated so that they minimize the generalization error computed with an appropriate loss function. Model selection is based on the generalization error.
Michel Denuit, Donatien Hainaut, Julien Trufin
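
The model selection principle stated in the abstract of Chapter 6 can be sketched as follows: the generalization error is estimated on observations not used for fitting, and the model with the smallest estimated error is retained. The example assumes the Poisson deviance as loss function, a common choice for claim counts but an assumption here; the validation data and the two sets of predictions are invented for illustration.

```python
import math

def poisson_deviance(y, mu):
    """Mean Poisson deviance of predicted means mu (> 0) for observed counts y."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # By convention, y * log(y / mu) is taken as 0 when y = 0.
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += 2.0 * (log_term - (yi - mi))
    return total / len(y)

# Hypothetical validation set: observed claim counts for eight policies.
y_valid = [0, 1, 0, 2, 0, 0, 1, 0]

# Hypothetical predictions from two competing models on the same policies.
predictions = {
    "model_a": [0.5] * 8,                                   # constant frequency
    "model_b": [0.2, 0.9, 0.3, 1.4, 0.2, 0.3, 0.8, 0.2],    # risk-differentiated
}

deviances = {name: poisson_deviance(y_valid, mu) for name, mu in predictions.items()}
best = min(deviances, key=deviances.get)

for name, dev in deviances.items():
    print(f"{name}: validation deviance {dev:.4f}")
print("selected:", best)
```

With so few observations the comparison is of course only illustrative; a realistic estimate of the generalization error would rely on a large validation sample or on cross-validation.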