Mortality density forecasts: An analysis of six stochastic mortality models

doi:10.1016/j.insmatheco.2010.12.005

Insurance: Mathematics and Economics

Volume 48, Issue 3, May 2011, Pages 355-367


Abstract

This paper develops a framework for producing forecasts of future mortality rates. We discuss the suitability of six stochastic mortality models for forecasting future mortality and estimating the density of mortality rates at different ages. In particular, the models are assessed individually with reference to the following qualitative criteria that focus on the plausibility of their forecasts: biological reasonableness; the plausibility of predicted levels of uncertainty in forecasts at different ages; and the robustness of the forecasts relative to the sample period used to fit the model. An important, though unsurprising, conclusion is that a good fit to historical data does not guarantee sensible forecasts. We also discuss the issue of model risk, common to many modelling situations in demography and elsewhere. We find that even for those models satisfying our qualitative criteria, there are significant differences among central forecasts of mortality rates at different ages and among the distributions surrounding those central forecasts.

Introduction

The last twenty years have seen a growing range of models for forecasting mortality. Early work on stochastic models by McNown and Rogers (1989) and Lee and Carter (1992) has been followed by:

• developments on the statistical foundations by, for example, Lee and Miller (2001), Brouhns et al. (2002), Booth et al. (2002a), Czado et al. (2005), Delwarde et al. (2007), and Li et al. (2009); and
• the development of new stochastic models by Booth et al. (2002a, 2002b, 2005), Cairns et al. (2006b) (CBD), Renshaw and Haberman (2006), Hyndman and Ullah (2007), Cairns et al. (2009), Plat (2009) and Debonneuil (2010).

These stochastic models vary significantly according to a number of key elements: number of sources of randomness driving mortality improvements at different ages; assumptions of smoothness in the age and period dimensions; inclusion or not of cohort effects; estimation method.

A number of studies have sought to draw more formal comparisons between several of these models. Some limit themselves to comparing variants of the Lee–Carter model (Lee and Miller, 2001; Booth et al., 2002a, 2002b, 2005). Hyndman and Ullah (2007) compare the out-of-sample forecasting performance of the Lee–Carter model and its Lee–Miller and Booth–Maindonald–Smith variants with a new class of multifactor models. CMI (2005, 2006, 2007) compare the Lee–Carter, Renshaw and Haberman and $P$-splines models. These types of analysis have been extended to a wider range of models with substantially different characteristics by the present authors; this paper is one part of this endeavour.

Cairns et al. (2009) focused on quantitative and qualitative comparisons of eight stochastic mortality models (see Table 1 in Section 2), based on their general characteristics and ability to explain historical patterns of mortality. The criteria employed included: quality of fit, as measured by the Bayes information criterion (BIC); ease of implementation; parsimony; transparency; incorporation of cohort effects; ability to produce a non-trivial correlation structure between ages; robustness of parameter estimates relative to the period of data employed.
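As a rough illustration of the quality-of-fit criterion, the sketch below computes a BIC that penalises the maximised log-likelihood by the number of parameters. The sign convention (a larger value indicates a better model) is an assumption; other texts define the BIC so that smaller is better. The function name `bic` and all numerical inputs are hypothetical, apart from the observation count of 1320, which is simply 30 ages (60–89) times 44 years (1961–2004).

```python
import math

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """BIC in an assumed 'larger is better' convention: l - 0.5 * k * log(N)."""
    return log_likelihood - 0.5 * n_params * math.log(n_obs)

# Hypothetical log-likelihoods and parameter counts, for illustration only.
n_obs = 30 * 44  # ages 60-89 observed over 1961-2004
simple = bic(log_likelihood=-9000.0, n_params=100, n_obs=n_obs)
richer = bic(log_likelihood=-8700.0, n_params=200, n_obs=n_obs)

# The richer model fits better, but here not by enough to offset its penalty.
print(f"simple: {simple:.1f}, richer: {richer:.1f}")
```

Under this convention, adding parameters raises the BIC only when the gain in log-likelihood exceeds the extra penalty, which is why the criterion rewards parsimony as well as fit.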

Complementing this, Dowd et al. (2010a, 2010b) carry out a range of formal, out-of-sample backtesting and goodness-of-fit tests using mortality data for English and Welsh males. They find that some models fare better under some criteria than others, but that no single model can claim superiority under all the criteria considered. In any event, different patterns of mortality improvements in different countries mean that models that are best for one country might not be as suitable for another. Finally, the present paper focuses on the ex ante plausibility and robustness of the forecasts produced by the different models, that is, on ex ante qualitative aspects, whereas the previous works (Cairns et al., 2009; Dowd et al., 2010a, 2010b) focus on ex post quantitative aspects.

Building on the analyses of historical data of Cairns et al. (2009) and Dowd et al. (2010a, 2010b), the present paper focuses on ex ante qualitative aspects of mortality forecasts and the distribution of results around central forecasts. Specifically, we introduce a number of qualitative criteria that focus on the plausibility of forecasts made using different models.

Often in this paper, we will refer to the concept of biological reasonableness (which was first proposed in Cairns et al., 2006a). The concept is not intended to refer to criteria based on hard scientific (biological or medical) facts. Instead, it is intended to cover a wide range of subjective criteria, related to biology, medicine and the environment. What the modeller needs to do is look at the results and ask the question: what mixture of biological factors, medical advances and environmental changes would have to happen to cause this particular set of forecasts? As one example, the upper set of projections in Fig. 4 at age 85 looks rather more unusual than the two lower sets of projections under a particular model. Under the upper scenario, we would have to think of a convincing biological, medical or environmental reason why, with certainty, age 85 mortality rates are going to deteriorate to 1960s levels. If the modeller cannot think of any good reason why this might happen, then she must rule out the model (at least with its current method of calibration) on grounds of biological unreasonableness.

Besides biological reasonableness, we also consider the issue of the plausibility of forecast levels of uncertainty in projections at different ages. The objective here is to judge whether or not the pattern of uncertainty at different ages is consistent with historical levels of variability at different ages: we can sometimes conclude that a particular model is less plausible on the basis of forecast levels of uncertainty.

An important additional issue concerns the robustness of forecasts relative to the choice of sample period and age range. If we make a small change either to the sample period (for example, when we add in the latest mortality data) or to the age range, we would normally expect to see, with a robust model, only modest changes in the forecasts at all ages. Where a model is found to lack robustness with one sample population, there is a danger that it will lack robustness if applied to another sample population and should, therefore, either be used with great care or not used at all.
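The robustness check described here can be sketched as follows. The example fits a random walk with drift to a period effect over two sample windows and compares the resulting central forecasts. The random-walk specification, the synthetic series and the helper names `fit_rw_drift` and `central_forecast` are assumptions made for illustration; they are not the calibration used in the paper.

```python
import numpy as np

def fit_rw_drift(kappa):
    """Estimate the drift and volatility of a random walk with drift."""
    diffs = np.diff(kappa)
    return diffs.mean(), diffs.std(ddof=1)

def central_forecast(kappa, horizon):
    """Central projection: last fitted value plus drift times years ahead."""
    drift, _ = fit_rw_drift(kappa)
    return kappa[-1] + drift * np.arange(1, horizon + 1)

rng = np.random.default_rng(0)
years = np.arange(1961, 2005)
# Synthetic period effect (not real data): steady decline plus noise.
kappa = np.cumsum(rng.normal(-0.02, 0.01, size=years.size))

# Same model, two sample periods: 1961-2004 and 1981-2004.
full = central_forecast(kappa, horizon=25)
short = central_forecast(kappa[years >= 1981], horizon=25)

# Largest gap between the two central forecasts over the horizon.
max_gap = np.max(np.abs(full - short))
print(f"largest difference in central forecast: {max_gap:.4f}")
```

With a robust model, the gap between the two projections should stay modest relative to the forecast uncertainty; a large gap after a small change in the window is the warning sign discussed above.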

Although application of such a wide ranging set of model selection criteria will eliminate some models, we will demonstrate that mortality forecasting is no different from many other modelling problems where model risk is significant: mortality forecasters should acknowledge this fact and make use of multiple models rather than pretend that it is sufficient to make forecasts based on any single model.

We will consider qualitative assessment criteria that allow us to examine the ex ante plausibility of the forecasts generated by six stochastic mortality models, illustrating with national population data for England & Wales (EW) for an age group consisting of males 60–89 years old and estimated over the years 1961–2004. This is supplemented by a briefer discussion of forecasts for the equivalent US dataset. We focus on higher ages because our current principal research interest is the longevity risk facing pension plans and annuity providers.

We will concentrate on six of the models discussed by Cairns et al. (2009): these are labelled in Table 1 as M1, M2, M3, M5, M7 and M8. Models M2, M3, M7 and M8 include a cohort effect and these emerged in Cairns et al. (2009) as the best fitting, in terms of BIC, of the eight models considered on the basis of male mortality data from EW and the US for the age group under consideration. M2 is the Renshaw and Haberman (2006) extension of the original Lee–Carter model (M1), M3 is a special case of M2, and M7 and M8 are extensions of the original CBD model (M5). The original Lee–Carter and CBD models had no cohort effect, and provide useful benchmarks for comparison with the four models involving cohort effects. M4 is not considered any further in this study because of its low BIC and qualitative rankings for these datasets in Cairns et al. (2009, Table 3). (M4 focuses on identifying the smooth underlying trend. However, this means that it is not as good as the other models at capturing short-term deviations from this trend.) Although M3 is a special case of M2, we include it here because it had a relatively high BIC ranking for the US data, and because it avoids a problem with the robustness of parameter estimates for M2 identified by CMI (2007), Cairns et al. (2009), and Dowd et al. (2010a, 2010b). M6 was also dropped from the original set of eight models: M6 is a special case of M7, and M7 was found to be stable and to deliver consistently better and more plausible results than M6.

The structure of the paper is as follows. In Section 2, we specify the stochastic processes needed for forecasting the term structure of mortality rates for each of the models. Results for the different models obtained using EW male mortality data are compared and contrasted in Section 3. Each model is then tested for the robustness of its forecasts in Section 4. Section 5 examines two applications of the forecast models, namely applications to survivor indices and annuity prices, and makes additional comments on model risk and plausibility of the forecasts. Finally, in Section 6, we summarise an analysis for US male mortality data: our aim is to draw out features of the US data that are distinct from those of the EW data. Section 7 concludes.

Section snippets

Forecasting with stochastic mortality models

We take six stochastic mortality models which, on the basis of fitting to historical data, appear to be suitable candidates for forecasting future mortality at higher ages, and prepare them for forecasting. To do this, we need to specify the stochastic processes that drive the age, period and (if present) cohort effects in each model.

We define $m(t, x)$ to be the death rate in year $t$ at age $x$, and $q(t, x)$ to be the corresponding mortality rate, with the relationship between them given by $q(t, x) = 1 -$
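The snippet breaks off before the relationship is stated in full. A widely used link between the two quantities, assumed here rather than taken from the text, is $q = 1 - \exp(-m)$, which holds when the force of mortality is constant within each year of age and calendar year. The sketch below applies that assumed link in both directions; the function names are hypothetical.

```python
import numpy as np

def death_rate_to_mortality_rate(m):
    """Assumed link q = 1 - exp(-m), evaluated stably for small m."""
    return -np.expm1(-np.asarray(m, dtype=float))

def mortality_rate_to_death_rate(q):
    """Inverse of the assumed link: m = -log(1 - q)."""
    return -np.log1p(-np.asarray(q, dtype=float))

# Hypothetical death rates at three ages, for illustration only.
m = np.array([0.01, 0.05, 0.20])
q = death_rate_to_mortality_rate(m)
print(q)
```

Under this link $q$ is always slightly smaller than $m$, and the difference widens at the higher ages where death rates are large.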

Forecasts and model comparisons

We now proceed to compare the forecasting results for EW for the nine models M1, M2A, M2B, M3A, M3B, M5, M7, M8A and M8B. (Corresponding results for US males are presented and discussed in Section 6.) To do this, we will present fan charts of the forecasts produced by the models. Each fan chart illustrates the forecast output from the stochastic mortality models by dividing the simulated densities into 5% quantile bands. Fan charts give us the opportunity to explore any distinctive visual
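A minimal sketch of how the quantile bands behind a fan chart might be computed from simulated paths is given below. The simulated mortality rates come from a synthetic random walk with drift on the log scale, which is an assumption for illustration only, as are the function name `fan_chart_bands` and the choice of quantiles running from 5% to 95%.

```python
import numpy as np

def fan_chart_bands(simulations, step=0.05):
    """Quantiles at step, 2*step, ..., 1-step for each forecast year.

    simulations has shape (n_paths, n_years); the result has one row
    per quantile level and one column per forecast year.
    """
    n_levels = int(round(1.0 / step)) - 1
    levels = np.linspace(step, 1.0 - step, n_levels)
    return levels, np.quantile(simulations, levels, axis=0)

rng = np.random.default_rng(1)
n_paths, n_years = 5000, 30
# Synthetic simulated paths (not model output): log q follows a random walk.
shocks = rng.normal(-0.015, 0.03, size=(n_paths, n_years))
log_q = np.log(0.02) + np.cumsum(shocks, axis=1)

levels, bands = fan_chart_bands(np.exp(log_q))
# Width of the outermost band in the first and last forecast years.
print(bands[-1, 0] - bands[0, 0], bands[-1, -1] - bands[0, -1])
```

Shading the region between successive rows of `bands` gives the familiar fan shape, and comparing the widths across models is one way to examine the plausibility of forecast uncertainty at each age.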

Robustness of projections

We now assess the projections from models M1, M2B, M3B, M5, M7, M8A and M8B for robustness relative to the sample period used in estimating the model. For each model, we compare three sets of simulations:

• Scenario 1: (A) The underlying model is first fitted to mortality data from 1961 to 2004. (B) The stochastic model for the $\kappa_{t}^{(i)}$ period effects and the $\gamma_{t - x}^{(i)}$ cohort effects is then fitted to the full set of values resulting from (A) (44 $\kappa_{t}^{(i)}$'s and 60 $\gamma_{t - x}^{(i)}$'s).
• Scenario 2: (A) The

Applications: survivor index and annuity price

In this section, we switch our attention from forecasts of the underlying mortality rates, $q(t, x)$, to two “derivative” quantities that utilise these forecasts. The first of these is a survivor index, and the second is the price of an annuity (which is, in turn, derived from the survivor index). Forecasts of these will provide additional evidence of possible model risk.

Fig. 7 shows the fan charts produced by each model of the future value of the survivor index $S(t, 65)$; this measures the
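To illustrate how such derivative quantities follow from the mortality rates, the sketch below builds a survivor index as a cumulative product of one-year survival probabilities along a cohort, and prices an annuity as the discounted sum of that index. Because the snippet is truncated, the definition of the survivor index used here is an assumption, as are the flat interest rate, the payment timing (1 unit at the end of each year survived), the function names and the hypothetical mortality rates.

```python
import numpy as np

def survivor_index(q_cohort):
    """Assumed definition: S[t] is the product of (1 - q) over the first t+1 years.

    q_cohort holds the mortality rates met by one cohort as it ages,
    ordered by year, along the last axis.
    """
    return np.cumprod(1.0 - np.asarray(q_cohort, dtype=float), axis=-1)

def annuity_price(q_cohort, interest_rate):
    """Value of 1 paid at the end of each year survived, flat interest rate."""
    s = survivor_index(q_cohort)
    t = np.arange(1, s.shape[-1] + 1)
    return np.sum(s / (1.0 + interest_rate) ** t, axis=-1)

# Hypothetical rates for a cohort aged 65 at outset, followed for 25 years.
q_cohort = 0.015 * np.exp(0.1 * np.arange(25))
print(survivor_index(q_cohort)[-1], annuity_price(q_cohort, 0.04))
```

Because both functions work along the last axis, passing an array of simulated cohort paths with shape (n_paths, n_years) yields a distribution of index values and annuity prices, which is the form needed for a fan chart or a comparison of model risk.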

Results for US males

In this section, we report briefly on a repeat analysis of US males data from 1968 to 2003. (For a more detailed discussion, see Cairns et al., 2008b.) Our aim in this repeat analysis is to see whether the conclusions that we have drawn in Sections 3, 4, 5 and 6 are specific to the England & Wales males dataset or whether they might apply more generally to the US population

Conclusions

One of the main lessons from this investigation into forecasting with stochastic mortality models is the danger of ranking and selecting models purely on the basis of how well they fit historical data: it is quite possible for a model to give a good fit to the historical data, and still give inadequate forecasts. We propose here new qualitative criteria that focus on a model’s ability to produce plausible forecasts: biological reasonableness of forecast mortality term structures, biological

Acknowledgement

The authors would like to thank an anonymous referee for his/her helpful comments.

References (28)

N. Brouhns et al. A Poisson log-bilinear regression approach to the construction of projected life tables. Insurance: Mathematics and Economics (2002)
C. Czado et al. Bayesian Poisson log-linear mortality projections. Insurance: Mathematics and Economics (2005)
K. Dowd et al. Evaluating the goodness of fit of stochastic mortality models. Insurance: Mathematics and Economics (2010)
R.J. Hyndman et al. Robust forecasting of mortality and fertility rates: a functional data approach. Computational Statistics & Data Analysis (2007)
R. Plat. On stochastic mortality modelling. Insurance: Mathematics and Economics (2009)
A.E. Renshaw et al. A cohort-based extension to the Lee–Carter model for mortality reduction factors. Insurance: Mathematics and Economics (2006)
Bauwens, L., Sucarrat, G., 2008. General to specific modelling of exchange rate volatility: a forecast evaluation....
H. Booth et al. Applying Lee–Carter under conditions of variable mortality decline. Population Studies (2002)
Booth, H., Maindonald, J., Smith, L., 2002b. Age–time interactions in mortality projection: applying Lee–Carter to...
H. Booth et al. Evaluation of the variants of the Lee–Carter method of forecasting mortality: a multi-country comparison. New Zealand Population Review (2005)
Butt, Z., Haberman, S., 2010. A comparative study of parametric mortality projection models. Actuarial Research Paper...
A.J.G. Cairns et al. Pricing death: frameworks for the valuation and securitization of mortality risk. ASTIN Bulletin (2006)
A.J.G. Cairns et al. A two-factor model for stochastic mortality with parameter uncertainty: theory and calibration. Journal of Risk and Insurance (2006)
A.J.G. Cairns et al. Modelling and management of mortality risk: a review. Scandinavian Actuarial Journal (2008)


¹: Disclaimer: This report has been partially prepared by the Pension Advisory group, and not by any research department, of JPMorgan Chase & Co. and its subsidiaries (“JPMorgan”). Information herein is obtained from sources believed to be reliable but JPMorgan does not guarantee its completeness or accuracy. Opinions and estimates constitute JPMorgan’s judgment and are subject to change without notice. Past performance is not indicative of future results. This material is provided for informational purposes only and is not intended as a recommendation or an offer or solicitation for the purchase or sale of any security or financial instrument.
