The article presents a groundbreaking approach to mortality forecasting that applies Markov-switching Bayesian vector autoregression (MSBVAR) to an age-partitioned Lee–Carter mortality model. The method divides the full age range into subgroups with distinct mortality characteristics, optimizes the fit, and offers a differentiated understanding of mortality dynamics. The model provides accurate mortality forecasts and uncertainty estimates over short- and long-term horizons and outperforms traditional models such as the original Lee–Carter approach and Bayesian vector autoregressive (BVAR) approaches. By incorporating both permanent and recurring structural changes in mortality patterns, the model offers a comprehensive overview of historical mortality trends and future projections. The article covers the formulation of the model, compares its performance with other models through cross-validation, and demonstrates its superiority in forecast accuracy and uncertainty estimation. It also provides a detailed mortality forecast for the French male civilian population over a period of 30 years, highlighting the practical applications of the model and insights into structural changes in mortality.
AI-generated
This summary of the content was generated with the help of AI.
Abstract
Forecast accuracy and measures of uncertainty are important in mortality modeling, for instance in risk management and pricing of financial products. In this paper, we introduce a new mortality model that provides excellent mortality forecasts and accurate estimation of mortality risk; these merits persist as we extend the forecasting horizon to 30 years. We forecast with Markov-switching Bayesian vector autoregression (MSBVAR) and believe that this is the first time MSBVAR has been used in Lee–Carter-based mortality modeling. Our strategy begins by partitioning the full lifespan into age subgroups that experience different mortality dynamics. Applying Lee–Carter within each age subgroup, we generate a separate stochastic time factor for each. Then, we forecast mortality using a method that can capture and quantify both permanent and recurring structural changes to mortality. The recurring changes are modeled with MSBVAR, which also considers parameter correlation. The forecasting process provides parameter uncertainty estimates and mortality uncertainty estimates.
Droms Sean, Patrick Brewer and Barry Smith contributed equally to this work.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1 Introduction
Accurate mortality forecasting and estimation of mortality uncertainty play an important role in the pricing and risk management of pensions, annuities and other portfolios with longevity risk (Njenga and Sherris [31]; Borger and Schupp [4]; Cairns et al. [7]). Long-term forecasting and risk estimation, in particular, are essential for actuaries seeking effective solutions to manage mortality risk and longevity risk (Gaille and Sherris [13]; Qiao et al. [32]; Liu and Yu [27]).
In this paper, we introduce a new mortality forecasting model that delivers accurate forecasts and reliable estimates of mortality risk across a wide range of forecasting horizons. It applies Markov-switching Bayesian vector autoregression (MSBVAR) ([38]) for the first time to a mortality model based on Lee–Carter ([21]). Our model partitions the full age range into subgroups with different mortality characteristics, with a fixed number of subgroups and boundary points chosen to optimize fit. It also provides quantitative analysis of permanent as well as recurring mortality structural changes within each age subgroup.
The Lee–Carter model includes a single stochastic factor to address mortality risk. The new model includes multiple stochastic factors obtained by fitting Lee–Carter separately within each age subgroup. Then, while forecasting these stochastic factors, we model both permanent and recurring structural changes in mortality patterns. A “permanent” change is modeled as a permanent change in the drift of a mortality variation factor at one point in time (determined by the model). Intuitively, this can model a permanent technological development like the introduction of penicillin. We also model transient changes, where the mortality pattern changes for a while and then reverts, as would be expected during a short war. MSBVAR is a natural choice to model the latter. In this setting such changes are often called “recurrent,” and we follow that usage below. Because of these elements, we name the model “Lee–Carter in age subgroups with structural changes” (LCASSC).
The Lee–Carter model has inspired a great many extensions, for example, Booth et al. [3] and Renshaw and Haberman [33]. However, the Lee–Carter model tends to underestimate the recent mortality decline (Liu and Yu [27]; Booth et al. [2]; Lee and Miller [22]). Liu and Yu [27] also state that the Lee–Carter model neglects the drift uncertainty observed in the last century, which is a significant problem for its long-horizon forecast performance. Unlike vector-autoregressive (VAR) models (for instance, Guibert et al. [16] and Li and Shi [24]), the Lee–Carter model also fails to capture correlations among model parameters representing intertwined mortality patterns in different ages or populations. Noting the importance of parameter uncertainty and parameter correlations for mortality forecasting, Njenga and Sherris [31] applied a Bayesian vector autoregression (BVAR) in their model and demonstrated that BVAR is superior to VAR, providing more accurate forecasts and the ability to quantify both parameter uncertainty and longevity risk; their model is, however, less accurate than the Lee–Carter model.
Our goal for this paper is to create a mortality model that retains the strengths of BVAR and the accuracy of Lee–Carter over a wide range of forecasting horizons and that provides interpretability for historical mortality structural changes. The LCASSC model successfully achieves these goals. To demonstrate the strength of LCASSC, we compare it to the original Lee–Carter and two other models. The latter two use the same first step as LCASSC, i.e., partitioning into age-groups and fitting Lee–Carter, to obtain the same time factors, but they apply different forecasting methods. One uses independent random walks with drift (as does the original Lee–Carter model), and the other uses a lag-2 BVAR.
To compare accuracy, we use rolling-window cross-validation. The results show that the LCASSC model has the highest accuracy among the four models for 1- to 5-year forecasts, 1- to 20-year forecasts, and 1- to 30-year forecasts. For example, the average MSE of our model on the full age range (0–95) is only 19.02%, 22.28% and 24.72% of that of the original Lee–Carter model for 1- to 5-year forecasts, 1- to 20-year forecasts and 1- to 30-year forecasts, respectively. The LCASSC model shows excellent accuracy, and this merit persists into long-term forecast horizons. Indeed, just the initial step of our modeling process, partitioning into age groups and using multiple time factors, already demonstrates a striking improvement over Lee–Carter: we see a large increase in accuracy regardless of which forecasting method is employed. That is, all three models that apply this initial step (LCASSC and the two comparison models) outperform Lee–Carter. Details are in Sect. 3.1.
The fact that these three use the same input time factors allows us to isolate the performance of the different forecasting methods. Comparing them demonstrates that the LCASSC forecasting methodology creates superior estimates of mortality uncertainty. The LCASSC model gives the best 95% prediction intervals for the 1- to 30-year forecasts of the logarithm of mortality, compared to the two models using the same factors. Its 95% prediction intervals are always the narrowest, and they cover all but one of the observations for the ages considered across the full age range. In particular, for ages in later adulthood and old age (55, 70, 80, 90, 95), the LCASSC model gives 95% prediction intervals that cover all the observed data (Fig. 21). We also find that the comparison model that applies BVAR consistently tends to underestimate mortality and gives unnecessarily wide prediction intervals, as also observed in Fu et al. [12]. The comparison model based on random walks with drift sometimes overestimates mortality and, for old-age forecasts, gives unnecessarily wide prediction intervals, which confirms the comment made in Liu and Yu [27] about the underestimation of recent mortality decline.
Besides the advantages in forecasting and forecast uncertainty, the LCASSC model captures and quantifies permanent as well as recurring mortality structural changes. Changes in mortality patterns among populations in the last century have been widely observed, and accounting for such changes could improve mortality forecasts (Liu and Yu [27]; Fu et al. [11, 12]). For the same population, mortality patterns are different among different age subgroups, while at the same time correlated (Fu et al. [11, 12]). To address these complicated mortality patterns underlying the data, the LCASSC model first partitions the full-age range into subgroups, which have different mortality patterns, then models permanent as well as recurring structural changes in the age subgroups. The recurring structural changes are modeled by MSBVAR, which considers the correlation among the different mortality patterns in different age groups.
By training on French male civilian mortality data for ages 0–95 and years 1925–2020, the LCASSC model generates the age subgroups 0–14, 15–49, 50–86 and 87–95 and identifies a different time point for the permanent mortality structural change in each subgroup: 1955 for childhood (0–14), 1999 for late teenage and early adulthood (15–49), 1978 for early old-age (50–86), and 1962 for later old-age (87–95). The time factor for each age subgroup shows clearly different drifts before and after the structural change, with no indication that the change is transient. This sort of effect is why we include permanent structural changes in the model, which should capture, for instance, improvements in technology leading to permanent improvements in mortality. Moreover, we observed that all four age subgroups experienced recurring mortality structural changes in 1940 and 1951. Naturally, World War II changed mortality patterns for the duration of the war, but some time after the war the patterns seemed to revert to the previous regime. MSBVAR is a natural model for transient regime switches where mortality is temporarily in a very different state from the “common” state. Our model also gives a transition matrix for these recurring structural changes.
Fu et al. [12] introduced MSBVAR in mortality modeling for the first time and demonstrated that MSBVAR retains the strengths of BVAR, while providing improved forecast accuracy and better estimation of parameter uncertainty. MSBVAR includes Markov-switching (MS) (also known as regime switching behavior) to explicitly allow for structural changes in mortality patterns and uses BVAR as the model within each regime. The LCASSC model considers both permanent and recurring structural changes in different ways and models the recurring structural changes by MSBVAR.
Literature on structural changes in mortality dynamics includes methods to test for one or more structural changes and models that incorporate structural changes to improve forecasting. Techniques used to incorporate structural changes in mortality models include regime-switching (Milidonis et al. [29], Hainaut [18], Shen and Siu [35], Gao et al. [14], Ignatieva et al. [20], Zhou [42], and Gylys and Šiaulys [17]), broken-trend stationary processes (Li et al. [25]), and difference-stationary processes with breakpoints (Coelho and Nunes [9], van Berkum et al. [40]). Regime-switching models inherently assume that when mortality dynamics change to a new regime, they may revert to the old regime at some time in the future. Advances in medical technology or cultural changes such as the widespread recognition that smoking is harmful are presumably permanent. So while regime-switching may capture transient changes, such as those occurring during a short period of war, a realistic mortality model should also account for permanent structural changes. LCASSC models permanent structural changes with a piecewise-linear trend, and the Markov-switching aspect of MSBVAR models additional transient effects.
After the initial work of Njenga and Sherris [31], Bayesian VAR models have found some additional use in mortality forecasting, for instance in Lu and Zhu [28] and Shi et al. [36]. These articles and the present one have different stated goals and so approach Bayesian VAR models in different ways. But all share the advantage afforded by Bayesian models of providing estimates of parameter uncertainty. VAR models often have very large numbers of parameters, and adopting Bayesian methods can help manage the situation.
As for the differences, Lu and Zhu [28] combined the mathematical simplicity and familiarity of a factor model like Lee–Carter with the flexibility of VAR models, introducing the factor-augmented VAR (FAVAR) model. This model still uses 6000 parameters, and in 10-year out-of-sample forecasting it was less accurate than Lee–Carter for French males in age groups 51–100 and 61–100. Yet the model is still attractive for including Lee–Carter and standard VAR models as special cases and for incorporating the sparse VAR model of Li and Lu [23] as a baseline. (The sparse VAR model strikes a balance between flexibility and parsimony and ensures that series for different ages are cointegrated.) A strength of the present LCASSC model is forecast accuracy, so we provide a thorough analysis of forecasts using rolling-window cross-validation and provide out-of-sample forecasts up to a 30-year horizon. The LCASSC model far outperforms Lee–Carter when forecasting the much larger age range 0–95.
Shi et al. [36], on the other hand, explicitly incorporate mean-reversion into their multi-population model. This is natural, important, and relatively simple when considering separate populations within one model. One does not want the mortality pattern for 50-year-olds in one country to diverge from that of 50-year-olds in another country, especially if those countries are similar culturally and economically and are geographically close. On the other hand, the LCASSC model does not create series for different populations that include people of the same ages, but rather creates series for subgroups of a single population formed by partitioning the various age ranges. We expect more divergence in behavior for these groups, and mean-reversion seems less natural in this context.
Because in developing the LCASSC model, we prioritized incorporating structural breaks, we have kept things simple by not including an explicit mechanism to maintain coherence between log mortality rates of different age groups. This is in spite of the fact that cointegration between the log mortality series at neighboring ages is a desirable, biologically reasonable feature of a mortality model, and some coherence should still be expected over long time horizons between the larger age blocks we use. But intuitively, structural shocks can sometimes happen primarily to one age group (war, for instance, has a more drastic effect on mortality for young men of military age than it does on old men, especially given the naturally higher mortality rate of the latter group in normal circumstances). Allowing for this while devising a model with built-in long-term coherence between the age groups is a more complicated task, but it is something we may try to address in future work.
The remainder of this paper is organized as follows. Section 2 provides the model formulation, and Sect. 3 shows the main results. In Sect. 3.1, we present the results of accuracy testing of our model using rolling-window cross-validation. In Sect. 3.2, we show the forecast precision of our model for both mortality and parameters. In Sect. 3.3, we provide an example of a mortality forecast and give quantitative analysis of historical mortality structural changes in the last century. Finally, Sect. 4 concludes the paper.
2 Model formulation
For a given value of K, our model uses an algorithm to divide the full age range into K subgroups by minimizing the weighted MSE of mortality (described in Eq. 3) and then applies the Lee–Carter model in each age subgroup to obtain time series \(\kappa _t^k\) for \(k=1, 2, \dots , K\) (described in Eq. 2). We assume that the estimated \(\kappa _t^k\) experience both permanent and recurrent structural changes and forecast the recurrent structural changes with MSBVAR. After we forecast \(\kappa _t^k\), we are able to forecast mortality.
2.1 The Lee–Carter model and age sub-grouping
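The original Lee–Carter model in Lee and Carter [21] has the formulation:
\[ \ln (m_{x,t})=\alpha _x+\beta _x\kappa _t+\epsilon _{x,t}, \qquad \sum _x \beta _x=1, \quad \sum _t \kappa _t=0, \tag{1} \]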
where \(m_{x,t}\) is the mortality rate for age x and calendar year t. In this model, \(\alpha _x\) represents the general shape of mortality across age, \(\kappa _t\) represents the variation of the mortality level over time, \(\beta _x\) is the age-specific response to the changes over time, and \(\epsilon _{x,t}\) is an error term following an i.i.d. normal distribution with mean 0 and constant variance. The two constraints are added to deal with the identification issue in this formulation, and the singular value decomposition (SVD) method is used to fit the model. In this paper, we also use SVD to fit the Lee–Carter model.
We follow the general framework of the P-simple model introduced by Danesi et al. [10], which allows for different levels of complexity in different sub-populations. It is formulated as follows: for each k, where \(k=1, 2, \dots , K\) denotes the index of the kth sub-population,
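As an illustration of the SVD fitting step, the following is a minimal sketch rather than the authors' implementation. It assumes a complete matrix of log mortality rates (rows are ages, columns are years), uses NumPy, and runs on made-up, noise-free numbers; the function name `fit_lee_carter` is hypothetical.

```python
import numpy as np

def fit_lee_carter(log_m):
    """Fit ln m[x, t] = alpha[x] + beta[x] * kappa[t] with a rank-1 SVD.

    log_m: 2-D array of log mortality rates (rows = ages, columns = years).
    Returns alpha, beta, kappa with sum(beta) = 1 and sum(kappa) = 0.
    """
    alpha = log_m.mean(axis=1)              # age profile: row means over time
    centered = log_m - alpha[:, None]       # each row now sums to zero over time
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scale = u[:, 0].sum()                   # rescale so that sum(beta) = 1
    beta = u[:, 0] / scale
    kappa = s[0] * vt[0] * scale            # sums to zero because rows are centered
    return alpha, beta, kappa

# Synthetic, noise-free example (all numbers are made up for illustration).
alpha_true = np.array([-5.0, -6.0, -4.0, -3.0, -2.0])
beta_true = np.array([0.10, 0.20, 0.30, 0.25, 0.15])       # sums to 1
kappa_true = np.array([5.0, 3.0, 1.0, -1.0, -3.0, -5.0])   # sums to 0
log_m = alpha_true[:, None] + np.outer(beta_true, kappa_true)

alpha, beta, kappa = fit_lee_carter(log_m)
```

Because the synthetic data are exactly of rank one after centering, the fit recovers the generating parameters up to floating-point error; with real data the first singular triple gives the least-squares rank-one approximation.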
The study in Danesi et al. [10] considers sub-populations in different regions in Italy that were predetermined through historical distinctions before model fitting. In our application, the sub-populations are age subgroups and we consider the number of those subgroups to be predetermined but not their boundary points. Rather, we choose the boundary points by minimizing the weighted MSE (WMSE), defined below. By this method we optimize the partition of the full-age range into subgroups for a specific value of K.
where \(d_{x>0,t}\) is the death rate at age \(x>0\) for year t derived from our fitted \(m_{x,t}\) using the period life tables in the Human Mortality Database (HMD) [41]. Bardoutsos et al. [1] and Fu et al. [11] applied weighted MSE of \(q_{x}\), defined very similarly to ours, in their approach. They note that the mean-squared error (MSE) of \(\ln {q_{x}}\) gives a relatively large amount of weight to errors at young ages, the MSE of \(q_{x}\) gives more weight to errors at old ages, and the MSE of \(d_{x}\) gives more weight to errors around the modal age. The same is true with \( m_{x,t}\) in place of \(q_{x}\). Since WMSE has a carefully balanced weight on all the ages, it is appropriate for our model studying a wide age range, i.e. 0–95.
While these points give an intuitive justification for the form of the WMSE given above (although discussing \(q_x\) instead of \(m_{x}\)), we did some additional testing to ensure it seemed reasonable in our context. We refit the same mortality data (French male civilian mortality data for ages 0–95 and years 1925–2020) with \(K=4\) and recomputed age cutoffs using alternate forms of WMSE that put all the weight on the first term above, all the weight on the second, and all the weight on the third, and we also tried weights 0.25, 0.5, and 0.25 on the three terms involving \(\ln m_{x,t}\), \(m_{x,t}\), and \(d_{x>0,t}\), respectively. When putting all weight on just one term, we saw in each case a dramatic shift in the age cutoffs as would be expected given the intuitive description of each term above (for instance, putting all weight on \(\ln m_{x,t}\) makes all age cutoffs much lower at 15, 22, and 45, which fails to capture a distinction between middle-aged and senescent people), while using 0.25, 0.5, and 0.25 gave exactly the same age cutoffs (at 14, 49, and 86) as assigning equal weights, showing the choice above is robust to modest changes in weights.
We first choose the number of subgroups K as a predetermined constant and assign the age partition parameters \(\{A_1, A_2, \dots , A_{K-1}\}\), so the age subgroups are \([0, A_1]\), \([A_1+1, A_2]\), ..., \([A_{K-1}+1, 95]\). We fit a P-simple model for different age subgroups by minimizing the fitted WMSE to estimate all the parameters: \(A_k\), \(\alpha _x^k\), \(\beta _x^k\), and \(\kappa _t^k\) for \(k=1, 2, \dots , K\). Among the parameters, the \(\kappa _t^k\) are time series assumed to experience permanent structural change and recurrent structural change over time. We model and forecast the values of \(\kappa _t^k\) (described in later sections) in order to forecast mortality. In this fitting, we used the R package DEoptim [30], which performs global optimization by differential evolution.
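The boundary search can be sketched on a toy problem. The code below is only an illustration under loud simplifications: it replaces the WMSE by the plain sum of squared errors of the log rates, and it replaces differential evolution (DEoptim) by an exhaustive search, which is feasible only for small inputs. The function names and the synthetic data are hypothetical.

```python
from itertools import combinations

import numpy as np

def rank1_sse(block):
    """Squared error left after a Lee-Carter (rank-1 SVD) fit of one age block."""
    centered = block - block.mean(axis=1, keepdims=True)
    s = np.linalg.svd(centered, compute_uv=False)
    return float(np.sum(s[1:] ** 2))       # energy not captured by the first factor

def best_partition(log_m, k):
    """Try every split of the age axis into k contiguous blocks of >= 2 ages."""
    n_ages = log_m.shape[0]
    best_sse, best_cuts = np.inf, None
    for cuts in combinations(range(2, n_ages - 1), k - 1):
        edges = (0,) + cuts + (n_ages,)
        pairs = list(zip(edges, edges[1:]))
        if min(b - a for a, b in pairs) < 2:
            continue                        # skip blocks with a single age
        sse = sum(rank1_sse(log_m[a:b]) for a, b in pairs)
        if sse < best_sse:
            best_sse, best_cuts = sse, cuts
    return best_sse, best_cuts

# Toy data: ages 0-3 share one time factor, ages 4-7 share another (made up).
t = np.arange(10.0)
kappa_young = -1.0 * t
kappa_old = np.where(t < 5, -0.5 * t, -2.5 - 2.0 * (t - 5))
alpha = np.linspace(-7.0, -2.0, 8)
beta = np.array([0.4, 0.3, 0.2, 0.1, 0.1, 0.2, 0.3, 0.4])
kappa = np.vstack([kappa_young] * 4 + [kappa_old] * 4)
log_m = alpha[:, None] + beta[:, None] * kappa

sse, cuts = best_partition(log_m, k=2)      # cuts holds the first age index of each new block
```

For the full 0–95 age range with \(K=4\) the number of candidate partitions is large, which is one reason a global optimizer such as differential evolution is a natural choice there.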
2.2 Structural change and MSBVAR
The time series \(\{\kappa _t^1,\kappa _t^2,...,\kappa _t^K\}\) represent the variation of mortality over time for each sub-population. Danesi et al. [10] assume that the time series corresponding to their sub-populations are multidimensional, uncorrelated random walks with drift. However, we regard \(\{\kappa _t^1,\kappa _t^2,...,\kappa _t^K\}\) as correlated time series which experience structural changes over time. Mortality exhibits permanent as well as recurrent structural changes, and we account for both types in \(\kappa _t^k\).
Let \(\Delta \kappa _t^k=\kappa _{t+1}^k-\kappa _t^k\) for \(k=1, 2, \dots , K\). We model \(\Delta \kappa _t^k\) as the sum of the recurrent structural change term \(\Delta (\kappa _r^k)_t\) and the permanent structural change term \(\Delta (\kappa _p^k)_t\):
\[ \Delta \kappa _t^k=\Delta (\kappa _r^k)_t+\Delta (\kappa _p^k)_t. \]
2.2.1 \(\Delta (\kappa _p^k)_t\) for permanent structural changes in mortality
For the permanent structural changes term \(\Delta (\kappa _p^k)_t\), in the time period considered \([T_0, T]\), we assume the kth sub-population has \(m^k-1\) permanent structural changes at \(T_1^k, T_2^k, \dots , T_{m^k-1}^k\) arranged in ascending order. So there are \(m^k\) states of mortality in \([T_0, T]\). Then in each state \([T_{m-1}^k, T_m^k]\), we assume \(\Delta (\kappa _p^k)_t\) is a constant \(\mu _m^k\) specific to this state for this sub-population, i.e. for the kth subpopulation,
For simplicity, in this paper, we assume there is only one permanent structural change for all K age subgroups in the studied period (since 1925), i.e. \(m^1=m^2=\dots =m^K=2\). Then Eq. 4 becomes
where \(I(\cdot )\) returns 1 if its argument is true and 0 otherwise.
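To make the single-break drift concrete, here is a small sketch that locates one change point in a series of differences. It is not the paper's estimation procedure (which uses maximum likelihood together with the Markov-switching VAR); it uses ordinary least squares on the differences alone, and it assumes the convention that the first drift applies strictly before the break year and the second from the break year onward. All names and numbers are illustrative.

```python
def fit_single_break(years, d_kappa, min_seg=3):
    """Least-squares fit of a piecewise-constant drift with one break.

    years[i] is the year attached to the difference d_kappa[i].
    Returns (break_year, mu1, mu2): mu1 applies before break_year,
    mu2 from break_year onward. Each segment keeps at least min_seg points.
    """
    best = None
    for i in range(min_seg, len(years) - min_seg + 1):
        before, after = d_kappa[:i], d_kappa[i:]
        mu1 = sum(before) / len(before)
        mu2 = sum(after) / len(after)
        sse = (sum((v - mu1) ** 2 for v in before)
               + sum((v - mu2) ** 2 for v in after))
        if best is None or sse < best[0]:
            best = (sse, years[i], mu1, mu2)
    return best[1:]

# Made-up differences: drift -0.5 up to 1934, drift -2.0 from 1935 onward.
years = list(range(1925, 1945))
d_kappa = [-0.5] * 10 + [-2.0] * 10

break_year, mu1, mu2 = fit_single_break(years, d_kappa)
```

The two fitted means play the role of \(\mu _1^k\) and \(\mu _2^k\), and the selected year plays the role of \(T_1^k\) for one age subgroup.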
2.2.2 MSBVAR and \(\Delta (\kappa _r^k)_t\) for recurrent structural changes in mortality
For the recurrent structural changes, we apply the MSBVAR model described in Sims et al. [38] to the multiple time-series (i.e., {\(\Delta (\kappa _r^1)_t, \Delta (\kappa _r^2)_t, \dots , \Delta (\kappa _r^K)_t\)}) subject to regime switching, where we assume that recurrent structural changes follow a Markov process.
Assume that there are h regimes in the system and we denote them as \(H=\{1, 2, \dots , h\}\). At time t, let \(s_t\in H\) denote the corresponding regime. Let \(Q=(q_{ij})_{h\times h} \in [0, 1]^{h^2}\) be the Markov transition matrix, with \(q_{ij}=P(s_t=i|s_{t-1}=j)\) being the probability that \(s_t=i\) given that \(s_{t-1}=j\). Q satisfies \(\sum _{i\in H} q_{ij}=1\). For \(1\le i,j\le h\) and \(\alpha _{i,j}>0\), the prior on Q follows the Dirichlet form:
\[ p(Q)=\prod _{j\in H}\left[ \frac{\Gamma \left( \sum _{i\in H}\alpha _{i,j}\right) }{\prod _{i\in H}\Gamma (\alpha _{i,j})}\prod _{i\in H}q_{ij}^{\alpha _{i,j}-1}\right] , \]
where \(\Gamma (\cdot )\) denotes the standard gamma function. The Dirichlet prior is a standard choice of distribution for a stochastic vector, both because it is the conjugate prior to the multinomial distribution and because it provides flexibility through the ability to choose a symmetric (i.e., flat, assuming no information) or asymmetric version and to set the concentration parameter.
In this paper, we apply the simplified MSBVAR case to {\(\Delta (\kappa _r^1)_t, \Delta (\kappa _r^2)_t, \dots , \Delta (\kappa _r^K)_t\)}: an intercept is considered but no other exogenous factors, with lag 1 and a 2-regime system (\(h=2\)). (For a description of the general model, see Sims et al. [38] and Fu et al. [12].) We use \(K=4\) age subgroups (the justification for this is in Sect. 3), and let \(y_t\) = \((\Delta (\kappa _r^1)_t, \Delta (\kappa _r^2)_t, \Delta (\kappa _r^3)_t, \Delta (\kappa _r^4)_t)'\). We assume that \(y_t\) conditional on the past only depends on the current regime, not on the historical regimes. In a given regime \(s_t\) (equal either to 1 or 2) at time t, the MSBVAR model considers a structural VAR with lag 1 for \(y_t\) given by
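A draw from such a prior can be sketched with the standard library alone. The example below samples each column of Q independently from a Dirichlet distribution by normalizing gamma draws, following the column convention \(q_{ij}=P(s_t=i|s_{t-1}=j)\) used above. The concentration values are hypothetical and chosen only to show an asymmetric column (a persistent first regime) next to a flat one.

```python
import random

def sample_transition_matrix(alpha, rng):
    """Draw Q with independent Dirichlet columns.

    alpha[i][j] > 0 is the concentration for moving from regime j to regime i,
    so column j of the result is one Dirichlet draw and sums to 1.
    """
    h = len(alpha)
    q = [[0.0] * h for _ in range(h)]
    for j in range(h):
        # Normalized independent Gamma(alpha, 1) draws are Dirichlet distributed.
        draws = [rng.gammavariate(alpha[i][j], 1.0) for i in range(h)]
        total = sum(draws)
        for i in range(h):
            q[i][j] = draws[i] / total
    return q

rng = random.Random(0)
# Hypothetical concentrations: column 0 favors staying in regime 1,
# column 1 is flat (no prior information).
alpha = [[9.0, 1.0],
         [1.0, 1.0]]
Q = sample_transition_matrix(alpha, rng)
column_sums = [sum(Q[i][j] for i in range(2)) for j in range(2)]
```

Larger concentration values tighten the draws around the prior mean of each column, which is the role of the concentration parameter mentioned above.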
where \(c(s_t) = (c_1(s_t), c_2(s_t), c_3(s_t), c_4(s_t))'\) is the unknown intercept vector, \(B(s_t)\) is a \(4\times 4\) matrix of unknown coefficients at lag 1, and \(e_t\) is a vector of i.i.d. errors distributed as \(N_4(0, \Sigma )\) with the unknown error covariance matrix \(\Sigma \). Equation (7) is then:
In the Bayesian approach of BVAR, the coefficient matrix of the underlying VAR model is treated as random, and prior distributions are assigned to the coefficients. The well-known prior distribution called Litterman’s Prior in Litterman [26] and [34] defines the prior by specifying the mean values and then describing their variation (the “tightness” of the distribution) around those means based on a predetermined collection of hyperparameters. The Sims–Zha prior (Sims and Zha [37]) extends the Litterman prior. Sims et al. [38] then combine this extension with the Markov-switching framework to introduce the MSBVAR model, which is the model we use to project the mortality parameters. The MSBVAR hyperparameters are as follows (see Sims and Zha [37], Brandt [5], Njenga and Sherris [31]):
\(\lambda _0 \in [0, 1]\): Overall tightness of the prior on the error covariance matrix. As it increases, the model moves away from a random walk.
\(\lambda _1 \in [0, 1]\): Standard deviation or tightness of the prior around the AR(1) parameters.
\(\lambda _4>0\): Standard deviation or tightness around the intercept.
(Sims and Zha also include hyperparameters \(\lambda _3\), \(\lambda _5\), \(\mu _5\), and \(\mu _6\); \(\lambda _3\) governs the decay of VAR coefficients as the lag increases, \(\lambda _5\) governs tightness of the distribution of exogenous variables, while \(\mu _5\) and \(\mu _6\) are weights on dummy information included in the model using a Theil mixed estimation approach. Because we use lag 1 only, and our only exogenous variable is the intercept, which is controlled by \(\lambda _4\), and because we also set the weights \(\mu _5\) and \(\mu _6\) either equal to 0 or very close to 0 in all situations, these hyperparameters are not relevant to our setup.)
We applied the Sims–Zha Normal-Wishart prior in model (7), where the prior distribution of the (vectorization of the) coefficient matrix \(\begin{bmatrix} c(s_t)' \\ B(s_t) \end{bmatrix}\) conditional on \(\Sigma \) is multivariate Normal and the prior distribution of \(\Sigma \) is inverse Wishart. Given state \(s_t\), for the conditional lag-1 coefficient matrix \(B(s_t)|\Sigma \), its elements are independent normally distributed random variables and each variable follows a random walk with a drift that may be nonzero. Their mean and standard deviation satisfy
for \(i,j\in \{1,2,3,4\}\), where \(\sigma _j\) is the variation in the jth variable. Each constant term in \(c(s_t)\) has a conditional prior mean of zero and a standard deviation controlled by \(\lambda _0\lambda _4\). As the first to introduce BVAR to mortality modeling, Njenga and Sherris [31] provide a detailed summary of Bayesian VARs with Litterman’s prior and with the Sims–Zha prior for the case, as here, in which the intercept is the only exogenous factor; we refer the reader to that paper for more details about the Sims–Zha prior. See also Sims et al. [38] for the Sims–Zha prior for general MSBVAR with more exogenous factors.
We use the Gibbs sampler to draw Bayesian posterior samples for the MSBVAR model, and we generate draws from the posterior forecast density, which provides forecast and parameter uncertainty.
To fit and forecast the variation of mortality over time \(\Delta \kappa ^k_t\) in our study, we first use Eqs. 8 and 6 to fit the parameters in the permanent structural changes term \(\Delta (\kappa _p^k)_{t}\), i.e. \(\mu _1^k\), \(\mu _2^k\), \(T_1^k\) for \(k=1, 2, \dots , 4\) by maximum likelihood estimation, where parameters in Eq. 8 are not considered in a Bayesian context but as Markov-switching vector autoregression parameters. With estimated values for \(\mu _1^k\), \(\mu _2^k\), \(T_1^k\), and Eq. 6, we then regard the unknown VAR parameters in Eq. 8 in a Bayesian context as jointly distributed random variables by applying the Sims–Zha Normal-Wishart prior to fit and forecast the mortality parameters \(\Delta (\kappa _r^k)_{t}\) with MSBVAR. Then we can forecast the \(\{ \Delta \kappa ^k_t \}_{k=1}^4\) and the mortality.
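To make the regime-switching VAR of this section concrete, the following sketch simulates a two-regime, lag-1 system with a regime-dependent intercept and coefficient matrix and a common error covariance. It assumes the reduced form \(y_t=c(s_t)+B(s_t)y_{t-1}+e_t\), which is an assumption of this illustration, and it is a forward simulation with fixed, made-up parameters, not the Gibbs sampler or the Sims–Zha prior used for estimation.

```python
import numpy as np

def simulate_msvar(c, B, sigma, Q, n_steps, rng):
    """Simulate y_t = c[s_t] + B[s_t] @ y_{t-1} + e_t with Markov regimes.

    c: (h, dim) intercepts, B: (h, dim, dim) lag-1 coefficients,
    sigma: (dim, dim) error covariance shared by all regimes,
    Q[i, j] = P(s_t = i | s_{t-1} = j), so every column of Q sums to 1.
    """
    h, dim = c.shape
    y = np.zeros((n_steps, dim))
    regimes = np.zeros(n_steps, dtype=int)
    prev_y, s = np.zeros(dim), 0            # start at zero in the first regime
    for t in range(n_steps):
        s = rng.choice(h, p=Q[:, s])        # next regime given the current one
        e = rng.multivariate_normal(np.zeros(dim), sigma)
        prev_y = c[s] + B[s] @ prev_y + e
        y[t], regimes[t] = prev_y, s
    return y, regimes

rng = np.random.default_rng(0)
dim = 4                                      # one series per age subgroup
# All parameter values below are made up for illustration.
c = np.array([[0.0] * dim, [1.0] * dim])     # regime 2 shifts the level
B = np.array([0.2 * np.eye(dim), 0.5 * np.eye(dim)])
sigma = 0.1 * np.eye(dim)
Q = np.array([[0.95, 0.30],
              [0.05, 0.70]])

y, regimes = simulate_msvar(c, B, sigma, Q, n_steps=50, rng=rng)
```

In a Bayesian treatment, a simulation of this kind would be repeated for each posterior draw of the parameters, so that the spread of the simulated paths reflects both forecast and parameter uncertainty.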
3 Main results
In this section, we apply our model to the French civilian male population by age and year (\(1\times 1\)) using mortality data from the Human Mortality Database [19] for the period of 1925–2020. In Sect. 3.1, we perform an out-of-sample forecasting test on our model using a form of rolling-window cross-validation and compare its accuracy with three other models. Then in Sect. 3.2, we also compare the forecasting precision of the \(\kappa _t^k\) and \(\ln {m_{x,t}}\). In Sect. 3.3, we give the 30-year-ahead mortality forecast of French male civilians after 2020 by our model, as well as the analysis of historical mortality structural change in 1925–2020.
Human mortality is naturally divided into four age periods. The first two correspond to a sharp initial mortality decrease, followed by a sharp increase. The third period includes a hump, corresponding to young adulthood; and the final period has continual increase. Burger et al. [6] illustrate this fact for Japanese, Swedes, hunter-gatherers, and chimps. Accordingly, we conjectured that \(K=4\) age subgroups is a good choice and found supporting evidence from Giordano et al. [15], who used tree-based methods to partition the full age range into 4 age subgroups (and found values for the cut points between age groups similar to ours).
We also examined the number of subgroups empirically. Figure 1 below shows that the training error decreases with increasing K and that values greater than 4 provide little additional benefit. More specifically, Fig. 1 plots the training residual sum of squares (RSS) of \(m_{x,t}\) against K for the training periods 1925–1975 and 1925–1990, i.e., the shortest and the longest training periods applied in the rolling-window cross-validation tests in Sect. 3.1. Both cases show a clear “elbow” at \(K=4\).
Fig. 1
Training RSS of \(m_{xt}\) for different number of age subgroups K for training periods 1925–1975 and 1925–1990
We also applied a clustering method suggested by Tsai and Cheng [39] to justify the choice of \(K=4\). Following their approach, to generate the bivariate objects for clustering, we first denote the age-specific individual time trend as \(\varvec{A_x}=\{A_{x,t}=\ln (m_{x,t}):t=T_0,T_0+1,...,T\}\) for age \(x\in \{0,1,...,95\}\), and the group time trend as \({\varvec{G}}=\{G_t=\frac{1}{m}\sum _{x=0}^{95} A_{x,t}:t=T_0,T_0+1,...,T\}\). Then a simple linear regression model with \(\varvec{A_x}\) as the dependent variable and \({\varvec{G}}\) as the independent variable is fitted for each considered age x, and its intercept and slope are denoted as \((i_x,s_x)\). The clustering method is applied to \(\{(i_x,s_x):x\in \{0,1,...,95\}\}\). Furthermore, to select the value of K, we used the R package “Nbclust” [8], which provides 30 indices for choosing the optimal number of clusters. We applied “Nbclust” with the Euclidean distance as the measure of dissimilarity and the Ward.D2 method to \(\{(i_x,s_x):x=0,1,...,95\}\), for each training period in the rolling-window cross-validation tests in Sect. 3.1 and for the training period 1925–2020 of the forecast example in Sect. 3.3. Table 1 reports, for each training period, how many indices suggest each number of clusters (i.e. K) and the optimal value of K based on the highest frequency, where we considered values of K from 3 to 8.
As the table shows, among the 16 experiments of the rolling-window cross-validation, \(K=4\) is optimal (or tied for optimal) nine times, \(K=3\) nine times and \(K=5\) three times; for the training period 1925–2020, \(K=4\) is optimal. Therefore, \(K=4\) is a reasonable choice for our paper. Tsai and Cheng [39] also mentioned that, in most of their clustering results, each cluster is composed of consecutive ages, which matches our assumption on age separation in Sect. 2.1.
Based on all the evidence described above, we chose \(K=4\) for all the experiments run on the French civilian male data in our paper.
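The construction of the bivariate objects \((i_x,s_x)\) described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy data are hypothetical, the group trend is taken as the plain average over the ages supplied, and the clustering step itself (the “Nbclust” indices with Ward.D2) is not reproduced.

```python
from statistics import mean

def ols_intercept_slope(y, x):
    """Simple linear regression of y on x; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def clustering_features(log_m):
    """log_m[x] is the list of ln m_{x,t} over the training years.

    Returns {age x: (i_x, s_x)}, the intercept and slope from regressing
    the age-specific trend A_x on the group trend G.
    """
    ages = sorted(log_m)
    n_years = len(log_m[ages[0]])
    # Group time trend G_t: average of A_{x,t} over all ages in year t
    # (assumption: the averaging constant is the number of ages).
    group = [mean(log_m[x][t] for x in ages) for t in range(n_years)]
    return {x: ols_intercept_slope(log_m[x], group) for x in ages}

# Hypothetical toy data (3 ages, 5 years); these are NOT HMD values.
toy = {
    0: [-3.0, -3.2, -3.4, -3.6, -3.8],
    1: [-6.0, -6.1, -6.2, -6.3, -6.4],
    2: [-5.0, -5.3, -5.6, -5.9, -6.2],
}
features = clustering_features(toy)
for age, (i_x, s_x) in features.items():
    print(f"age {age}: i_x = {i_x:.4f}, s_x = {s_x:.4f}")
```

A slope above 1 marks an age whose log-mortality falls faster than the group trend, and a slope below 1 marks an age that falls more slowly, which is why ages with similar \((i_x,s_x)\) are natural candidates for the same subgroup.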
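Table 1
Frequencies of K value suggested by all indices in “Nbclust” for \((i_x,s_x)\) of considered training periods and full ages 0–95
| Training period | K = 3 | K = 4 | K = 5 | K = 6 | K = 7 | K = 8 | Optimal K value based on the highest frequency |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1925–1975 | 4 | 7 | 3 | 3 | 4 | 2 | 4 |
| 1925–1976 | 6 | 6 | 4 | 1 | 1 | 5 | 3 or 4 |
| 1925–1977 | 4 | 5 | 6 | 4 | 0 | 4 | 5 |
| 1925–1978 | 4 | 6 | 6 | 4 | 1 | 2 | 4 or 5 |
| 1925–1979 | 6 | 6 | 4 | 3 | 0 | 4 | 3 or 4 |
| 1925–1980 | 6 | 6 | 5 | 3 | 0 | 3 | 3 or 4 |
| 1925–1981 | 8 | 5 | 4 | 3 | 1 | 3 | 3 |
| 1925–1982 | 7 | 9 | 2 | 1 | 1 | 3 | 4 |
| 1925–1983 | 7 | 9 | 2 | 1 | 1 | 3 | 4 |
| 1925–1984 | 5 | 6 | 6 | 4 | 0 | 2 | 4 or 5 |
| 1925–1985 | 8 | 3 | 0 | 6 | 2 | 5 | 3 |
| 1925–1986 | 8 | 6 | 3 | 1 | 1 | 5 | 3 |
| 1925–1987 | 6 | 5 | 4 | 2 | 3 | 3 | 3 |
| 1925–1988 | 7 | 6 | 6 | 1 | 1 | 3 | 3 |
| 1925–1989 | 8 | 5 | 3 | 2 | 1 | 5 | 3 |
| 1925–1990 | 3 | 8 | 0 | 6 | 2 | 4 | 4 |
| 1925–2020 | 5 | 9 | 4 | 0 | 2 | 3 | 4 |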
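The "highest frequency" rule behind the last column of Table 1 can be checked with a short script. The frequencies below are copied from Table 1; the helper name `optimal_k` is only illustrative, and ties are counted toward every tied value of K, which is the convention that yields the tallies quoted in the text.

```python
# Number of "Nbclust" indices voting for K = 3, ..., 8 (copied from Table 1).
K_VALUES = [3, 4, 5, 6, 7, 8]
TABLE_1 = {
    "1925-1975": [4, 7, 3, 3, 4, 2],
    "1925-1976": [6, 6, 4, 1, 1, 5],
    "1925-1977": [4, 5, 6, 4, 0, 4],
    "1925-1978": [4, 6, 6, 4, 1, 2],
    "1925-1979": [6, 6, 4, 3, 0, 4],
    "1925-1980": [6, 6, 5, 3, 0, 3],
    "1925-1981": [8, 5, 4, 3, 1, 3],
    "1925-1982": [7, 9, 2, 1, 1, 3],
    "1925-1983": [7, 9, 2, 1, 1, 3],
    "1925-1984": [5, 6, 6, 4, 0, 2],
    "1925-1985": [8, 3, 0, 6, 2, 5],
    "1925-1986": [8, 6, 3, 1, 1, 5],
    "1925-1987": [6, 5, 4, 2, 3, 3],
    "1925-1988": [7, 6, 6, 1, 1, 3],
    "1925-1989": [8, 5, 3, 2, 1, 5],
    "1925-1990": [3, 8, 0, 6, 2, 4],
    "1925-2020": [5, 9, 4, 0, 2, 3],
}

def optimal_k(freqs):
    """Return every K that attains the highest vote count (ties kept)."""
    best = max(freqs)
    return [k for k, f in zip(K_VALUES, freqs) if f == best]

# The 16 cross-validation training periods exclude the final 1925-2020 fit.
cv_periods = [p for p in TABLE_1 if p != "1925-2020"]
wins = {k: sum(k in optimal_k(TABLE_1[p]) for p in cv_periods) for k in K_VALUES}
print(wins)
print(optimal_k(TABLE_1["1925-2020"]))
```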
3.1 Out-of-sample forecasting accuracy comparison using rolling-window cross-validation
We applied rolling-window cross-validation to compare the out-of-sample forecasting accuracy of four models:
(1) the LCASSC model: our model described in Sects. 2.1 and 2.2,
(2) the Lee–Carter model fitted by SVD method [21] with \(\kappa _t\) assumed as a random walk with drift,
(3) the Lee–Carter model fitted by SVD method in each of the four age subgroups obtained by Sect. 2.1, with the \(\kappa _t^k\) assumed as four independent random walks with drift,
(4) BVAR with lag-2 applied to the \(\kappa _t^k\) obtained by Sect. 2.1.
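For readers unfamiliar with the benchmark used in models (2) and (3), a random walk with drift forecast can be sketched as below. This is a textbook simplification under stated assumptions, not the implementation used in the paper: increments are treated as i.i.d. Gaussian, the uncertainty in the estimated drift is ignored, and the function name and toy series are hypothetical. Model (3) would apply such a forecast to each \(\kappa _t^k\) independently.

```python
from math import sqrt
from statistics import mean, stdev

def rw_drift_forecast(kappa, horizon, z=1.96):
    """Forecast a random walk with drift h = 1..horizon steps ahead.

    Returns (drift, [(point, lower, upper), ...]). The interval half-width
    z * sigma * sqrt(h) assumes i.i.d. Gaussian increments and ignores
    drift-estimation error (a simplifying assumption).
    """
    diffs = [b - a for a, b in zip(kappa, kappa[1:])]
    drift = mean(diffs)   # equals (kappa[-1] - kappa[0]) / (len(kappa) - 1)
    sigma = stdev(diffs)  # sample standard deviation of the increments
    forecasts = []
    for h in range(1, horizon + 1):
        point = kappa[-1] + h * drift
        half_width = z * sigma * sqrt(h)
        forecasts.append((point, point - half_width, point + half_width))
    return drift, forecasts

# Hypothetical time-factor series (NOT fitted values from the paper).
kappa_toy = [10.0, 8.5, 7.5, 5.5, 4.5, 2.5, 1.0]
drift, forecasts = rw_drift_forecast(kappa_toy, horizon=3)
for h, (point, lower, upper) in enumerate(forecasts, start=1):
    print(f"h={h}: {point:.2f} [{lower:.2f}, {upper:.2f}]")
```

Because the interval width grows with \(\sqrt{h}\), such a benchmark tends to produce wide intervals at long horizons, which is relevant to the interval comparisons in Sect. 3.2.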
We applied the Sims–Zha Normal-Wishart prior for both MSBVAR in the LCASSC model and BVAR in model (4), and used similar strategies to choose the hyperparameter values. Training and testing were based on mortality data for French male civilians from 1925–2020 for ages 0–95; the testing results are shown in Tables 2 and 3. Table 2 and the highlighted part of Table 3 report the accuracy results for the full age range 0–95, whereas Table 3 also reports the average errors for each of the four age subgroups.
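The train/test splits used in this cross-validation can be enumerated as in the sketch below. The function name and default arguments are illustrative; the years are those listed in Table 2. Note that the training window always starts in 1925 and grows by one year at a time, so the scheme is an expanding-window variant of rolling-window cross-validation.

```python
def expanding_windows(start=1925, first_train_end=1975, last_year=2020, horizon=30):
    """List ((train_start, train_end), (test_start, test_end)) pairs.

    The training window always starts at `start`; its end advances one
    year at a time while a full `horizon`-year test period still fits
    before `last_year`.
    """
    windows = []
    train_end = first_train_end
    while train_end + horizon <= last_year:
        windows.append(((start, train_end), (train_end + 1, train_end + horizon)))
        train_end += 1
    return windows

windows = expanding_windows()
n_forecasts = len(windows) * 30  # one- to thirty-year-ahead forecasts per window
print(len(windows), n_forecasts)
print(windows[0], windows[-1])
```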
Table 2
One-year to thirty-year forecasting accuracy comparison of four models’ MSE of \(m_{x,t}\) for French male civilian population for full ages 0–95
| Training period | Forecast period | LCASSC | Model (2) | Model (3) | Model (4) |
| --- | --- | --- | --- | --- | --- |
| 1925–1975 | 1976–2005 | 4.88\(\times 10^{-5}\) | 1.89\(\times 10^{-4}\) | 7.26\(\times 10^{-5}\) | 3.88\(\times 10^{-4}\) |
| 1925–1976 | 1977–2006 | 9.65\(\times 10^{-5}\) | 2.02\(\times 10^{-4}\) | 7.10\(\times 10^{-5}\) | 3.73\(\times 10^{-4}\) |
| 1925–1977 | 1978–2007 | 4.22\(\times 10^{-5}\) | 1.97\(\times 10^{-4}\) | 4.79\(\times 10^{-5}\) | 2.92\(\times 10^{-4}\) |
| 1925–1978 | 1979–2008 | 5.39\(\times 10^{-5}\) | 2.05\(\times 10^{-4}\) | 6.23\(\times 10^{-5}\) | 2.69\(\times 10^{-4}\) |
| 1925–1979 | 1980–2009 | 4.22\(\times 10^{-5}\) | 2.13\(\times 10^{-4}\) | 5.13\(\times 10^{-5}\) | 2.70\(\times 10^{-4}\) |
| 1925–1980 | 1981–2010 | 3.96\(\times 10^{-5}\) | 2.31\(\times 10^{-4}\) | 5.78\(\times 10^{-5}\) | 2.77\(\times 10^{-4}\) |
| 1925–1981 | 1982–2011 | 6.69\(\times 10^{-5}\) | 2.33\(\times 10^{-4}\) | 8.80\(\times 10^{-5}\) | 1.84\(\times 10^{-4}\) |
| 1925–1982 | 1983–2012 | 4.03\(\times 10^{-5}\) | 2.24\(\times 10^{-4}\) | 5.77\(\times 10^{-5}\) | 9.48\(\times 10^{-5}\) |
| 1925–1983 | 1984–2013 | 1.01\(\times 10^{-4}\) | 2.59\(\times 10^{-4}\) | 1.09\(\times 10^{-4}\) | 1.80\(\times 10^{-4}\) |
| 1925–1984 | 1985–2014 | 7.12\(\times 10^{-5}\) | 2.46\(\times 10^{-4}\) | 6.37\(\times 10^{-5}\) | 9.25\(\times 10^{-5}\) |
| 1925–1985 | 1986–2015 | 1.23\(\times 10^{-4}\) | 2.42\(\times 10^{-4}\) | 1.16\(\times 10^{-4}\) | 8.68\(\times 10^{-5}\) |
| 1925–1986 | 1987–2016 | 6.17\(\times 10^{-5}\) | 2.56\(\times 10^{-4}\) | 8.62\(\times 10^{-5}\) | 1.11\(\times 10^{-4}\) |
| 1925–1987 | 1988–2017 | 2.80\(\times 10^{-5}\) | 2.32\(\times 10^{-4}\) | 5.12\(\times 10^{-5}\) | 8.65\(\times 10^{-5}\) |
| 1925–1988 | 1989–2018 | 2.77\(\times 10^{-5}\) | 2.26\(\times 10^{-4}\) | 4.19\(\times 10^{-5}\) | 3.91\(\times 10^{-5}\) |
| 1925–1989 | 1990–2019 | 2.74\(\times 10^{-5}\) | 2.39\(\times 10^{-4}\) | 4.36\(\times 10^{-5}\) | 3.36\(\times 10^{-5}\) |
| 1925–1990 | 1991–2020 | 2.44\(\times 10^{-5}\) | 2.26\(\times 10^{-4}\) | 4.37\(\times 10^{-5}\) | 4.76\(\times 10^{-5}\) |
Each cell of Table 2 reports the average MSE of \(m_{x,t}\) over the full age range 0–95 for the one-, two-, ..., and thirty-year-ahead forecasts based on training on the previous years’ data. For example, the second row of the table indicates that we trained the models on data from 1925–1975 and forecast values of \(m_{x,t}\) for 1976–2005. The subsequent four cells report the average MSE of the \(m_{x,t}\) estimates for the four models for full ages 0–95.
The averages of the MSEs of \(m_{x,t}\) over all 480 forecasts of each model in Table 2 are shown in the fourth row of Table 3 below; each average equals the respective column mean. To examine forecasting accuracy over different horizons, we added similar one-year to five-year and one-year to twenty-year comparisons among the four models, shown in the 2nd and 3rd rows of Table 3, respectively. For example, the 2nd row reports the average testing MSE for full ages 0–95 of the one-year to five-year \(m_{x,t}\) forecasts for the four models, based on the training periods listed in Table 2. The average testing MSEs for the full age range in Table 3 are highlighted in yellow, and we compare the four models’ general accuracy based on these full-age results.
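The link between the two tables can be verified directly: averaging each model's column in Table 2 should reproduce the full-age 1- to 30-year entries of Table 3 (reported in units of \(10^{-6}\)). The numbers below are copied from the two tables; small discrepancies are expected because the published entries are rounded to three significant figures.

```python
# Rows of Table 2 as (LCASSC, Model (2), Model (3), Model (4)), for training
# periods ending in 1975, 1976, ..., 1990.
TABLE_2 = [
    (4.88e-5, 1.89e-4, 7.26e-5, 3.88e-4),
    (9.65e-5, 2.02e-4, 7.10e-5, 3.73e-4),
    (4.22e-5, 1.97e-4, 4.79e-5, 2.92e-4),
    (5.39e-5, 2.05e-4, 6.23e-5, 2.69e-4),
    (4.22e-5, 2.13e-4, 5.13e-5, 2.70e-4),
    (3.96e-5, 2.31e-4, 5.78e-5, 2.77e-4),
    (6.69e-5, 2.33e-4, 8.80e-5, 1.84e-4),
    (4.03e-5, 2.24e-4, 5.77e-5, 9.48e-5),
    (1.01e-4, 2.59e-4, 1.09e-4, 1.80e-4),
    (7.12e-5, 2.46e-4, 6.37e-5, 9.25e-5),
    (1.23e-4, 2.42e-4, 1.16e-4, 8.68e-5),
    (6.17e-5, 2.56e-4, 8.62e-5, 1.11e-4),
    (2.80e-5, 2.32e-4, 5.12e-5, 8.65e-5),
    (2.77e-5, 2.26e-4, 4.19e-5, 3.91e-5),
    (2.74e-5, 2.39e-4, 4.36e-5, 3.36e-5),
    (2.44e-5, 2.26e-4, 4.37e-5, 4.76e-5),
]
MODELS = ["LCASSC", "Model (2)", "Model (3)", "Model (4)"]
# Full-age 1- to 30-year row of Table 3, in units of 1e-6.
TABLE_3_FULL_30 = [55.90, 226.14, 66.52, 176.55]

column_means = [
    1e6 * sum(row[j] for row in TABLE_2) / len(TABLE_2) for j in range(len(MODELS))
]
for name, computed, reported in zip(MODELS, column_means, TABLE_3_FULL_30):
    print(f"{name}: column mean {computed:.2f}, Table 3 reports {reported:.2f}")
```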
From full-age results in Table 3, the LCASSC model has the lowest average MSE of \(m_{x,t}\) for all the given forecast horizons among the four models. It shows excellent forecasting accuracy for different forecasting horizons, from short-term to long-term. For a given number of age subgroups that partition the full age range, the LCASSC model optimizes their boundary points, considers correlation among \(\kappa _t^k\) for different age subgroups, and models mortality structural changes. All of these contribute to its accurate mortality forecasting.
The full-age results show that the first step of the new model, i.e., partitioning into age subgroups and fitting Lee–Carter to each, already leads to a large improvement in forecast accuracy. Model (3), model (4), and LCASSC implement this idea, and all three significantly outperform model (2) that does not, shown by the highlighted part in Table 3. Taking as an example the 1- to 30-year forecast average MSE of \(m_{x,t}\) for the full-age range for all tests in the rolling-window cross-validation, model (3) has 29.41% of the average MSE of model (2), while model (4) has 78.07% and the LCASSC model has 24.72%.
The second step of the model is forecasting mortality. The LCASSC model, model (3), and model (4) use different methods to forecast time factors, but they all use the same age subgroups and estimated \(\alpha _x^k\), \(\beta _x^k\), \(\kappa _t^k\) (for the training period) obtained as in Sect. 2.1. So the difference in accuracy (for full-age range) in Table 3 among the three models results solely from the differing forecast methods applied to the time factors \(\kappa _t^k\).
By comparing LCASSC to model (4), we find that the consideration of mortality structural changes in LCASSC is necessary and significantly improves the forecasting accuracy. Model (4) applies BVAR with lag-2 to forecast the series \(\kappa _t^k\), whereas LCASSC models the permanent structural change of mortality (Sect. 2.2.1) as well as the recurrent structural changes by MSBVAR with lag-1 in its forecast. Both models consider correlation among the \(\kappa _t^k\), but model (4) does not account for structural changes as LCASSC does. The LCASSC model has only 63.72%, 33.94% and 31.66% of the full-age average MSE of model (4) for the 1- to 5-year, 1- to 20-year, and 1- to 30-year forecasts, respectively, as shown in the highlighted part of Table 3.
From the full-age average MSE results, LCASSC and model (3) show higher accuracy than the other two models, but LCASSC outperforms model (3): the average MSE of model (3) exceeds that of LCASSC by 11.65%, 14.11% and 18.99% for the 1- to 5-year, 1- to 20-year, and 1- to 30-year forecasts, respectively, as shown in the highlighted part of Table 3. The table also shows that the advantage of LCASSC increases for longer forecast horizons. Model (3) forecasts the \(\kappa _t^k\) as four independent random walks with drift, while our model considers their correlation (in MSBVAR) and the structural changes. The comparison between our model and model (3) shows that considering the correlation of the \(\kappa _t^k\) and mortality structural changes benefits forecasting accuracy. Our model design includes separate drifts (\(\mu _1^k\) and \(\mu _2^k\)) of \(\kappa _t^k\) for the stages before and after the permanent structural change, which refines the assumption of a single drift for \(\kappa _t^k\) in model (3). This contributes to model accuracy, especially for longer forecasting horizons.
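The relative comparisons quoted in the last few paragraphs can be recomputed from the full-age rows of Table 3. In the sketch below (helper names are illustrative), `pct_of(a, b)` expresses one MSE as a percentage of another, and `pct_excess(a, b)` gives how much a exceeds b relative to b; the figures quoted for model (3) correspond to the latter, with LCASSC as the base. Recomputed values can differ from the quoted ones in the second decimal because Table 3 entries are rounded.

```python
# Full-age average MSEs from Table 3, in units of 1e-6.
FULL_AGE = {
    "1-5":  {"LCASSC": 16.99, "M2": 89.32,  "M3": 18.97, "M4": 26.67},
    "1-20": {"LCASSC": 35.33, "M2": 158.60, "M3": 40.32, "M4": 104.10},
    "1-30": {"LCASSC": 55.90, "M2": 226.14, "M3": 66.52, "M4": 176.55},
}

def pct_of(a, b):
    """a as a percentage of b."""
    return 100.0 * a / b

def pct_excess(a, b):
    """Percentage by which a exceeds b, relative to b."""
    return 100.0 * (a - b) / b

for horizon, row in FULL_AGE.items():
    print(
        f"{horizon}-year: LCASSC is {pct_of(row['LCASSC'], row['M4']):.2f}% of model (4); "
        f"model (3) exceeds LCASSC by {pct_excess(row['M3'], row['LCASSC']):.2f}%"
    )

long_run = FULL_AGE["1-30"]
print({m: round(pct_of(long_run[m], long_run["M2"]), 2) for m in ("LCASSC", "M3", "M4")})
```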
The improvement in forecasting accuracy from step 2 of the LCASSC implementation is rather mild compared with that from step 1, but Sect. 3.2 shows that LCASSC provides more satisfactory estimates of mortality uncertainty.
In summary, the LCASSC model consistently shows the highest full-age forecasting accuracy among the four models for every forecast horizon considered, from short-term to long-term. Its accuracy advantage becomes more pronounced as the forecasting horizon increases, making our model particularly useful for long-term mortality forecasts (e.g., a 30-year horizon). Furthermore, LCASSC gives better prediction intervals than Model (3) and Model (4) (Sect. 3.2), and can provide insightful information about the structural changes and dynamics of intrinsic factors involved in mortality (Sect. 3.3).
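Table 3
Forecasting accuracy comparison of four models’ average MSE of \(m_{x,t}\) (in \(10^{-6}\)) for French male civilian population for 5 years, 20 years and 30 years forecasting horizon by different age periods
| Age period | Forecast horizon | LCASSC | Model (2) | Model (3) | Model (4) |
| --- | --- | --- | --- | --- | --- |
| Full ages 0–95 | 1- to 5-year forecast | 16.99 | 89.32 | 18.97 | 26.67 |
| Full ages 0–95 | 1- to 20-year forecast | 35.33 | 158.60 | 40.32 | 104.10 |
| Full ages 0–95 | 1- to 30-year forecast | 55.90 | 226.14 | 66.52 | 176.55 |
| Age subgroup 1 (Childhood) | 1- to 5-year forecast | 2.15 | 5.14 | 2.00 | 2.37 |
| Age subgroup 1 (Childhood) | 1- to 20-year forecast | 1.78 | 3.82 | 1.40 | 2.57 |
| Age subgroup 1 (Childhood) | 1- to 30-year forecast | 1.54 | 3.12 | 1.12 | 2.72 |
| Age subgroup 2 (teenage and young adulthood) | 1- to 5-year forecast | 0.16 | 0.13 | 0.14 | 0.13 |
| Age subgroup 2 (teenage and young adulthood) | 1- to 20-year forecast | 0.18 | 0.16 | 0.15 | 0.14 |
| Age subgroup 2 (teenage and young adulthood) | 1- to 30-year forecast | 0.24 | 0.14 | 0.16 | 0.15 |
| Age subgroup 3 (early old-age) | 1- to 5-year forecast | 6.56 | 45.33 | 7.92 | 9.70 |
| Age subgroup 3 (early old-age) | 1- to 20-year forecast | 23.57 | 88.64 | 30.22 | 40.54 |
| Age subgroup 3 (early old-age) | 1- to 30-year forecast | 38.73 | 122.59 | 50.46 | 65.68 |
| Age subgroup 4 (later old-age) | 1- to 5-year forecast | 81.29 | 420.44 | 89.48 | 127.93 |
| Age subgroup 4 (later old-age) | 1- to 20-year forecast | 152.32 | 752.57 | 174.65 | 501.67 |
| Age subgroup 4 (later old-age) | 1- to 30-year forecast | 243.04 | 1085.61 | 294.43 | 856.12 |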
To further understand the four models’ forecasting accuracy for different age periods, Table 3 also gives average testing MSEs for the four age subgroups. All 16 experiments listed in Table 2 give similar partitions of the full age range into childhood, teenage and young adulthood, early old-age and later old-age, with fitted separating ages in the ranges 8–13, 45–52 and 76–82, respectively. As in the highlighted part, each entry of Table 3 reports the average of the MSEs of \(m_{x,t}\) over all forecasts for the given age subgroup, forecast horizon and model. For example, the 5th row in Table 3 reports the average testing MSE for childhood ages of the one-year to five-year \(m_{x,t}\) forecasts for the four models, based on the training periods listed in Table 2.
For the 3rd and the 4th age subgroups, the LCASSC model has the lowest average MSE of \(m_{x,t}\) among the four models for all the given forecast horizons. This demonstrates the superior performance of our model in forecasting mortality at old ages.
For the 1st and the 2nd age subgroups, LCASSC is not the most accurate among the four models but is comparable. However, comparing the magnitude of errors across age subgroups shows that the errors in the 3rd and the 4th age subgroups contribute much more to the overall full-age average MSE than those in the first two age subgroups. Being slightly less accurate for young age groups but significantly more accurate for older age groups, LCASSC consequently has the highest full-age accuracy for all given forecast horizons.
3.2 Comparison of forecast precision
We take the last experiment of our rolling-window cross-validation forecast tests in Table 2 as the exemplar, i.e., the one with training period 1925–1990 and 30-year forecasts, to compare the forecast precision of LCASSC, Model (3) and Model (4). The three models have the same historical values of \(\kappa _t^k\) for the training period but apply different methods to forecast future values of \(\kappa _t^k\), resulting in different forecasts of \(m_{x,t}\). In this section, we compare their forecast precision for \(\ln {m_{x,t}}\) as well as for the \(\kappa _t^k\).
3.2.1 Forecast precision for \(\ln {m_{x,t}}\)
Figure 2 provides the 95% prediction intervals for the one-year through thirty-year forecasts of \(\ln m_{x,t}\) for ages {0, 10, 25, 40, 55, 70, 80, 90, 95} estimated by LCASSC, Model (3), and Model (4).
The comparison in Fig. 2 shows that LCASSC gives the best 95% prediction intervals for the \(\ln {m_{x,t}}\) forecasts. It always provides the narrowest 95% prediction intervals among the three models for all the given ages, and its prediction intervals cover all the observed \(\ln {m_{x,t}}\) values except one (269 out of 270 are covered), shown by the black curves after 1990. The advantage is especially clear for the older ages {55, 70, 80, 90, 95}, where LCASSC gives perfect 95% prediction intervals covering all the observed \(\ln {m_{x,t}}\) values with much narrower width. For example, for age 90, the width of the 95% prediction interval for the 30-year-ahead forecast from LCASSC is only 47.82% and 70.66% of that of Model (3) and Model (4), respectively. It is also worth mentioning that the 68% prediction intervals from LCASSC already cover all the observed \(\ln {m_{x,t}}\) values for the older ages {55, 70, 80, 90, 95}, shown by the red dashed lines.
Compared with LCASSC, Model (4) tends to underestimate \(\ln {m_{x,t}}\) and gives unnecessarily wide prediction intervals, for example for ages 0, 10, 55, 80 and 95. Model (3) also gives unnecessarily wide prediction intervals in old-age forecasting, especially for ages 90 and 95, and it sometimes overestimates \(\ln {m_{x,t}}\) substantially, for example for ages 70 and 80.
So far, we have shown that our model has the best 95% prediction intervals (above) and the most accurate predicted values (Sect. 3.1) for forecasts of \(\ln {m_{x,t}}\) over the full age range in general. Now, we make some observations about model performance in different age groups, as seen in Fig. 2:
For infancy and childhood (plots for ages 0 and 10), LCASSC gives good predicted values, shown by the red curves, and its 95% prediction intervals cover all of the black curves representing the observed data except at one point. By contrast, Model (3) predicts well but gives much wider prediction intervals than LCASSC in both cases; Model (4) clearly underestimates, and its prediction intervals are both shifted downward and unnecessarily wide.
For the early stage of adulthood (plots for ages 25 and 40), LCASSC tends to overestimate the value of \(\ln {m_{x,t}}\) but gives perfect 95% prediction intervals, which cover all observed data. By contrast, Model (4) underestimates and its prediction intervals are shifted downward; Model (3) gives the best predicted values, but its prediction intervals are slightly wider than those of LCASSC.
For later adulthood and old ages (plots for ages 55, 70, 80, 90, 95), LCASSC gives good predicted values and perfect 95% prediction intervals, which cover all the observed data. By contrast, Model (3) tends to overestimate for ages 70 and 80, and gives unnecessarily wide prediction intervals for all five ages, especially for ages 90 and 95; Model (4) underestimates for ages 55 and 95, and its prediction intervals are shifted downward and unnecessarily wide. This shows the advantage of our model in mortality forecasts for later adulthood and old ages.
Fig. 2
\(\ln m_{x,t}\) observed values from data, predicted values and 95% prediction intervals for one-year through thirty-year forecasts with LCASSC, Model (4), and Model (3) after training on 1925–1990 French male civilian population data. In each plot, the solid black curve shows the observed values for \(\ln m_{x,t}\) over 1925–2020; the solid red, green and blue curves represent thirty years of point-forecast values from LCASSC, Model (4), and Model (3) respectively; and the pink, green, and blue shaded parts represent the forecast 95% prediction intervals from LCASSC, Model (4), and Model (3), respectively. The red dashed lines represent the forecast 68% prediction intervals from LCASSC
Table 4
\(m_{x,t}\) observation covered rate and total area of the 95% prediction intervals of \(m_{x,t}\), on full-age and old-age, for one-year through thirty-year forecasts with LCASSC, Model (3) and Model (4) after training on 1925–1990 French male civilian population data
| Age range | Model | Data covered rate r | Prediction area A | r/A (% per area unit) |
| --- | --- | --- | --- | --- |
| Full-age (0–95) | LCASSC | 99.51% | 59.60 | 1.67 |
| Full-age (0–95) | Model (3) | 99.90% | 115.59 | 0.86 |
| Full-age (0–95) | Model (4) | 99.90% | 67.59 | 1.48 |
| Old-age (Age subgroups 3 & 4: 51–95) | LCASSC | 100.00% | 54.97 | 1.82 |
| Old-age (Age subgroups 3 & 4: 51–95) | Model (3) | 100.00% | 110.88 | 0.90 |
| Old-age (Age subgroups 3 & 4: 51–95) | Model (4) | 100.00% | 65.08 | 1.54 |
Furthermore, to assess forecast precision comprehensively, including ages not plotted in Fig. 2, Table 4 gives the covering rate r of the observed \(m_{x,t}\) values, the total area A, and the ratio of the two for the 95% prediction intervals of \(m_{x,t}\) for 1-year through 30-year forecasts with the three models, on full-age as well as on old-age. Here r/A measures the effectiveness of the prediction intervals, i.e., the covering rate achieved per unit of prediction area.
From the full-age results, all three models have excellent covering rates (over 99.5%); however, LCASSC gives the smallest total area of the 95% prediction intervals and the highest r/A ratio, which means that the 95% prediction intervals of LCASSC are the most effective among the three models. Though Model (3) and Model (4) show higher covering rates, this comes at the cost of larger prediction areas: Model (3) increases the covering rate by 0.39 percentage points over LCASSC but uses 93.94% more prediction area; Model (4) increases the covering rate by 0.39 percentage points over LCASSC but uses 13.41% more prediction area.
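The effectiveness ratio r/A and the area comparisons above follow from simple arithmetic on the full-age rows of Table 4, as the sketch below shows. The variable names are illustrative; how the area A itself is computed from the interval bounds is not restated here and is taken as given from Table 4.

```python
# Full-age rows of Table 4: (covering rate r in %, prediction area A).
TABLE_4_FULL = {
    "LCASSC": (99.51, 59.60),
    "Model (3)": (99.90, 115.59),
    "Model (4)": (99.90, 67.59),
}

# Effectiveness: covering rate per unit of prediction area.
effectiveness = {model: r / area for model, (r, area) in TABLE_4_FULL.items()}

base_r, base_area = TABLE_4_FULL["LCASSC"]
# Extra covering rate (percentage points) and extra area (%) relative to LCASSC.
coverage_gain = {m: r - base_r for m, (r, _) in TABLE_4_FULL.items()}
extra_area = {m: 100.0 * (a / base_area - 1.0) for m, (_, a) in TABLE_4_FULL.items()}

for model in TABLE_4_FULL:
    print(
        f"{model}: r/A = {effectiveness[model]:.2f}, "
        f"coverage gain = {coverage_gain[model]:.2f} points, "
        f"extra area = {extra_area[model]:.2f}%"
    )
```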
The old-age results support the conclusion in Sect. 3.1: the LCASSC model has an advantage for forecasting mortality in old-age. It uses the smallest total area of the 95% prediction intervals to provide perfect 95% prediction intervals that successfully cover all observed values of \(m_{x,t}\) for the age period 51–95 (old-age) in all the 1-year through 30-year forecasts.
3.2.2 Forecast precision of \(\kappa _t^k\)
For the exemplar studied in this Sect. 3.2, the full age range 0–95 is partitioned by {13, 50, 82} into four age subgroups, which have their associated variation of mortality over time \(\kappa _t^k\), for \(k=1, 2, 3, 4\) respectively. Figure 3 provides 95% prediction intervals for the one-year through thirty-year forecasts of the four time series \(\kappa _t^k\) estimated by three models: LCASSC, Model (3), and Model (4).
As for the predicted values of \(\kappa _t^k\), our model usually gives balanced forecasts between the predicted values given by the other two models for age periods 1–13, 50–82, and 82–95, resulting in the best accuracy of our model in mortality forecasts demonstrated by Sect. 3.1. Model (4) tends to give the lowest prediction, which explains its general underestimation in mortality stated in Sect. 3.2.1. Model (3) obviously predicts \(\kappa _t^3\) much higher than our model for ages 50–82, which may explain its overestimation of \(\ln m_{x,t}\) for ages 70 and 80 mentioned in Sect. 3.2.1. Our model predicts \(\kappa _t^2\) highest among the three models for ages 13–50, which may explain its overestimation of \(\ln m_{x,t}\) for age 40 shown in Sect. 3.2.1.
As for the width of prediction intervals for the forecasts of the \(\kappa _t^k\), LCASSC always has the narrowest, Model (3) the widest for the 1st, 2nd and 3rd age subgroups, and Model (4) the widest for the 4th age subgroup, which may explain why Model (3) and Model (4) give unnecessarily wide prediction intervals for some ages in the corresponding age subgroups, as observed in Fig. 2.
Fig. 3
Values of \(\kappa _t^k\) fit to historical data and 95% prediction intervals for one-year through thirty-year forecasts with LCASSC, Model (4), and Model (3) after training on 1925–1990 French male civilian population data. In each plot, the solid black curve shows the estimated historical values for \(\kappa _t^k\) from fitting by Sect. 2.1 methods over 1925–1990; the solid red, green and blue curves represent thirty years of point-forecast values from the LCASSC model, Model (4) and Model (3), respectively; and the pink, green and blue shaded parts represent the forecast 95% prediction intervals from LCASSC, Model (4) and Model (3), respectively
We forecast the logarithm of 2050 mortality, i.e. \(\ln m_{x,t}\), for ages 0–95 for the French male civilian population by training the LCASSC model on the period 1925–2020. Figure 4 shows the 30-year-ahead mortality forecasting results and the 68% and 95% prediction intervals (the slight discontinuities at the age-subgroup cutoffs are discussed in the closing note).
Fig. 4
Year 2050 predicted values of \(\ln m_{x,t}\), the 68% prediction interval, and the 95% prediction interval for thirty-year-ahead forecast by the LCASSC model after training on 1925–2020 French male civilian population data. The solid black curve shows the predicted values; the pink shaded part represents the forecast 68% prediction interval; the dashed lines represent the forecast 95% prediction interval
We also use this forecast as an exemplar to show the age grouping and mortality structural change results from our model fitting. By applying the method in Sect. 2.1 to data from 1925 to 2020, our model grouped the full age range 0–95 into the following four age subgroups: [0, 14], (14, 49], (49, 86], and (86, 95], each with its own variation of mortality over time \(\kappa _t^k\), where \(k=1, 2, 3, 4\). This partition roughly groups the full age range into childhood, late teenage and early adulthood, early old-age and later old-age, which matches the intuition that mortality variation over time follows a different pattern in each of these four age subgroups. The forecasting tests in Sect. 3.1 give similar partitions as well.
Moreover, the LCASSC model also reveals insightful information about mortality structural change, not only the permanent structural changes but also the recurrent structural changes.
For the permanent structural changes experienced by \(\kappa _t^k\) in each age subgroup, the fitting results are shown in Table 5. Here \(T_1^k\) is the time of permanent structural change for the specific age subgroup; \(\mu _1^k\) and \(\mu _2^k\) are the drifts of \(\kappa _t^k\) before and after the permanent structural change, respectively. For example, the mortality variation \(\kappa _t^1\) for ages [0, 14] experienced a permanent structural change in 1955; up to and including 1955, the drift of \(\kappa _t^1\) is \(-\)4.4293, while after 1955 it is \(-\)2.9876. We also ran the Chow test on \(\kappa _t^k\) separately for each age subgroup, using the fitted time of structural break from LCASSC. The p-values are \(8.5\times 10^{-7}\), 0.5, \(2.2\times 10^{-16}\), and \(1.9\times 10^{-4}\) for k from 1 to 4, respectively. This again supports the presence of a structural break point at the fitted time in each age subgroup, except for the age group (14, 49]. However, the fitted change in drift for this group is smaller, relative to the size of its drift, than for the other groups (about 15% of \(\mu _1^2\), versus roughly 33–55% for the others), and the Chow test is less powerful when the change is small, so the lack of significance for group 2 is perhaps not surprising.
Table 5
Estimated values of \(\mu _1^k\), \(\mu _2^k\), \(T_1^k\) by the LCASSC model after training on 1925–2020 French male civilian population data
| Parameter | \(\kappa _t^1\) for ages [0, 14] | \(\kappa _t^2\) for ages (14, 49] | \(\kappa _t^3\) for ages (49, 86] | \(\kappa _t^4\) for ages (86, 95] |
| --- | --- | --- | --- | --- |
| \(T_1^k\) | 1955 | 1999 | 1978 | 1962 |
| \(\mu _1^k\) | \(-\)4.4293 | \(-\)2.0532 | \(-\)1.0731 | \(-\)0.5973 |
| \(\mu _2^k\) | \(-\)2.9876 | \(-\)2.3689 | \(-\)1.6631 | \(-\)0.8048 |
Table 5 shows that the time of the permanent structural change differs considerably across age subgroups, which illustrates the necessity of dividing the full age range into subgroups, since they have different patterns for the variation of mortality over time. For each specific age subgroup, there is a significant difference between the values of \(\mu _1^k\) and \(\mu _2^k\), which shows the necessity of considering a permanent structural change in mortality and the different \(\kappa _t^k\) drifts before and after it for each age subgroup.
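To make the idea of a single permanent change in drift concrete, the sketch below locates one break in the mean of a series of increments \(\Delta \kappa _t\) by least squares. This is a simplified stand-in, not the estimation procedure of Sect. 2.2: it assumes Gaussian errors with a common variance (so that least squares coincides with maximum likelihood), it ignores the recurrent Markov-switching component, and the function name and toy series are hypothetical. The last lines compute the relative drift change for each subgroup from the values in Table 5.

```python
def fit_drift_break(diffs, min_seg=2):
    """Grid search for one break in the mean of `diffs`.

    Returns (split, mu1, mu2): diffs[:split] has mean mu1 and
    diffs[split:] has mean mu2, chosen to minimize the total
    residual sum of squares.
    """
    def rss(segment):
        m = sum(segment) / len(segment)
        return sum((v - m) ** 2 for v in segment)

    best = None
    for split in range(min_seg, len(diffs) - min_seg + 1):
        total = rss(diffs[:split]) + rss(diffs[split:])
        if best is None or total < best[0]:
            best = (total, split)
    split = best[1]
    mu1 = sum(diffs[:split]) / split
    mu2 = sum(diffs[split:]) / (len(diffs) - split)
    return split, mu1, mu2

# Hypothetical increments: drift near -4 for 30 years, then near -3 for 40 years.
wiggle = [0.1, -0.1, 0.0]
toy_diffs = [-4.0 + wiggle[i % 3] for i in range(30)] + [
    -3.0 + wiggle[i % 3] for i in range(40)
]
split, mu1, mu2 = fit_drift_break(toy_diffs)
print(split, round(mu1, 4), round(mu2, 4))

# Relative size of the fitted drift change per subgroup, from Table 5:
# k -> (T_1^k, mu_1^k, mu_2^k).
TABLE_5 = {
    1: (1955, -4.4293, -2.9876),
    2: (1999, -2.0532, -2.3689),
    3: (1978, -1.0731, -1.6631),
    4: (1962, -0.5973, -0.8048),
}
rel_change = {k: abs(m2 - m1) / abs(m1) for k, (_, m1, m2) in TABLE_5.items()}
print({k: round(v, 3) for k, v in rel_change.items()})
```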
LCASSC models recurrent structural changes in mortality by applying MSBVAR with lag-1 to the four time series \(\Delta (\kappa _r^1)_t\), \(\Delta (\kappa _r^2)_t\), \(\Delta (\kappa _r^3)_t\), \(\Delta (\kappa _r^4)_t\). Figure 5 shows that all four of them experienced recurrent structural changes in 1940 and 1951, corresponding to World War II and its aftermath. The fitted full mean transition matrix Q of the \(\Delta (\kappa _r^k)_t\) is given below and shows an estimate for the regime-switching transition matrix of the recurrent structural changes in mortality.
Fig. 5
Fitted state probabilities from two-regime MSBVAR(1) for \(\Delta (\kappa _r^k)_t\) during 1925–2020. The black curve expresses the state probability of regime 1 while the red curve expresses the state probability of regime 2
In Fig. 5, no recurrent structural change is revealed after 1951, which may result from the uncommonly large shock caused by World War II. We were therefore also interested in how the LCASSC model would capture recurrent structural changes in mortality if it were applied to a period of the same dataset that avoids World War II and its aftermath. We retrained the LCASSC model on the 1960–2020 mortality data with a one-lag, two-regime MSBVAR; the recurrent structural changes captured are shown in Fig. 6.
Our model captures two recurrent structural changes, in 1983 and 2019. (The first known case of Coronavirus disease 2019 (COVID-19) was identified in 2019 and the disease quickly spread worldwide, which could explain the switch in 2019. We might also speculate that the 1983 switch had something to do with the AIDS epidemic, which took off in France right around that time and might have changed mortality patterns in certain age groups.) Furthermore, comparing the amplitude of the fitted state probabilities in Figs. 5 and 6 shows that the fitted state probability for the state associated with World War II is nearly 100%, clearly higher than those in Fig. 6, which may imply the dominant effect of World War II in a two-regime MSBVAR fit on 1925–2020.
Fig. 6
Fitted state probabilities from two-regime MSBVAR(1) for \(\Delta (\kappa _r^k)_t\) after training LCASSC on 1960–2020 French male civilian population data. The black curve expresses the state probability of regime 1 while the red curve expresses the state probability of regime 2
Mortality has shown significant improvement in the last century, and its uncertainty is the focus of much recent research because of its importance in pricing and risk management of related financial products. The approach to mortality modeling in this paper was chosen with two main goals: to provide accurate mortality forecast and good estimation of mortality uncertainty for both short and long term horizons, and to capture and quantify the different mortality structural changes in different age subgroups, distinguishing permanent ones from recurring. Both of these goals are accomplished with the LCASSC model.
The Lee–Carter model omits the fact that different age periods experience different mortality dynamics over time, which may explain why it has the worst forecasting accuracy among the four models considered in this paper (Sect. 3.1). BVAR in Model (4) considers parameter correlation, but its forecasts tend to underestimate and produce prediction intervals that are too wide across the full age range (Fig. 2). Model (3) ignores the parameter correlation and gives unnecessarily wide prediction intervals in old-age forecasting (Fig. 2). None of Model (2), Model (3) and Model (4) considers mortality structural changes, which have been observed in the last century for many populations, and ignoring those changes leads to doubtful long-term mortality forecasts.
By considering correlation among time factors and mortality structural changes, permanent or recurring, the LCASSC model provides excellent forecast accuracy and strong estimation of forecast uncertainty for both short-term and long-term horizons (Sect. 3.1). It has the lowest average MSEs among the four models compared for all forecast horizons considered from short to long. When training on data in 1925–1990 and testing on 1- to 30-year forecasts for ages {0, 10, 25, 40, 55, 70, 80, 90, 95}, it gives 95% prediction intervals of mortality that successfully cover almost all the observed data (269 out of 270) and that have the narrowest width among the three models that apply the same time factors (Sect. 3.2). Moreover, since LCASSC models recurring structural changes by MSBVAR and permanent structural changes, it gives insightful quantitative information about those changes (Sect. 3.3).
However, we notice that in Fig. 4, the forecast of \(\ln {m_{x,t}}\) in 2050 and the corresponding prediction intervals show discontinuities at the separating age points of the four age subgroups. In future work, we would like to extend this model and carefully consider the projections of the time factors near the separating ages, so that it provides a smoother mortality forecast across ages. Alternatively, future work could refine the model to incorporate a mechanism that increases coherence between the age group series (e.g., cointegration or mean-reversion) while preserving the ability of any individual age group to experience structural change.
Declarations
Conflict of interest
The authors declare that they have no financial interests and no conflicts of interest that are relevant to the content of this article.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The variance in each term was computed as the sample variance of the observed data, not of the fitted values, i.e., \(Var[\ln m_{x,t}]=\frac{\sum _{x,t}\Big (\ln m_{x,t}-mean(\ln m_{x,t})\Big )^2}{n-1}\), where n is the total number of pairs (x, t). The \(MSE[\ln m_{x,t}]\) follows the standard definition of mean squared error, i.e., \(MSE[\ln m_{x,t}]=\frac{\sum _{x,t}(\ln m_{x,t}-\ln {\hat{m}}_{x,t})^2}{n}\), where \({\hat{m}}_{x,t}\) is the fitted value and \(m_{x,t}\) is the observed value.
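As a check on the two definitions above, the following minimal sketch computes both quantities on a small synthetic age-by-year grid. The function names and the numbers are illustrative assumptions, not taken from the paper. Note the different denominators: n − 1 for the sample variance and n for the MSE.

```python
import numpy as np

def sample_variance(log_m):
    """Sample variance of observed ln m over all (x, t) pairs, denominator n - 1."""
    x = np.asarray(log_m, dtype=float).ravel()
    return float(np.sum((x - x.mean()) ** 2) / (x.size - 1))

def mse(log_m, log_m_hat):
    """Mean squared error between observed and fitted ln m, denominator n."""
    obs = np.asarray(log_m, dtype=float).ravel()
    fit = np.asarray(log_m_hat, dtype=float).ravel()
    return float(np.sum((obs - fit) ** 2) / obs.size)

# Synthetic 2 ages x 2 years grid (hypothetical values).
log_m = np.array([[-4.0, -4.2], [-2.0, -2.2]])      # observed ln m
log_m_hat = np.array([[-4.1, -4.1], [-2.1, -2.1]])  # fitted ln m

print(sample_variance(log_m), mse(log_m, log_m_hat))
```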
Slight discontinuities are observed at the age-subgroup cutoffs. The age cutoffs were fit by the model and are therefore subject to estimation error; additionally, one expects them to drift somewhat from year to year. One way to create a smoother picture that accounts for this fuzziness is to fit the model several times with slight changes to the age cutoffs. For instance, for the French data considered here, the model selected age 14 as a cutoff, but the modeler could re-fit the data using cutoffs at ages 13 and 15. The point estimates of mortality can then be computed as the average of the estimates obtained with the varied cutoffs.
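The averaging idea can be sketched as follows. This is an illustration under stated assumptions, not the LCASSC fitting procedure: `fit_piecewise` is a hypothetical toy stand-in that fits a separate straight line to each side of a single cutoff, the convention that the cutoff age belongs to the upper subgroup is assumed, and the data are synthetic.

```python
import numpy as np

def fit_piecewise(ages, log_m, cutoff):
    """Toy stand-in for a model fit: one straight line below `cutoff`, one from it upward."""
    fitted = np.empty_like(log_m, dtype=float)
    for mask in (ages < cutoff, ages >= cutoff):
        slope, intercept = np.polyfit(ages[mask], log_m[mask], deg=1)
        fitted[mask] = slope * ages[mask] + intercept
    return fitted

def averaged_estimate(ages, log_m, cutoffs):
    """Average the point estimates obtained from re-fitting with each cutoff."""
    fits = np.stack([fit_piecewise(ages, log_m, c) for c in cutoffs])
    return fits.mean(axis=0)

# Synthetic ln m curve with a change of slope at age 14 (hypothetical values).
ages = np.arange(0, 31)
log_m = np.where(ages < 14, -6.0 - 0.15 * ages, -8.5 + 0.08 * (ages - 14))

smoothed = averaged_estimate(ages, log_m, cutoffs=[13, 14, 15])
```

Averaging changes the estimates mainly near the cutoff, where the re-fits disagree; away from it the fits largely agree and the average stays close to each of them.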
1.
Bardoutsos A, de Beer J, Janssen F (2018) Projecting delay and compression of mortality. Genus 74:17CrossRef
2.
Booth H, Maindonald J, Smith L (2002) Applying Lee–Carter under conditions of variable mortality decline. Popul Stud 56(3):325–336CrossRef
3.
Booth H, Hyndman RJ, Tickle L, Jong PD (2006) Lee–Carter mortality forecasting: a multi-country comparison of variants and extensions. Demogr Res 15:289–310CrossRef
4.
Börger M, Schupp J (2018) Modeling trend processes in parametric mortality models. Insur Math Econ 78:369–380MathSciNetCrossRefMATH
5.
Brandt PT (2012) Markov-switching, Bayesian vector autoregression models-package “MSBVAR”. The R Foundation for Statistical Computing
6.
Burger O, Baudisch A, Vaupel JW (2012) Human mortality improvement in evolutionary context. Proc Natl Acad Sci 109:18210–18214CrossRef
7.
Cairns A, Blake D, Dowd K, Coughlan GD, Epstein D, Ong A, Balevich I (2009) A quantitative comparison of stochastic mortality models using data from England and Wales and the United States. North Am Actuar J 13(1):1–35. https://doi.org/10.1080/10920277.2009.10597538MathSciNetCrossRef
8.
Charrad M, Ghazzali N, Boiteau V, Niknafs A (2014) Nbclust: An R package for determining the relevant number of clusters in a data set. J Stat Softw 61(6):1–36CrossRef
9.
Coelho E, Nunes LC (2011) Forecasting mortality in the event of a structural change. J R Stat Soc Ser A. Stat Soc 174(3):713–736MathSciNetCrossRef
10.
Danesi IL, Haberman S, Millossovich P (2015) Forecasting mortality in subpopulations using Lee–Carter type models: a comparison. Insur Math Econ 62:151–161MathSciNetCrossRefMATH
11.
Fu W, Smith B, Brewer P, Droms S (2022) A new mortality framework to identify trends and structural changes in mortality improvement and its application in forecasting. Risks 10(8):161. https://doi.org/10.3390/risks10080161CrossRef
12.
Fu W, Smith B, Brewer P, Droms S (2023) Markov-switching bayesian vector autoregression model in mortality forecasting. Risks 11(9):152CrossRef
Gao H, Mamon R, Liu X, Tenyakov A (2015) Mortality modelling with regime-switching for the valuation of a guaranteed annuity option. Insur Math Econ 63:108–20MathSciNetCrossRefMATH
15.
Giordano G, Haberman S, Russolillo M (2019) Coherent modeling of mortality patterns for age-specific subgroups. Decis Econ Finance 42:189–204MathSciNetCrossRefMATH
16.
Guibert Q, Lopez O, Piette P (2019) Forecasting mortality rate improvements with a high-dimensional VAR. Insur Math Econ 88:255–72MathSciNetCrossRefMATH
17.
Gylys R, Šiaulys J (2020) Estimation of uncertainty in mortality projections using state-space Lee–Carter model. Mathematics 8:1053CrossRef
18.
Hainaut D (2012) Multidimensional Lee–Carter model with switching mortality processes. Insur Math Econ 50:236–246MathSciNetCrossRefMATH
19.
Human Mortality Database (n.d.) University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany). Available online: www.mortality.org (accessed on 25 Aug 2022)
20.
Ignatieva K, Song A, Ziveyi J (2016) Pricing and hedging of guaranteed minimum benefits under regime-switching and stochastic mortality. Insur Math Econ 70:286–300MathSciNetCrossRefMATH
21.
Lee R, Carter LR (1992) Modeling and forecasting U.S. mortality. J Am Stat Assoc 87:659–671
Litterman RB (1986) Forecasting with Bayesian vector autoregressions: five years of experience. J Bus Econ Stat 4:25–38
27.
Liu X, Yu H (2011) Assessing and extending the Lee–Carter model for long-term mortality prediction. In: SOA Living to 100 Symposium, January 5–7, Orlando, Florida (2011)
Milidonis A, Lin Y, Cox SH (2011) Mortality regimes and Pricing. North Am Actuar J 15:266–89MathSciNetCrossRefMATH
30.
Mullen KM, Ardia D, Gil DL, Windover D, Cline J (2011) DEoptim: an R package for global optimization by differential evolution. J Stat Softw 40:1–26CrossRef
31.
Njenga CN, Sherris M (2020) Modeling mortality with a Bayesian vector autoregression. Insur Math Econ 94:40–57MathSciNetCrossRefMATH
Renshaw AE, Haberman S (2006) A cohort-based extension to the Lee–Carter model for mortality reduction factors. Insur Math Econ 38:556–70CrossRefMATH
34.
Robertson J, Tallman E (1999) Vector autoregressions: forecasting and reality. Econ Rev 84:4–18
35.
Shen Y, Siu T (2013) Longevity bond pricing under stochastic interest rate and mortality with regime-switching. Insur Math Econ 52:114–23MathSciNetCrossRefMATH
Wilmoth JR, Andreev KF, Jdanov D, Glei DA (2007) Methods protocol for the human mortality database. University of California, Berkeley, and Max Planck Institute for Demographic Research, Rostock
42.
Zhou R (2019) Modelling mortality dependence with regime-switching copulas. ASTIN Bull: J IAA 49(2):373–407MathSciNetCrossRef