Statistical analysis of extreme events in a non-stationary context via a Bayesian framework: case study with peak-over-threshold data

  • Original Paper
Stochastic Environmental Research and Risk Assessment

Abstract

Statistical analysis of extremes currently assumes that data arise from a stationary process, although such a hypothesis is not easily assessable and should therefore be considered as an uncertainty. The aim of this paper is to describe a Bayesian framework for this purpose, considering several probabilistic models (stationary, step-change and linear trend models) and four extreme value distributions (exponential, generalized Pareto, Gumbel and GEV). Prior distributions are specified by using regional prior knowledge about quantiles. Posterior distributions are used to estimate parameters, quantify the probability of models and derive a realistic frequency analysis, which takes into account estimation, distribution and stationarity uncertainties. MCMC methods are needed for this purpose, and are described in the article. Finally, an application to a POT discharge series is presented, with an analysis of both the occurrence process and the peak distribution.

References

  • Berger JO (1985) Statistical decision theory and Bayesian analysis. Springer, Berlin Heidelberg New York, p 617

  • Chib S (1995) Marginal likelihood from the Gibbs output. J Am Stat Assoc 90:1313–1321

  • Coles S, Pericchi L (2003) Anticipating catastrophes through extreme value modelling. J R Stat Soc Ser C-Appl Stat 52:405–416

  • Coles SG, Powell EA (1996) Bayesian methods in extreme value modelling: a review and new developments. Int Stat Rev 64:119–136

  • Coles SG, Tawn JA (1996) A Bayesian analysis of extreme rainfall data. J R Stat Soc Ser C-Appl Stat 45:463–478

  • Coles S, Pericchi LR, Sisson S (2003) A fully probabilistic approach to extreme rainfall modelling. J Hydrol 273:35–50

  • Cooley D (2005) Statistical analysis of extremes motivated by weather and climate studies: applied and theoretical advances. University of Colorado. 122 p

  • Cooley D, Nychka D, Naveau P (2005) A spatial Bayesian hierarchical model for a precipitation return levels map. In: Extreme value analysis, Gothenburg, Sweden

  • CTGREF (1980–1982) Srae, S.H. Diame. Synthèse nationale sur les crues des petits bassins versants. Fascicule 2: la méthode Socose; Information Technique no 38–2 (Juin 1980); Fascicule 3: la méthode Crupedix

  • Cunderlik JM, Burn DH (2003) Non-stationary pooled frequency analysis. J Hydrol 276:210–223

  • Diebolt J, El-Aroui MA, Garrido M, Girard S (2003) Quasi-conjugate Bayes estimates for GPD parameters and application to heavy tails modelling. Rapport de recherche INRIA. 29 p

  • Favre AC, El Adlouni S, Perreault L, Thiemonge N, Bobee B (2004) Multivariate hydrological frequency analysis using copulas. Water Resour Res 40, W01101, DOI 10.1029/2003WR002456

  • Fisher RA, Tippett LH (1928) Limiting forms of the frequency distribution of the largest or smallest member of a sample. Cambridge Philos Soc 24:180–190

  • Galéa G, Prudhomme C (1997) Notions de base et concepts utiles pour la compréhension de la modélisation synthétique des régimes de crue des bassins versants au sens des modèles QdF. Revue des Sciences de l’Eau 1:83–101

  • Gelman A, Carlin JB, Stern HS, Rubin DB (1995) Bayesian data analysis. Chapman & Hall, London, 526 p

  • GREHYS (1996) Presentation and review of some methods for regional flood frequency analysis. J Hydrol 186:63–84

  • Gumbel EJ (1958) Statistics of extremes. Columbia University Press, New York, pp 375

  • Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97–109

  • IPCC (2001) Climate change 2001: synthesis report. Cambridge University Press, Cambridge, pp 408

  • Javelle P, Grésillon JM, Galéa G (1999) Discharge-duration-frequency curves modeling for floods and scale invariance. Comptes Rendus de l’Académie des Sciences, Sciences de la terre et des planètes 329:39–44

  • Javelle P, Ouarda T, Lang M, Bobee B, Galéa G, Grésillon JM (2002) Development of regional flood-duration-frequency curves based on the index-flood method. J Hydrol 258:249–259

  • Kass RE, Raftery AE (1995) Bayes factors. J Am Stat Assoc 90:773–795

  • Katz RW, Parlange MB, Naveau P (2002) Statistics of extremes in hydrology. Adv Water Resour 25:1287–1304

  • Lang M (1999) Theoretical discussion and Monte-Carlo simulations for a Negative Binomial process paradox. Stoch Environ Res Risk Assess 13:183–200

  • Lang M, Ouarda TBMJ, Bobée B (1999) Towards operational guidelines for over-threshold modeling. J Hydrol 225:103–117

  • Madsen H, Mikkelsen PS, Rosbjerg D, Harremoes P (2002) Regional estimation of rainfall intensity-duration-frequency curves using generalized least squares regression of partial duration series statistics. Water Resour Res 38:1239

  • Metropolis N, Ulam S (1949) The Monte Carlo method. J Am Stat Assoc 44:335–341

  • Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1092

  • Parent E, Bernier J (2003) Encoding prior experts judgments to improve risk analysis of extreme hydrological events via POT modeling. J Hydrol 283:1–18

  • Perreault L (2000) Analyse bayésienne rétrospective d’une rupture dans les séquences de variables aléatoires hydrologiques. ENGREF, INRS-Eau. 200 p

  • Perreault L, Fortin V (2003) Mixture and Hidden Markov models for peak flow analysis. In: Seizièmes entretiens du centre Jacques Cartier, Lyon, France

  • Perreault L, Bernier J, Bobee B, Parent E (2000a) Bayesian change-point analysis in hydrometeorological time series. Part 1. The normal model revisited. J Hydrol 235:221–241

  • Perreault L, Bernier J, Bobee B, Parent E (2000b) Bayesian change-point analysis in hydrometeorological time series. Part 2. Comparison of change-point models and forecasting. J Hydrol 235:242–263

  • Perreault L, Parent E, Bernier J, Bobee B, Slivitzky M (2000c) Retrospective multivariate Bayesian change-point analysis: a simultaneous single change in the mean of several hydrological sequences. Stoch Environ Res Risk Assess 14:243–261

  • Pickands J (1975) Statistical inference using extreme order statistics. Ann Stat 3:119–131

  • Prudhomme C (1995) Modèles synthétiques des connaissances en hydrologie. Université Montpellier II, CEMAGREF Lyon. 400 p, Montpellier

  • Ray BK, Tsay RS (2002) Bayesian methods for change-point detection in long-range dependent processes. J Time Ser Anal 23:687–705

  • Reed DW (1999) Flood estimation handbook. Vol 1: Overview. I. o. Hydrology. 108 p. Wallingford

  • Reis DS, Stedinger JR (2005) Bayesian MCMC flood frequency analysis with historical information. J Hydrol 313:97–116

  • Ritter C, Tanner MA (1992) Facilitating the Gibbs Sampler—the Gibbs Stopper and the Griddy-Gibbs Sampler. J Am Stat Assoc 87:861–868

  • Robert CP, Ryden T, Titterington DM (2000) Bayesian inference in hidden Markov models through the reversible jump Markov chain Monte Carlo method. J R Stat Soc Ser B-Stat Method 62:57–75

  • Rosbjerg D, Madsen H (2004) Advanced approaches in PDS/POT modelling of extreme hydrological events. In: Hydrology: science practice for the 21st century, B. H. Society, London, pp 217–221

  • Strupczewski WG, Kaczmarek Z (2001a) Non-stationary approach to at-site flood frequency modelling II. Weighted least squares estimation. J Hydrol 248:143–151

  • Strupczewski WG, Singh VP, Feluch W (2001b) Non-stationary approach to at-site flood frequency modelling I. Maximum likelihood estimation. J Hydrol 248:123–142

  • Tanner MA (1996) Tools for statistical inference. Springer, Berlin Heidelberg New York, p 208

Acknowledgements

This work was conducted as part of a national program of hydrological research (PNRH), which associates Cemagref (Lyon), LTHE (Grenoble), Hydrosciences (Montpellier), Meteo France (Toulouse), and Electricité de France (EDF Chatou and Grenoble). The authors would like to thank all members of this project. The financial support provided by Cemagref and EDF for the PhD research of B. Renard is gratefully acknowledged. We also thank two anonymous reviewers for their helpful comments.

Author information

Corresponding author

Correspondence to Benjamin Renard.

Appendices

Annex 1: Griddy Gibbs sampling

The aim of this method is to simulate a sample from the posterior distribution \(p (\varvec{\theta} | \user2{X}).\) The Gibbs sampling algorithm can be written as follows:

  • Choose a starting value \({\varvec{\theta}}^{{(0)}} = {\left({\theta ^{{(0)}}_{1},\ldots,\theta ^{{(0)}}_{k}} \right)},\) and set j = 0.

  • Repeat N times:

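    • j = j + 1;

    • Sample \(\theta_1^{(j)}\) from \(p(\theta_1 | \theta_2^{(j-1)},\ldots,\theta_k^{(j-1)},\user2{X}),\)

    • Sample \(\theta_2^{(j)}\) from \(p(\theta_2 | \theta_1^{(j)},\theta_3^{(j-1)},\ldots,\theta_k^{(j-1)},\user2{X}),\)

    • ...

    • Sample \(\theta_q^{(j)}\) from \(p(\theta_q | \theta_1^{(j)},\ldots,\theta_{q-1}^{(j)},\theta_{q+1}^{(j-1)},\ldots,\theta_k^{(j-1)},\user2{X}),\)

    • ...

    • Sample \(\theta_k^{(j)}\) from \(p(\theta_k | \theta_1^{(j)},\ldots,\theta_{k-1}^{(j)},\user2{X}).\)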

The series of vectors \((\varvec{\theta}^{(j)})\) converges in distribution to the target posterior distribution as j tends to infinity. To reduce the influence of the starting point, the first m iterations are usually discarded, and inference is based on the last N − m iterations. A sensitivity analysis may be necessary to determine acceptable values for m and N (usually, at least a few thousand iterations are used for each). More generally, this method shares the drawbacks of other iterative simulation techniques: convergence has to be monitored, by running the chain from several starting points or by computing convergence indices. Gelman et al. (1995) and Tanner (1996) provide guidelines for improving the efficiency of the numerical simulations.
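As an illustration of the iteration and of the burn-in step, the following minimal sketch applies Gibbs sampling to a toy target that is not taken from the case study: a standard bivariate normal distribution with correlation `rho`, whose full conditionals are known normal distributions. The function name, the toy target and the values of N (`n_iter`) and m (`burn_in`) are assumptions chosen for the example.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter, burn_in, start, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho (toy target)."""
    rng = np.random.default_rng(seed)
    cond_sd = np.sqrt(1.0 - rho**2)       # sd of each full conditional
    theta = np.array(start, dtype=float)  # theta^(0)
    chain = np.empty((n_iter, 2))
    for j in range(n_iter):
        # theta_1^(j) ~ p(theta_1 | theta_2^(j-1))
        theta[0] = rng.normal(rho * theta[1], cond_sd)
        # theta_2^(j) ~ p(theta_2 | theta_1^(j))
        theta[1] = rng.normal(rho * theta[0], cond_sd)
        chain[j] = theta
    # Discard the first m = burn_in iterations, keep the last N - m
    return chain[burn_in:]

# Deliberately poor starting point, to show that burn-in removes its influence
sample = gibbs_bivariate_normal(rho=0.8, n_iter=20000, burn_in=2000, start=(10.0, -10.0))
print(sample.mean(axis=0), np.corrcoef(sample.T)[0, 1])
```

Running the function from several values of `start` and comparing the retained samples is one simple way to monitor convergence.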

Unfortunately, the Gibbs sampling algorithm usually cannot be used in this raw form, because it requires sampling from the full conditional densities \(p(\theta_q | \theta_1^{(j)},\ldots,\theta_{q-1}^{(j)},\theta_{q+1}^{(j-1)},\ldots,\theta_k^{(j-1)},\user2{X}).\) Ritter and Tanner (1992) proposed the use of a discrete approximation of the cumulative distribution functions (CDF) of these distributions:

  • Choose a grid of points \(y_1,\ldots,y_p.\)

  • Evaluate \(p(\theta_i | \theta_1,\ldots,\theta_{i-1},\theta_{i+1},\ldots,\theta_k,\user2{X})\) on this grid, to obtain \(w_1,\ldots,w_p.\)

  • Compute the cumulative sums of \(w_1,\ldots,w_p\) to obtain an approximation of the CDF.

  • Sample u from a uniform distribution on [0, 1].

  • Transform u by the inverse of the approximate CDF.

This algorithm must be applied at each step of the Gibbs iteration. If the full conditional densities are known only up to proportionality, \(p(\theta_q | \theta_1^{(j)},\ldots,\theta_{q-1}^{(j)},\theta_{q+1}^{(j-1)},\ldots,\theta_k^{(j-1)},\user2{X})\) evaluated at a grid point y can be replaced by \(f(\theta_1^{(j)},\ldots,\theta_{q-1}^{(j)},y,\theta_{q+1}^{(j-1)},\ldots,\theta_k^{(j-1)}),\) the product of prior and likelihood, and the cumulative sums of \(w_1,\ldots,w_p\) must then be divided by the total sum to provide the CDF approximation. Inversion of the CDF can be performed by linear interpolation between two grid points. The choice of grid is the most important issue with this technique: it has to be broad enough to cover the range of the distributions, and fine enough to ensure sufficient accuracy, keeping in mind that this additional step is computationally expensive. Some improvements of the method are described in Ritter and Tanner (1992).
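The grid-based draw can be sketched as follows. This is only an illustration under stated assumptions: the helper name `griddy_draw`, the grid bounds and the test density (an unnormalized standard normal) are hypothetical and do not come from the case study.

```python
import numpy as np

def griddy_draw(unnorm_density, grid, rng, size=1):
    """Draw from a 1-D density known up to a constant, using a grid-based CDF."""
    w = unnorm_density(grid)              # w_1, ..., w_p evaluated on the grid
    cdf = np.cumsum(w) / np.sum(w)        # cumulative sums divided by the total sum
    u = rng.uniform(0.0, 1.0, size=size)  # u ~ U[0, 1]
    # Inverse CDF by linear interpolation between grid points
    # (values of u below cdf[0] are mapped to the first grid point).
    return np.interp(u, cdf, grid)

rng = np.random.default_rng(0)
grid = np.linspace(-6.0, 6.0, 1201)       # broad and fine grid for this toy density
draws = griddy_draw(lambda y: np.exp(-0.5 * y**2), grid, rng, size=20000)
print(draws.mean(), draws.std())
```

Within a Gibbs iteration, `unnorm_density` would be the function of y obtained by fixing all other components of f at their current values, and a single value (`size=1`) would be drawn for each component in turn.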

Annex 2: Chib method

The aim of this method is to compute the marginal density of the observations (the marginal likelihood), which is the normalizing constant in Bayes' theorem:

$$p({\user2{X}}| M) = \frac{{\pi ({\varvec{\theta }})p({\user2{X}}| {\varvec{\theta }})}}{{p({\varvec{\theta}}| {\user2{X}})}} = \frac{{f({\varvec{\theta}})}}{{p({\varvec{\theta}}| {\user2{X}})}}.$$
(35)

Since this relationship holds for any vector \(\varvec{\theta},\) let us consider a particular \({\varvec{\theta}}^{*} = (\theta ^{*}_{1},\ldots,\theta ^{*}_{k}).\) The numerator \(f (\varvec{\theta}^{*})\) is directly computable, which is not the case for the denominator. Consider the following relationship:

$$\begin{aligned} p(\theta ^{*}_{1},\ldots,\theta ^{*}_{k} | {\user2{X}}) &= p(\theta ^{*}_{1} | {\user2{X}})p(\theta ^{*}_{2} | \theta ^{*}_{1}, {\user2{X}}) \cdots p(\theta ^{*}_{q} | \theta ^{*}_{1}, \theta ^{*}_{2},\ldots,\theta ^{*}_{{q - 1}}, {\user2{X}}) \\ &\quad \cdots p(\theta ^{*}_{k} | \theta ^{*}_{1},\ldots,\theta ^{*}_{{k - 1}},{\user2{X}}).\\ \end{aligned}$$
(36)

The first term can be estimated from the simulated sample of the first marginal distribution, for example with a Gaussian kernel density estimate. The last term can be computed by 1-D numerical integration:

$$p(\theta ^{*}_{k} | \theta ^{*}_{1},\ldots,\theta ^{*}_{{k - 1}},{\user2{X}}) = \frac{{f({\varvec{\theta}}^{*})}}{{{\int {f(\theta ^{*}_{1},\ldots,\theta ^{*}_{{k - 1}},\theta _{k}){\rm d}}}\theta _{k}}}.$$
(37)
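Identity (35) can be checked numerically in a case where every term is available in closed form. The sketch below assumes a toy conjugate model that is not part of the paper: observations \(X_i \sim N(\theta, \sigma^2)\) with known \(\sigma\) and a normal prior on \(\theta\). All numerical values and names are hypothetical.

```python
import numpy as np

def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((x - mean) / sd) ** 2

rng = np.random.default_rng(1)
sigma, mu0, tau0 = 2.0, 0.0, 5.0       # assumed known sd and prior N(mu0, tau0^2)
x = rng.normal(3.0, sigma, size=20)    # synthetic observations
n = x.size

# Conjugate posterior of theta: N(mu_n, tau_n2)
tau_n2 = 1.0 / (1.0 / tau0**2 + n / sigma**2)
mu_n = tau_n2 * (mu0 / tau0**2 + x.sum() / sigma**2)

def log_marginal_via_identity(theta_star):
    """log p(X | M) = log f(theta*) - log p(theta* | X), i.e. Eq. (35) in logs."""
    log_f = log_normal_pdf(theta_star, mu0, tau0) + log_normal_pdf(x, theta_star, sigma).sum()
    log_post = log_normal_pdf(theta_star, mu_n, np.sqrt(tau_n2))
    return log_f - log_post

# Exact value: marginally, X ~ N(mu0 * 1, sigma^2 I + tau0^2 * 1 1^T)
cov = sigma**2 * np.eye(n) + tau0**2 * np.ones((n, n))
resid = x - mu0
_, logdet = np.linalg.slogdet(cov)
log_exact = -0.5 * (n * np.log(2 * np.pi) + logdet + resid @ np.linalg.solve(cov, resid))

for theta_star in (mu_n, 0.0, 10.0):
    print(theta_star, log_marginal_via_identity(theta_star), log_exact)
```

The three evaluations agree with the exact value up to rounding, which illustrates that the identity holds for any \(\theta^{*}\). The difficulty in practice is that the posterior density in the denominator is not available in closed form.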

Griddy Gibbs sampling can be used to compute the intermediate terms. Indeed, \(p(\theta^{*}_{q} | \theta^{*}_{1},\theta^{*}_{2},\ldots,\theta^{*}_{q-1},\user2{X})\) is the first marginal of the distribution \(p(\theta_{q},\theta_{q+1},\ldots,\theta_{k} | \theta^{*}_{1},\theta^{*}_{2},\ldots,\theta^{*}_{q-1},\user2{X}),\) evaluated at \(\theta^{*}_{q}.\) The griddy Gibbs algorithm can thus be applied to the non-normalized posterior density with the first q − 1 components fixed, that is \(f(\theta^{*}_{1},\theta^{*}_{2},\ldots,\theta^{*}_{q-1},\theta_{q},\theta_{q+1},\ldots,\theta_{k}).\)

Although this approach is theoretically valid for any value \(\varvec{\theta}^{*}\) with non-zero posterior density, Chib (1995) recommends the use of a high-density point to improve the accuracy of the method.
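The whole procedure can be sketched for k = 2, where only the first and last terms of Eq. (36) appear and no intermediate griddy Gibbs run is needed. The example rests on several assumptions that are not from the paper: the non-normalized posterior `log_f` is a bivariate normal density multiplied by a known constant (so that the true answer `log_c` is available for comparison), direct draws from that normal stand in for MCMC output, and the kernel bandwidth follows a common rule of thumb.

```python
import numpy as np

rho, log_c = 0.6, -12.3   # hypothetical correlation and "true" log marginal likelihood

def log_f(t1, t2):
    """Non-normalized log posterior: log c + log bivariate normal density (toy)."""
    quad = (t1**2 - 2 * rho * t1 * t2 + t2**2) / (1 - rho**2)
    return log_c - np.log(2 * np.pi * np.sqrt(1 - rho**2)) - 0.5 * quad

rng = np.random.default_rng(2)
cov = np.array([[1.0, rho], [rho, 1.0]])
draws = rng.multivariate_normal(np.zeros(2), cov, size=20000)  # stand-in for MCMC output

theta_star = draws.mean(axis=0)   # a high-density point

# First term: Gaussian kernel estimate of p(theta_1* | X) from the marginal sample
s = draws[:, 0]
h = 1.06 * s.std() * s.size ** (-0.2)   # rule-of-thumb bandwidth (an assumption)
z = (theta_star[0] - s) / h
log_p1 = np.log(np.mean(np.exp(-0.5 * z**2)) / (h * np.sqrt(2 * np.pi)))

# Last term: Eq. (37), 1-D trapezoidal integration over theta_2 with theta_1 fixed
grid = np.linspace(-8.0, 8.0, 4001)
vals = np.exp(log_f(theta_star[0], grid))
integral = np.sum((vals[1:] + vals[:-1]) * np.diff(grid)) / 2
log_p2 = log_f(theta_star[0], theta_star[1]) - np.log(integral)

# Eq. (35) in logs, with the denominator factorized as in Eq. (36)
log_marginal = log_f(theta_star[0], theta_star[1]) - log_p1 - log_p2
print(log_marginal, log_c)
```

The estimate differs from `log_c` mainly through the kernel estimate of the first term. With real likelihoods, it is usually safer to subtract the maximum of the log density before exponentiating, to avoid numerical underflow.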

About this article

Cite this article

Renard, B., Lang, M. & Bois, P. Statistical analysis of extreme events in a non-stationary context via a Bayesian framework: case study with peak-over-threshold data. Stoch Environ Res Ris Assess 21, 97–112 (2006). https://doi.org/10.1007/s00477-006-0047-4
