Statistical analysis of extreme events in a non-stationary context via a Bayesian framework: case study with peak-over-threshold data

Renard, Benjamin; Lang, Michel; Bois, Philippe

doi:10.1007/s00477-006-0047-4

Original Paper
Published: 01 April 2006

Volume 21, pages 97–112, (2006)

Stochastic Environmental Research and Risk Assessment

Benjamin Renard¹,
Michel Lang¹ &
Philippe Bois²

Abstract

Statistical analysis of extremes currently assumes that data arise from a stationary process, although such a hypothesis is not easily assessable and should therefore be considered as an uncertainty. The aim of this paper is to describe a Bayesian framework for this purpose, considering several probabilistic models (stationary, step-change and linear trend models) and four extreme value distributions (exponential, generalized Pareto, Gumbel and GEV). Prior distributions are specified by using regional prior knowledge about quantiles. Posterior distributions are used to estimate parameters, quantify the probability of models and derive a realistic frequency analysis, which takes into account estimation, distribution and stationarity uncertainties. MCMC methods are needed for this purpose, and are described in the article. Finally, an application to a POT discharge series is presented, with an analysis of both the occurrence process and the peak distribution.

References

Berger JO (1985) Statistical decision theory and Bayesian analysis. Springer, Berlin Heidelberg New York, p 617
Chib S (1995) Marginal likelihood from the Gibbs output. J Am Stat Assoc 90:1313–1321
Coles S, Pericchi L (2003) Anticipating catastrophes through extreme value modelling. J R Stat Soc Ser C-Appl Stat 52:405–416
Coles SG, Powell EA (1996) Bayesian methods in extreme value modelling: a review and new developments. Int Stat Rev 64:119–136
Coles SG, Tawn JA (1996) A Bayesian analysis of extreme rainfall data. J R Stat Soc Ser C-Appl Stat 45:463–478
Coles S, Pericchi LR, Sisson S (2003) A fully probabilistic approach to extreme rainfall modelling. J Hydrol 273:35–50
Cooley D (2005) Statistical analysis of extremes motivated by weather and climate studies: applied and theoretical advances. University of Colorado. 122 p
Cooley D, Nychka D, Naveau P (2005) A spatial Bayesian hierarchical model for a precipitation return levels map. In: Extreme value analysis, Gothenburg, Sweden
CTGREF (1980–1982) Srae, S.H. Diame. Synthèse nationale sur les crues des petits bassins versants. Fascicule 2: la méthode Socose; Information Technique no 38–2 (Juin 1980); Fascicule 3: la méthode Crupedix
Cunderlik JM, Burn DH (2003) Non-stationary pooled frequency analysis. J Hydrol 276:210–223
Diebolt J, El-Aroui MA, Garrido M, Girard S (2003) Quasi-conjugate Bayes estimates for GPD parameters and application to heavy tails modelling. Rapport de recherche INRIA. 29 p
Favre AC, El Adlouni S, Perreault L, Thiemonge N, Bobee B (2004) Multivariate hydrological frequency analysis using copulas. Water Resour Res 40, WO1101, DOI 10.1029/2003WR002456
Fisher RA, Tippett LH (1928) Limiting forms of the frequency distribution of the largest or smallest member of a sample. Cambridge Philos Soc 24:180–190
Galéa G, Prudhomme C (1997) Notions de base et concepts utiles pour la compréhension de la modélisation synthétique des régimes de crue des bassins versants au sens des modèles QdF. Revue des Sciences de l’Eau 1:83–101
Gelman A, Carlin JB, Stern HS, Rubin DB (1995) Bayesian data analysis. Chapman & Hall, London, 526p
GREHYS (1996) Presentation and review of some methods for regional flood frequency analysis. J Hydrol 186:63–84
Gumbel EJ (1958) Statistics of extremes. Columbia University Press, New York, pp 375
Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97–109
IPCC (2001) Climate change 2001: synthesis report. Cambridge University Press, Cambridge, pp 408
Javelle P, Grésillon JM, Galéa G (1999) Discharge-duration-frequency curves modeling for floods and scale invariance. Comptes Rendus de l’Académie des Sciences, Sciences de la terre et des planètes 329:39–44
Javelle P, Ouarda T, Lang M, Bobee B, Galéa G, Grésillon JM (2002) Development of regional flood-duration-frequency curves based on the index-flood method. J Hydrol 258:249–259
Kass RE, Raftery AE (1995) Bayes factors. J Am Stat Assoc 90:773–795
Katz RW, Parlange MB, Naveau P (2002) Statistics of extremes in hydrology. Adv Water Resour 25:1287–1304
Lang M (1999) Theoretical discussion and Monte-Carlo simulations for a Negative Binomial process paradox. Stoch Environ Res Risk Assess 13:183–200
Lang M, Ouarda TBMJ, Bobée B (1999) Towards operational guidelines for over-threshold modeling. J Hydrol 225:103–117
Madsen H, Mikkelsen PS, Rosbjerg D, Harremoes P (2002) Regional estimation of rainfall intensity-duration-frequency curves using generalized least squares regression of partial duration series statistics. Water Resour Res 38:1239
Metropolis N, Ulam S (1949) The Monte Carlo method. J Am Stat Assoc 44:335–341
Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1092
Parent E, Bernier J (2003) Encoding prior experts judgments to improve risk analysis of extreme hydrological events via POT modeling. J Hydrol 283:1–18
Perreault L (2000) Analyse bayésienne rétrospective d’une rupture dans les séquences de variables aléatoires hydrologiques. ENGREF, INRS-Eau. 200 p
Perreault L, Fortin V (2003) Mixture and Hidden Markov models for peak flow analysis. In: Seizièmes entretiens du centre Jacques Cartier, Lyon, France
Perreault L, Bernier J, Bobee B, Parent E (2000a) Bayesian change-point analysis in hydrometeorological time series. Part 1. The normal model revisited. J Hydrol 235:221–241
Perreault L, Bernier J, Bobee B, Parent E (2000b) Bayesian change-point analysis in hydrometeorological time series. Part 2. Comparison of change-point models and forecasting. J Hydrol 235:242–263
Perreault L, Parent E, Bernier J, Bobee B, Slivitzky M (2000c) Retrospective multivariate Bayesian change-point analysis: a simultaneous single change in the mean of several hydrological sequences. Stoch Environ Res Risk Assess 14:243–261
Pickands J (1975) Statistical inference using extreme order statistics. Ann Stat 3:119–131
Prudhomme C (1995) Modèles synthétiques des connaissances en hydrologie. Université Montpellier II, CEMAGREF Lyon. 400 p, Montpellier
Ray BK, Tsay RS (2002) Bayesian methods for change-point detection in long-range dependent processes. J Time Ser Anal 23:687–705
Reed DW (1999) Flood estimation handbook. Vol 1: Overview. I. o. Hydrology. 108 p. Wallingford
Reis DS, Stedinger JR (2005) Bayesian MCMC flood frequency analysis with historical information. J Hydrol 313:97–116
Ritter C, Tanner MA (1992) Facilitating the Gibbs Sampler—the Gibbs Stopper and the Griddy-Gibbs Sampler. J Am Stat Assoc 87:861–868
Robert CP, Ryden T, Titterington DM (2000) Bayesian inference in hidden Markov models through the reversible jump Markov chain Monte Carlo method. J R Stat Soc Ser B-Stat Method 62:57–75
Rosbjerg D, Madsen H (2004) Advanced approaches in PDS/POT modelling of extreme hydrological events. In: Hydrology: science practice for the 21st century, B. H. Society, London, pp 217–221
Strupczewski WG, Kaczmarek Z (2001a) Non-stationary approach to at-site flood frequency modelling II. Weighted least squares estimation. J Hydrol 248:143–151
Strupczewski WG, Singh VP, Feluch W (2001b) Non-stationary approach to at-site flood frequency modelling I. Maximum likelihood estimation. J Hydrol 248:123–142
Tanner MA (1996) Tools for statistical inference. Springer, Berlin Heidelberg New York, p 208

Acknowledgements

This work was conducted as part of a national program of hydrological research (PNRH), which associates Cemagref (Lyon), LTHE (Grenoble), Hydrosciences (Montpellier), Meteo France (Toulouse), and Electricité de France (EDF Chatou and Grenoble). The authors would like to thank all members of this project. The financial support provided by Cemagref and EDF for the PhD research of B. Renard is gratefully acknowledged. We also thank two anonymous reviewers for their helpful comments.

Author information

Authors and Affiliations

U.R. Hydrologie-Hydraulique, Cemagref Centre de Lyon, 3 bis Quai Chauveau, CP 220, 69336, Lyon cedex 09, France
Benjamin Renard & Michel Lang
LTHE, Laboratoire d’études des Transferts en Hydrologie et Environnement, BP53, 38041, Grenoble cedex 09, France
Philippe Bois

Corresponding author

Correspondence to Benjamin Renard.

Appendices

Annex 1: Griddy Gibbs sampling

The aim of this method is to simulate a sample from the posterior distribution $p(\boldsymbol{\theta} \mid \mathbf{X})$. The Gibbs sampling algorithm can be written as follows:

Choose a starting value $\boldsymbol{\theta}^{(0)} = \left(\theta_1^{(0)}, \ldots, \theta_k^{(0)}\right)$, and set $j = 0$.
Repeat $N$ times:
- $j = j + 1$;
- Sample $\theta_1^{(j)} \sim p(\theta_1 \mid \theta_2^{(j-1)}, \ldots, \theta_k^{(j-1)}, \mathbf{X})$,
- Sample $\theta_2^{(j)} \sim p(\theta_2 \mid \theta_1^{(j)}, \theta_3^{(j-1)}, \ldots, \theta_k^{(j-1)}, \mathbf{X})$,
- ...
- Sample $\theta_q^{(j)} \sim p(\theta_q \mid \theta_1^{(j)}, \ldots, \theta_{q-1}^{(j)}, \theta_{q+1}^{(j-1)}, \ldots, \theta_k^{(j-1)}, \mathbf{X})$,
- ...
- Sample $\theta_k^{(j)} \sim p(\theta_k \mid \theta_1^{(j)}, \ldots, \theta_{k-1}^{(j)}, \mathbf{X})$.

The sequence of vectors $(\boldsymbol{\theta}^{(j)})$ converges in distribution to the target posterior as $j$ tends to infinity. To reduce the influence of the starting point, the first $m$ iterations are usually discarded, and inference is based on the last $N - m$ iterations. A sensitivity analysis may be necessary to determine acceptable values for $m$ and $N$ (usually, at least a few thousand iterations are used for each). More generally, this method has the same drawbacks as other iterative simulation techniques: convergence has to be monitored, either by running the chain from several starting points or by computing convergence indices. Gelman et al. (1995) and Tanner (1996) provide guidelines for improving the efficiency of the numerical simulations.
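As an illustration of the iteration and of the burn-in step described above, here is a minimal sketch on a toy target that is not taken from the article: a standard bivariate normal distribution with correlation `rho`, chosen because both full conditionals are normal and can be sampled directly. The function name, the starting point and the values of `n_iter` and `burn_in` are assumptions made for the example.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_iter, burn_in, start=(5.0, -5.0), seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Illustrative target only: both full conditionals are normal,
    theta1 | theta2 ~ N(rho * theta2, 1 - rho**2), and symmetrically.
    """
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho ** 2)
    theta1, theta2 = start  # starting value theta^(0)
    chain = []
    for _ in range(n_iter):
        # Sample theta1^(j) given theta2^(j-1).
        theta1 = rng.gauss(rho * theta2, cond_sd)
        # Sample theta2^(j) given the freshly updated theta1^(j).
        theta2 = rng.gauss(rho * theta1, cond_sd)
        chain.append((theta1, theta2))
    # Discard the first m = burn_in iterations; keep the last N - m.
    return chain[burn_in:]


def mean(values):
    return sum(values) / len(values)


sample = gibbs_bivariate_normal(rho=0.6, n_iter=12000, burn_in=2000)
m1 = mean([t[0] for t in sample])
m2 = mean([t[1] for t in sample])
cov = mean([(t[0] - m1) * (t[1] - m2) for t in sample])
print(len(sample), round(m1, 2), round(m2, 2), round(cov, 2))
```

The printed means should be close to 0 and the covariance close to `rho`. Running the function again with a different `start` or `seed` is one simple way to carry out the multiple-starting-point check mentioned above.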

Unfortunately, the Gibbs sampling algorithm usually cannot be used in this raw form, because it requires sampling from the full conditional densities $p(\theta_q \mid \theta_1^{(j)}, \ldots, \theta_{q-1}^{(j)}, \theta_{q+1}^{(j-1)}, \ldots, \theta_k^{(j-1)}, \mathbf{X})$. Ritter and Tanner (1992) proposed using a discrete approximation of the cumulative distribution functions (CDF) of these distributions:

Choose a grid of points $y_1, \ldots, y_p$.
Evaluate $p(\theta_q \mid \theta_1, \ldots, \theta_{q-1}, \theta_{q+1}, \ldots, \theta_k, \mathbf{X})$ on this grid, to obtain $w_1, \ldots, w_p$.
Compute the cumulative sums of $w_1, \ldots, w_p$ to obtain an approximation of the CDF.
Sample $u$ from a uniform distribution on $[0, 1]$.
Transform $u$ by the inverse of the approximate CDF.

This algorithm must be applied at each step of the Gibbs iteration. If the full conditional densities are known only up to proportionality, $p(\theta_q \mid \theta_1^{(j)}, \ldots, \theta_{q-1}^{(j)}, \theta_{q+1}^{(j-1)}, \ldots, \theta_k^{(j-1)}, \mathbf{X})$ can be replaced by $f(\theta_1^{(j)}, \ldots, \theta_{q-1}^{(j)}, y, \theta_{q+1}^{(j-1)}, \ldots, \theta_k^{(j-1)})$, the product of prior and likelihood evaluated at the grid point $y$, and the cumulative sums of $w_1, \ldots, w_p$ must then be divided by the total sum to provide the CDF approximation. The CDF can be inverted by linear interpolation between two grid points. The choice of grid is the most important issue with this technique: it has to be broad enough to cover the range of the distributions and fine enough to ensure sufficient accuracy, keeping in mind that this additional step is computationally expensive. Some improvements of the method are described in Ritter and Tanner (1992).
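The grid-based draw can be sketched as a single function. The sketch below is an illustration under stated assumptions, not the authors' implementation: the name `griddy_draw`, the grid bounds and spacing, and the toy unnormalized density (a standard normal multiplied by an arbitrary constant) are all hypothetical choices.

```python
import bisect
import math
import random


def griddy_draw(unnorm_density, grid, rng):
    """Draw one value from a 1-D density known up to a constant.

    grid must be sorted and cover the bulk of the distribution.
    """
    # Evaluate the unnormalized conditional on the grid: w_1, ..., w_p.
    weights = [unnorm_density(y) for y in grid]
    # Cumulative sums of the weights.
    cumulative = []
    running = 0.0
    for w in weights:
        running += w
        cumulative.append(running)
    # Divide by the total sum to obtain the approximate CDF on the grid.
    cdf = [c / running for c in cumulative]
    # Sample u ~ U[0, 1) and invert the approximate CDF.
    u = rng.random()
    i = bisect.bisect_left(cdf, u)
    if i == 0:
        return grid[0]
    # Linear interpolation between the two bracketing grid points.
    frac = (u - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
    return grid[i - 1] + frac * (grid[i] - grid[i - 1])


def toy_density(y):
    # Hypothetical target: standard normal shape times an arbitrary constant.
    return 7.0 * math.exp(-0.5 * y * y)


rng = random.Random(42)
grid = [-6.0 + 0.05 * i for i in range(241)]  # broad and fairly fine grid
draws = [griddy_draw(toy_density, grid, rng) for _ in range(4000)]
draw_mean = sum(draws) / len(draws)
draw_sd = math.sqrt(sum((d - draw_mean) ** 2 for d in draws) / (len(draws) - 1))
print(round(draw_mean, 2), round(draw_sd, 2))
```

Inside a Gibbs iteration, `unnorm_density` would be the function $y \mapsto f(\ldots, y, \ldots)$ with the other components held at their current values. Each draw costs one density evaluation per grid point, which is where the computing-time expense noted above comes from.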

Annex 2: Chib method

The aim of this method is to compute the marginal distribution of the observations, which is the normalizing constant in Bayes' theorem:

$$p(\mathbf{X} \mid M) = \frac{\pi(\boldsymbol{\theta})\, p(\mathbf{X} \mid \boldsymbol{\theta})}{p(\boldsymbol{\theta} \mid \mathbf{X})} = \frac{f(\boldsymbol{\theta})}{p(\boldsymbol{\theta} \mid \mathbf{X})}. \tag{35}$$

Since this relationship holds for any vector $\boldsymbol{\theta}$, consider a particular $\boldsymbol{\theta}^{*} = (\theta_1^{*}, \ldots, \theta_k^{*})$. The numerator $f(\boldsymbol{\theta}^{*})$ is directly computable, which is not the case for the denominator. Consider the following relationship:

$$\begin{aligned} p(\theta_1^{*}, \ldots, \theta_k^{*} \mid \mathbf{X}) &= p(\theta_1^{*} \mid \mathbf{X})\, p(\theta_2^{*} \mid \theta_1^{*}, \mathbf{X}) \cdots p(\theta_q^{*} \mid \theta_1^{*}, \theta_2^{*}, \ldots, \theta_{q-1}^{*}, \mathbf{X}) \\ &\quad \cdots p(\theta_k^{*} \mid \theta_1^{*}, \ldots, \theta_{k-1}^{*}, \mathbf{X}). \end{aligned} \tag{36}$$

The first term can be estimated from the sample of the first marginal distribution, for example with a Gaussian kernel. The last term can be computed by 1-D numerical integration:

$$p(\theta_k^{*} \mid \theta_1^{*}, \ldots, \theta_{k-1}^{*}, \mathbf{X}) = \frac{f(\boldsymbol{\theta}^{*})}{\int f(\theta_1^{*}, \ldots, \theta_{k-1}^{*}, \theta_k)\, \mathrm{d}\theta_k}. \tag{37}$$
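For a two-parameter case ($k = 2$), the first and last terms are the only ones, so both steps can be sketched together. The example below is an illustration under assumptions that are not in the article: the unnormalized density `f` is a toy bivariate normal standing in for the product of prior and likelihood, exact normal draws stand in for the $\theta_1$ component of an MCMC sample, the kernel bandwidth follows Silverman's rule of thumb, and the integral in Eq. (37) uses the trapezoid rule on a fixed interval.

```python
import math
import random

RHO = 0.6


def f(theta1, theta2):
    """Unnormalized bivariate normal density (toy stand-in for prior x likelihood)."""
    q = (theta1 ** 2 - 2.0 * RHO * theta1 * theta2 + theta2 ** 2) / (1.0 - RHO ** 2)
    return 3.0 * math.exp(-0.5 * q)


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def last_ordinate(theta1_star, theta2_star, lo=-8.0, hi=8.0, n=1600):
    """Eq. (37) for k = 2: f(theta*) over the 1-D integral of f in theta_2."""
    h = (hi - lo) / n
    values = [f(theta1_star, lo + i * h) for i in range(n + 1)]
    integral = h * (sum(values) - 0.5 * (values[0] + values[-1]))  # trapezoid rule
    return f(theta1_star, theta2_star) / integral


def kernel_ordinate(sample, x):
    """Gaussian kernel estimate of a marginal density at x (Silverman bandwidth)."""
    n = len(sample)
    m = sum(sample) / n
    sd = math.sqrt(sum((s - m) ** 2 for s in sample) / (n - 1))
    bandwidth = 1.06 * sd * n ** (-0.2)
    return sum(normal_pdf(x, s, bandwidth) for s in sample) / n


theta_star = (0.3, -0.2)
rng = random.Random(1)
# Stand-in for the theta_1 component of an MCMC sample: exact N(0, 1) draws.
theta1_sample = [rng.gauss(0.0, 1.0) for _ in range(5000)]

first = kernel_ordinate(theta1_sample, theta_star[0])
last = last_ordinate(*theta_star)
exact_first = normal_pdf(theta_star[0], 0.0, 1.0)
exact_last = normal_pdf(theta_star[1], RHO * theta_star[0], math.sqrt(1.0 - RHO ** 2))
print(first, exact_first)
print(last, exact_last)
```

For this toy target the exact ordinates are known, which makes it possible to see that the integration step is accurate to many digits while the kernel step carries the usual smoothing and sampling error. The product `first * last` approximates the posterior ordinate in the denominator of Eq. (35).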

Griddy Gibbs sampling can be used to compute the intermediate terms. Indeed, $p(\theta_q^{*} \mid \theta_1^{*}, \theta_2^{*}, \ldots, \theta_{q-1}^{*}, \mathbf{X})$ is the first marginal of the distribution $p(\theta_q, \theta_{q+1}, \ldots, \theta_k \mid \theta_1^{*}, \theta_2^{*}, \ldots, \theta_{q-1}^{*}, \mathbf{X})$, evaluated at $\theta_q^{*}$. The griddy Gibbs algorithm can thus be applied to the non-normalized posterior density with the first $q - 1$ components held fixed, that is, $f(\theta_1^{*}, \theta_2^{*}, \ldots, \theta_{q-1}^{*}, \theta_q, \theta_{q+1}, \ldots, \theta_k)$.

Although this approach is theoretically valid for any value $\boldsymbol{\theta}^{*}$ with non-zero posterior density, Chib (1995) recommends using a high-density point to improve the accuracy of the method.
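The identity behind the method, that the ratio of $f(\boldsymbol{\theta}^{*})$ to the posterior ordinate does not depend on $\boldsymbol{\theta}^{*}$, can be checked on a model where every term is available in closed form. The sketch below assumes a conjugate setting that is not part of the article: normal observations with known standard deviation `sigma` and a normal prior on the mean. The data values and function names are made up for the illustration.

```python
import math


def log_normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def log_marginal_chib(data, theta_star, sigma, mu0, tau0):
    """log p(X) = log prior + log likelihood - log posterior, all at theta_star."""
    n = len(data)
    # Conjugate normal posterior for the mean (known sigma).
    post_var = 1.0 / (1.0 / tau0 ** 2 + n / sigma ** 2)
    post_mean = post_var * (mu0 / tau0 ** 2 + sum(data) / sigma ** 2)
    log_prior = log_normal_pdf(theta_star, mu0, tau0)
    log_lik = sum(log_normal_pdf(x, theta_star, sigma) for x in data)
    log_post = log_normal_pdf(theta_star, post_mean, math.sqrt(post_var))
    return log_prior + log_lik - log_post


def log_marginal_direct(data, sigma, mu0, tau0):
    """Closed form: X ~ N(mu0 * 1, sigma^2 * I + tau0^2 * 1 1^T)."""
    n = len(data)
    d = [x - mu0 for x in data]
    s = sigma ** 2 + n * tau0 ** 2
    log_det = (n - 1) * math.log(sigma ** 2) + math.log(s)
    quad = (sum(v * v for v in d) - tau0 ** 2 * sum(d) ** 2 / s) / sigma ** 2
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


data = [1.2, 0.7, 2.1, 1.5, 0.9]  # made-up observations
args = dict(sigma=1.0, mu0=0.0, tau0=2.0)
at_mode = log_marginal_chib(data, theta_star=1.2, **args)
in_tail = log_marginal_chib(data, theta_star=4.0, **args)
direct = log_marginal_direct(data, **args)
print(at_mode, in_tail, direct)
```

Here the posterior ordinate is exact, so the point in the tail gives the same value as the point near the mode. When the ordinate has to be estimated from simulation output, as in the method above, the estimate is typically much less reliable in low-density regions, which is the motivation for choosing a high-density point.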

About this article

Cite this article

Renard, B., Lang, M. & Bois, P. Statistical analysis of extreme events in a non-stationary context via a Bayesian framework: case study with peak-over-threshold data. Stoch Environ Res Ris Assess 21, 97–112 (2006). https://doi.org/10.1007/s00477-006-0047-4

Published: 01 April 2006
Issue Date: December 2006
DOI: https://doi.org/10.1007/s00477-006-0047-4
