Elsevier

Journal of Hydrology

Volume 225, Issues 3–4, 6 December 1999, Pages 103-117

Review
Towards operational guidelines for over-threshold modeling

https://doi.org/10.1016/S0022-1694(99)00167-5

Abstract

Annual maximum flood (AMF) sampling remains the most popular approach to flood frequency analysis. An alternative, the “peaks over threshold” (POT) approach, deals with the selection of over-threshold values. However, the POT approach remains under-employed mainly because of the complexities associated with its use. Among the difficulties are the choice of threshold and the selection of criteria for retaining flood peaks. The literature remains sparse and incoherent concerning the various elements and complexities of the POT model. The purpose of the present paper is to shed some light on some of the recurrent and most important questions with regard to the practice of POT modeling, and to make a first step in establishing a set of coherent practice-oriented guidelines for the use of the POT model. This paper reviews tests and methods useful for modeling the process of over-threshold values, the choice of the threshold level, the verification of the independence of the values and the stationarity of the process, and also presents an application.

Introduction

The efficient planning, design, and operation of hydrotechnical works require an in-depth understanding of the probabilistic behavior of extreme events. Frequency analysis of extreme hydrologic events can be used to acquire such understanding and to provide adequate flood quantile estimates. Flood frequency analysis can be based on the annual maximum flood (AMF) approach or on the peaks-over-threshold (POT) approach, also called the partial duration series (PDS) approach. An AMF sample is constructed by extracting from a series of flows the maximum value of each year (annual flood), i.e. only one event per year is retained. The POT approach, on the other hand, consists of retaining all peak values that exceed a certain truncation level S, usually called the “base level” or “threshold”. Hence, the POT approach is not confined to only one event per year. The main advantage of POT modeling is that it allows for a more rational selection of the events to be considered as “floods”. Unlike AMF modeling, which includes only one event per year, the POT approach considers a wider range of events and makes it possible to control the number of flood occurrences included in the analysis through an appropriate selection of the threshold. In fact, some annual floods may not even be selected as flood events in the POT approach.
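The difference between the two sampling schemes can be illustrated with a minimal Python sketch. The function names and the toy record are hypothetical, and the episode rule used for the POT sample (an episode starts when the flow exceeds the threshold and ends when it falls back below it) is a simplification that ignores any independence criterion between successive peaks.

```python
def annual_maxima(records):
    """AMF sample: largest flow of each year, from (year, flow) pairs."""
    amf = {}
    for year, flow in records:
        if year not in amf or flow > amf[year]:
            amf[year] = flow
    return amf


def peaks_over_threshold(flows, threshold):
    """POT sample: maximum of each episode during which flow > threshold."""
    peaks = []
    current = None  # running maximum of the open episode, if any
    for q in flows:
        if q > threshold:
            current = q if current is None else max(current, q)
        elif current is not None:
            peaks.append(current)  # episode closed: flow fell below threshold
            current = None
    if current is not None:  # record ends while an episode is still open
        peaks.append(current)
    return peaks


# Hypothetical two-year record of (year, flow); values are illustrative only.
records = [(1, 5), (1, 20), (1, 8), (1, 25), (1, 6),
           (2, 4), (2, 9), (2, 7)]

amf = annual_maxima(records)
pot = peaks_over_threshold([q for _, q in records], threshold=15)
print(amf)  # one value per year
print(pot)  # every episode peak above the threshold
```

In this toy record the first year contributes two POT peaks while the annual flood of the second year stays below the threshold and is not retained, which is the behavior described in the paragraph above.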

Compared to the AMF approach, POT modeling provides additional flexibility in the representation of floods and a more complete description of the flood generating process. In fact, POT modeling is a compromise between AMF analysis and classical time series modeling. The POT approach implies dual-domain modeling, since it requires the analysis of both the magnitude and the time of arrival of peaks, and hence makes it possible to capture more information about the whole flood phenomenon than its AMF counterpart. Further, POT modeling concentrates only on the higher maximum values, which contain most of the information about flood processes, while time series analysis focuses on modeling the autocorrelation structure of the whole series. Consequently, POT modeling is theoretically easier than time series modeling and is better suited to the analysis of extreme values.

However, the additional flexibility of the POT approach is often associated with an additional analytical complexity. Furthermore, the POT approach suffers from a lack of general guidelines for its application, and from a multitude of unsolved questions concerning the various details of the approach. Consequently, as compared to the AMF approach, the POT model remains relatively unpopular and under-employed in the practice of design flood estimation. The problems of choice of threshold and selection of criteria for retaining flood peaks represent some of the main difficulties associated with the POT approach. These two elements are of great importance since they are crucial for satisfying the model hypotheses concerning the independence and distribution of flood peaks. In contrast, the AMF approach, based on the selection of the largest discharge for each year of the record, naturally leads to flood events that are generally identically distributed. The literature remains sparse and incoherent concerning the various elements and complexities of the POT model.

If the popularity of the POT model is to be increased among practitioners, it seems important to propose a set of comprehensive practice-oriented guidelines for its use, as emphasized by Rasmussen et al. (1994). The first step towards that objective is an intensive state-of-the-art review of the various components of the POT model. The purpose of the proposed research is to provide answers to some of the recurrent and most important questions with regard to the practice of POT modeling.

This paper presents the general guidelines proposed for over-threshold modeling: first, the sampling technique for POT values is introduced (Section 2); then the study of the occurrence process (Section 3) and POT modeling (Section 4) are discussed; and finally the combination of the occurrence process and POT distributions to obtain the AMF distribution is presented (Section 5). A numerical application of these steps to one hydrometric data series is also presented (Section 6).

Section snippets

Independence criteria

The set of selected peaks must satisfy the independence condition, which is a prerequisite to any statistical frequency analysis (and to the Poisson process assumption). Several criteria have been proposed in the literature to verify this hypothesis. The Water Resources Council (USWRC, 1976) requires that successive flood events be separated by at least five days plus the natural logarithm of the basin area in square miles. This in addition to the arbitrary requirement that the
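The time-separation part of the USWRC (1976) criterion can be sketched in Python as follows. Only the spacing rule stated above is implemented. The function names are hypothetical, and keeping the larger of two peaks that fall too close together is an assumption made for this sketch, not a rule taken from the text.

```python
import math

SQ_KM_PER_SQ_MILE = 2.589988  # unit conversion factor


def min_separation_days(area_sq_miles):
    """Minimum spacing between successive peaks: 5 + ln(A) days, A in square miles."""
    return 5.0 + math.log(area_sq_miles)


def filter_by_separation(peaks, area_sq_miles):
    """Keep only peaks that respect the minimum spacing.

    peaks: list of (day, flow) tuples sorted by day.
    Assumption of this sketch: when two peaks are too close,
    the larger of the two is the one retained.
    """
    spacing = min_separation_days(area_sq_miles)
    kept = []
    for day, flow in peaks:
        if kept and day - kept[-1][0] < spacing:
            if flow > kept[-1][1]:
                kept[-1] = (day, flow)  # replace the smaller neighbor
        else:
            kept.append((day, flow))
    return kept


# An 89 km2 basin gives a spacing of roughly 8.5 days.
area = 89.0 / SQ_KM_PER_SQ_MILE
print(round(min_separation_days(area), 2))

# Hypothetical peaks (day index, flow); values are illustrative only.
peaks = [(10, 30.0), (14, 42.0), (40, 25.0)]
independent = filter_by_separation(peaks, area)
print(independent)
```

Here the peaks on days 10 and 14 are closer than the required spacing, so only the larger one is kept, while the peak on day 40 is retained as a separate event.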

Study of the occurrence process

The occurrence process of events E can be described either by the duration $\theta$ separating two successive occurrences of the event (also called the interevent duration), or by the number $m_t$ of events that occurred during the time interval $[0, t]$. To each of these variables we can associate a cumulative distribution function, a probability density function and a mean value.

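  • Interevent duration $\theta$: $F(x) = \mathrm{Prob}[\theta < x]$ and $f(x)\,dx = \mathrm{Prob}[x < \theta < x + dx]$. The return period of the event is defined as: $$T = E(\theta) = \int_0^{+\infty} \theta\, f(\theta)\, d\theta$$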

  • Number
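The two descriptions of the occurrence process can be illustrated with a small Python sketch. It is an empirical counterpart only: the sample mean of the interevent durations stands in for the expectation that defines the return period, and the function names and event times are hypothetical.

```python
def interevent_durations(event_times):
    """Durations separating successive occurrences of the event."""
    times = sorted(event_times)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def empirical_return_period(event_times):
    """Sample mean of the interevent durations, an estimate of T = E(theta)."""
    durations = interevent_durations(event_times)
    if not durations:
        raise ValueError("at least two events are needed")
    return sum(durations) / len(durations)


def count_events(event_times, t):
    """Number of events observed in the interval [0, t]."""
    return sum(1 for time in event_times if 0 <= time <= t)


# Hypothetical occurrence times in years; values are illustrative only.
times = [0.4, 1.1, 1.9, 3.6, 4.4]

T_hat = empirical_return_period(times)
m_2 = count_events(times, 2.0)
print(T_hat)  # mean spacing between events, in years
print(m_2)    # events counted in the first two years
```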

POT modeling

If X is a random variable, we define $X_S$ as the maximum value of X in an episode. An episode is defined as a function of a threshold level S: it begins when X(t) exceeds S and ends when X(t) falls below the level S. We also define the POT distribution:

$$G_S(x) = \mathrm{Prob}[X_S < x]$$

The return period T(x) can be defined as the average duration between two successive values $X_S$ exceeding x, and is linked to the distribution $G_S$ by (Rosbjerg, 1985):

$$G_S(x) = 1 - \frac{1}{\mu\, T(x)}$$
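The relation between the POT distribution and the return period can be inverted to give T(x) = 1 / (μ [1 − G(x)]), where G is the POT distribution. The Python sketch below assumes that μ is the mean number of over-threshold events per year, so that T is expressed in years. That interpretation, the function names and the numerical values are assumptions made for illustration.

```python
def return_period(g_x, mu):
    """Return period T(x) = 1 / (mu * (1 - G(x))).

    g_x: value of the POT distribution at x (non-exceedance probability).
    mu:  assumed mean number of over-threshold events per year.
    """
    if not 0.0 <= g_x < 1.0:
        raise ValueError("G(x) must lie in [0, 1)")
    return 1.0 / (mu * (1.0 - g_x))


def pot_cdf_from_return_period(T, mu):
    """Inverse relation: G(x) = 1 - 1 / (mu * T(x))."""
    return 1.0 - 1.0 / (mu * T)


# Illustrative values: two events per year on average, and a level x
# that is not exceeded by 95% of the retained peaks.
mu = 2.0
T = return_period(0.95, mu)
print(T)  # about 10 years
```

With two peaks per year on average, a level exceeded by 5% of the peaks is exceeded about once every ten years, which shows why a given quantile of the POT distribution corresponds to a shorter return period as μ increases.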

Correspondence between POT and AMF distributions

We define $X^{*}$ as the annual maximum value of X, with the AMF distribution $F_X$. The following equations can be derived (Shane and Lynn, 1964):

$$F_X(x) = \mathrm{Prob}[X^{*} < x] = \sum_{k=0}^{\infty} w_k(1)\,[G_S(x)]^k \quad \text{(general case)}$$

$$F_X(x) = \exp\{-\mu\,[1 - G_S(x)]\} \quad \text{(Poisson process)}$$
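The link between the general case and the Poisson case can be checked numerically. The Python sketch below assumes that the weights in the general formula are the probabilities of observing k events in one year, and that for a Poisson process they equal exp(−μ) μ^k / k!. The function names, the truncation of the infinite sum at a finite `kmax`, and the numerical values are choices made for this illustration.

```python
import math


def amf_cdf_general(g, weights):
    """General case: sum over k of w_k * g**k, with g the POT distribution value."""
    return sum(w * g ** k for k, w in enumerate(weights))


def poisson_weights(mu, kmax):
    """Poisson probabilities of k = 0..kmax events in one year."""
    weights = []
    w = math.exp(-mu)  # probability of zero events
    for k in range(kmax + 1):
        weights.append(w)
        w *= mu / (k + 1)  # recurrence w_{k+1} = w_k * mu / (k + 1)
    return weights


def amf_cdf_poisson(g, mu):
    """Closed form for a Poisson process: exp(-mu * (1 - g))."""
    return math.exp(-mu * (1.0 - g))


# Illustrative values only.
mu, g = 2.5, 0.9
series_value = amf_cdf_general(g, poisson_weights(mu, kmax=60))
closed_form = amf_cdf_poisson(g, mu)
print(series_value, closed_form)
```

The truncated series and the closed form agree to within rounding error, which reflects the fact that the Poisson-weighted sum is the power series of the exponential.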

In the case of a Poisson process, Eq. (10) leads to the following relationships:

  • The POT model with exponentially distributed peaks leads to a Gumbel distribution of annual floods (AMF): $$G_S(x) = 1 - \exp[-(x - S)/a]$$ $$F_X(x) = \exp\{-\mu \exp[-(x - S)/a]\}$$

  • The POT model with Generalized Pareto
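For the exponential case in the first relationship above, the annual flood quantile follows by solving the Gumbel form for x at a non-exceedance probability 1 − 1/T, which gives x_T = S + a [ln μ − ln(−ln(1 − 1/T))]. The Python sketch below is a minimal illustration under that derivation. The function names are hypothetical, and the parameter values are made up for the example rather than fitted to any data.

```python
import math


def pot_cdf_exponential(x, S, a):
    """POT distribution with exponentially distributed peaks above S."""
    return 1.0 - math.exp(-(x - S) / a)


def amf_cdf_gumbel(x, S, a, mu):
    """Annual flood distribution implied by Poisson arrivals: a Gumbel law."""
    return math.exp(-mu * math.exp(-(x - S) / a))


def amf_quantile(T, S, a, mu):
    """Flood with annual return period T (T > 1), from inverting the Gumbel law."""
    return S + a * (math.log(mu) - math.log(-math.log(1.0 - 1.0 / T)))


# Made-up parameters for illustration: threshold S, scale a, events per year mu.
S, a, mu = 17.0, 5.0, 3.0

x10 = amf_quantile(10.0, S, a, mu)
print(x10)                          # 10-year flood under these assumptions
print(amf_cdf_gumbel(x10, S, a, mu))  # recovers a non-exceedance probability of 0.9
```

The same value is obtained by inserting the exponential POT distribution into the Poisson formula, so the sketch also serves as a consistency check between the two expressions.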

Numerical application

The previous tests were applied to the hydrometric data series from the station of Saint-Laurent-du-Pont on the Le-Guiers-Mort river (France), with a drainage area of 89 km². Hydrometric data are available from 1974 to 1990. Results for the three-day average discharge are presented in Figs. 4–8.

Results of the various tests (Section 2.2) dealing with the choice of threshold level are presented in Figs. 4 and 5. A threshold level of 17 m³/s is selected,

Conclusions

The development of a set of comprehensive practice-oriented guidelines for the use of the POT approach represents a great challenge, but is essential to increase the popularity of the method among practitioners. This paper summarizes the state of knowledge with regard to the use of the POT model and attempts to present a set of coherent practice-oriented guidelines for its application.

One specific difficulty of the POT approach concerns the selection of the threshold

Acknowledgements

The comments and suggestions of T.A. Buishand and two anonymous reviewers are acknowledged.

References (53)

  • F. Ashkar et al. Some remarks on the truncation used in partial flood series models. Water Resour. Res. (1983)
  • F. Ashkar et al. The effect of certain restrictions imposed on the interarrival times of flood events on the Poisson distribution used for modeling flood counts. Water Resour. Res. (1983)
  • L.E. Borgman. Risk criteria. J. Waterways Harbors Div., ASCE (1963)
  • T.A. Buishand. Statistics of extremes in climatology. Statistica Neerlandica (1989)
  • D. Caissie et al. Etude sur le choix du seuil de troncature en analyse des séries de durées partielles: application au Canada. Revue des Sciences de l'Eau (1992)
  • CFGB, 1994. Design flood determination by the gradex method. 18th congress CIGB-ICOLD n°2, nov., Bulletin du Comité...
  • J.F. Cruise et al. A hydroclimatic application strategy for the Poisson partial duration model. Water Resour. Bull. (1990)
  • C. Cunnane. A note on the Poisson assumption in partial duration series models. Water Resour. Res. (1979)
  • R.B. D'Agostino et al. Goodness-of-Fit Techniques (1986)
  • Dalrymple, T., 1960. Flood frequency analysis. US Geological Survey water supply paper No 1534A. 60...
  • A.C. Davison et al. Models for exceedances over high thresholds. J. R. Stat. Soc. B (1990)
  • D.L. Fitzgerald. Single station and regional analysis of daily rainfall extremes. Stochastic Hydrol. Hydraulics (1989)
  • Presentation and review of some methods for regional flood frequency analysis. J. Hydrol. (1996)
  • E.J. Gumbel. Statistics of Extremes (1958)
  • K.N. Irvine et al. Partial series analysis of high flows in Canadian rivers. Can. Water Resour. J. (1986)
  • Lang, M., 1995. Les chroniques en hydrologie. PhD thesis, Université Joseph Fourier Grenoble, Cemagref Lyon, France,...