A comparison of uncertainty and sensitivity analysis results obtained with random and Latin hypercube sampling

https://doi.org/10.1016/j.ress.2004.09.006

Abstract

Uncertainty and sensitivity analysis results obtained with random and Latin hypercube sampling are compared. The comparison uses results from a model for two-phase fluid flow obtained with three independent random samples of size 100 each and three independent Latin hypercube samples (LHSs) of size 100 each. Uncertainty and sensitivity analysis results with the two sampling procedures are similar and stable across the three replicated samples. Poor performance of regression-based sensitivity analysis procedures for some analysis outcomes results more from the inappropriateness of the procedure for the nonlinear relationships between model input and model results than from an inadequate sample size. Kendall's coefficient of concordance (KCC) and the top down coefficient of concordance (TDCC) are used to assess the stability of sensitivity analysis results across replicated samples, with the TDCC providing a more informative measure of analysis stability than KCC. A new sensitivity analysis procedure based on replicated samples and the TDCC is introduced.

Introduction

The identification and representation of the implications of uncertainty is widely recognized as a fundamental component of analyses of complex systems [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]. The study of uncertainty is usually subdivided into two closely related activities referred to as uncertainty analysis and sensitivity analysis, where (i) uncertainty analysis involves the determination of the uncertainty in analysis results that derives from uncertainty in analysis inputs and (ii) sensitivity analysis involves the determination of relationships between the uncertainty in analysis results and the uncertainty in individual analysis inputs.

At an abstract level, the analysis or model under consideration can be represented as a function of the form

$$\mathbf{y} = \mathbf{y}(\mathbf{x}) = \mathbf{f}(\mathbf{x}),$$

where $\mathbf{x} = [x_1, x_2, \ldots, x_{nX}]$ is a vector of uncertain analysis inputs and $\mathbf{y} = [y_1, y_2, \ldots, y_{nY}]$ is a vector of analysis results. Further, a sequence of distributions $D_1, D_2, \ldots, D_{nX}$ is used to characterize the uncertainty associated with the elements of $\mathbf{x}$, where $D_i$ is the distribution associated with $x_i$ for $i = 1, 2, \ldots, nX$. Correlations and other restrictions involving the elements of $\mathbf{x}$ are also possible. The goal of uncertainty analysis is to determine the uncertainty in the elements of $\mathbf{y}$ that derives from the uncertainty in the elements of $\mathbf{x}$ characterized by the distributions $D_1, D_2, \ldots, D_{nX}$ and any associated restrictions. The goal of sensitivity analysis is to determine relationships between the uncertainty associated with individual elements of $\mathbf{x}$ and the uncertainty associated with individual elements of $\mathbf{y}$.

A variety of approaches to uncertainty and sensitivity analysis are in use, including (i) differential analysis, which involves approximating a model with a Taylor series and then using variance propagation formulas to obtain uncertainty and sensitivity analysis results [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], (ii) response surface methodology, which is based on using classical experimental designs to select points for use in developing a response surface replacement for a model and then using this replacement model in subsequent uncertainty and sensitivity analyses based on Monte Carlo simulation and variance propagation [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], (iii) the Fourier amplitude sensitivity test (FAST) and other variance decomposition procedures, which involve the determination of uncertainty and sensitivity analysis results on the basis of the variance of model predictions and the contributions of individual variables to this variance [36], [37], [38], [39], [40], [41], [42], [43], [44], [45], [46], [47], [48], [49], [50], [51], [52], [53], [54], [55], (iv) fast probability integration, which is primarily an uncertainty analysis procedure used to estimate the tails of uncertainty distributions for model predictions [56], [57], [58], [59], [60], [61], [62], and (v) sampling-based (i.e. Monte Carlo) procedures, which involve the generation and exploration of a probabilistically based mapping from analysis inputs to analysis results [63], [64], [65], [66], [67], [68], [69], [70], [71], [72], [73]. Additional information on uncertainty and sensitivity analysis is available in a number of reviews [69], [70], [74], [75], [76], [77], [78], [79], [80]. The primary focus of this presentation is on sampling-based methods for uncertainty and sensitivity analysis.

Sampling-based approaches for uncertainty and sensitivity analysis are very popular [81], [82], [83], [84], [85], [86], [87], [88], [89], [90], [91], [92], [93], [94], [95], [96]. Desirable properties of these approaches include conceptual simplicity, ease of implementation, generation of uncertainty analysis results without the use of intermediate models, and availability of a variety of sensitivity analysis procedures [67], [69], [76], [97], [98]. Despite these positive properties, concern is often expressed about using these approaches because of the computational cost involved. In particular, the concern is that the sample sizes required to obtain meaningful results will be so large that analyses will be computationally impracticable for all but the simplest models. At times, statements are made that 1000s to 10,000s of model evaluations are required in a sampling-based uncertainty/sensitivity analysis.

In this presentation, results obtained with a computationally demanding model for two-phase fluid flow are used to illustrate that robust uncertainty and sensitivity analysis results can be obtained with relatively small sample sizes. Further, results are obtained and compared for replicated random and Latin hypercube samples (LHSs) [63], [73]. For the problem under consideration, random and LHSs of size 100 produce similar, stable results.
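As a rough illustration of the two sampling procedures being compared, the following Python sketch draws three replicated random samples and three replicated LHSs of size 100 on the unit hypercube and propagates them through a toy function. The function `toy_model`, the use of two uniformly distributed inputs, and the seed are assumptions made purely for illustration; they are not the two-phase flow model or the input distributions used in the analysis.

```python
import numpy as np

def random_sample(n, n_x, rng):
    """Simple random sample: n independent points in the unit hypercube."""
    return rng.random((n, n_x))

def latin_hypercube_sample(n, n_x, rng):
    """LHS: each variable gets exactly one value in each of n equal-probability strata."""
    # One uniform draw inside each stratum [k/n, (k+1)/n), for every variable.
    u = (np.arange(n)[:, None] + rng.random((n, n_x))) / n
    # Shuffle each column independently so strata are paired at random across variables.
    for j in range(n_x):
        u[:, j] = rng.permutation(u[:, j])
    return u

def toy_model(x):
    """Hypothetical stand-in for y = f(x); NOT the model used in the paper."""
    return x[:, 0] + 0.5 * x[:, 1] ** 2

rng = np.random.default_rng(0)
n, n_x, n_rep = 100, 2, 3

# Estimated mean of y for each replicate under each sampling procedure.
means_random = [toy_model(random_sample(n, n_x, rng)).mean() for _ in range(n_rep)]
means_lhs = [toy_model(latin_hypercube_sample(n, n_x, rng)).mean() for _ in range(n_rep)]

print("random:", np.round(means_random, 4))
print("LHS:   ", np.round(means_lhs, 4))
```

For inputs that are not uniform on [0, 1], the sampled values would be mapped to the assigned distributions through their inverse cumulative distribution functions; that step is omitted here.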

The presentation is organized as follows. The analysis problem is described in Section 2. Then, the following topics are considered: stability of uncertainty analysis results (Section 3), stability of sensitivity analysis results based on stepwise rank regression (Section 4), use of coefficients of concordance in comparing replicated sensitivity analyses (Section 5), sensitivity analysis based on replicated samples and the top down coefficient of concordance (Section 6), sensitivity analysis with reduced sample sizes (Section 7), and sensitivity analysis without regression analysis (Section 8). Finally, the presentation ends with a concluding discussion (Section 9).

Section snippets

Analysis problem

The analysis problem under consideration comes from the 1996 performance assessment (PA) for the Waste Isolation Pilot Plant (WIPP) [99], [100]. This PA was the core analysis that supported the successful Compliance Certification Application (CCA) by the US Department of Energy (DOE) to the US Environmental Protection Agency (EPA) for the operation of the WIPP [101]. With the certification of the WIPP by the EPA for the disposal of transuranic waste in May 1998 [102], the WIPP became the first

Uncertainty analysis results

The time-dependent results in Fig. 2 display the uncertainty in solutions to Eqs. (2.1), (2.2), (2.3), (2.4), (2.5), (2.6) that results from uncertainty in the 31 variables in Table 1. The goal of this presentation is to illustrate the robustness of such uncertainty representations with respect to the type and size of the sample in use. As previously indicated, results at 1000, 10,000–1000, and 10,000 yr will be used for illustration.

One way to compare uncertainty analysis results is to present

Stepwise results

A sensitivity analysis based on stepwise regression analysis with rank-transformed data [118] was carried out for the replicated samples summarized in Fig. 4 (Table 3, Table 4, Table 5, Table 6). This analysis required α-values of 0.02 and 0.05 for variables to enter and to be retained in a given analysis, respectively, and was carried out with the stepwise program [119]. The summary tables (Tables 3–6) present results for both the individual replicates and for the three replicates of a given

Coefficients of concordance

Inspection of the results in Table 3, Table 4, Table 5, Table 6 suggests that the individual replicates are producing similar results. Kendall's coefficient of concordance (KCC) provides a way to formally assess this similarity (p. 305, Ref. [120]). This coefficient is based on the consideration of arrays of the form

$$\begin{array}{c|cccc}
 & R_1 & R_2 & \cdots & R_{nR} \\ \hline
x_1 & r(O_{11}) & r(O_{12}) & \cdots & r(O_{1,nR}) \\
x_2 & r(O_{21}) & r(O_{22}) & \cdots & r(O_{2,nR}) \\
\vdots & \vdots & \vdots & & \vdots \\
x_{nX} & r(O_{nX,1}) & r(O_{nX,2}) & \cdots & r(O_{nX,nR})
\end{array}$$

where $x_1, x_2, \ldots, x_{nX}$ are the variables under consideration (i.e. $nX = 29$ with the exclusion of

Sensitivity analysis with the TDCC

Replicated samples and the TDCC provide the basis for a sensitivity analysis procedure to identify important sets of variables that does not depend on direct testing of the statistical significance of sensitivity measures (e.g. the significance of the coefficients in a stepwise regression model as defined by an α-value for entry into the model). Rather, important variables are identified by the similarity of outcomes in analyses performed for the individual replicated samples.
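To make the comparison between replicates concrete, the sketch below computes KCC and the TDCC for an array of variable ranks with one row per variable and one column per replicate. The formulas are the standard textbook forms (Kendall's coefficient without ties, and the TDCC built on Savage scores), written from general knowledge rather than taken from this article. The function names and the example rankings are hypothetical, and rank 1 is assumed to denote the most important variable.

```python
import numpy as np

def kendall_w(ranks):
    """Kendall's coefficient of concordance for an (nX, nR) array of ranks 1..nX (no ties)."""
    ranks = np.asarray(ranks, dtype=float)
    n_x, n_r = ranks.shape
    row_sums = ranks.sum(axis=1)
    # Squared deviations of each variable's rank sum from the expected rank sum.
    s = np.sum((row_sums - n_r * (n_x + 1) / 2.0) ** 2)
    return 12.0 * s / (n_r ** 2 * n_x * (n_x ** 2 - 1))

def savage_scores(ranks):
    """Replace rank r by sum_{j=r}^{nX} 1/j, so top-ranked variables get the largest scores."""
    ranks = np.asarray(ranks, dtype=int)
    n_x = ranks.shape[0]
    # tail[r - 1] = 1/r + 1/(r + 1) + ... + 1/nX
    tail = np.cumsum(1.0 / np.arange(n_x, 0, -1))[::-1]
    return tail[ranks - 1]

def tdcc(ranks):
    """Top down coefficient of concordance computed from Savage scores."""
    ss = savage_scores(ranks)
    n_x, n_r = ss.shape
    numerator = np.sum(ss.sum(axis=1) ** 2) - n_r ** 2 * n_x
    denominator = n_r ** 2 * (n_x - np.sum(1.0 / np.arange(1, n_x + 1)))
    return numerator / denominator

# Hypothetical rankings of 5 variables from 3 replicates (columns).
# The replicates agree on the two most important variables only.
example = np.array([
    [1, 1, 1],
    [2, 2, 2],
    [3, 4, 5],
    [4, 5, 3],
    [5, 3, 4],
])

print("KCC: ", kendall_w(example))
print("TDCC:", tdcc(example))
```

In this made-up example the three replicates agree on the two top-ranked variables and disagree on the remaining three, so the TDCC, which weights agreement among the most important variables more heavily, comes out higher than KCC.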

The procedure

Sensitivity analysis with small samples

The sensitivity analysis results obtained with random and LHSs of size 100 are very similar and thus indicate that a sample size of 100 is adequate for the problem under consideration. The question naturally arises whether smaller sample sizes would also be adequate.

To partially address this question, the random samples were pooled to produce 300 observations, and then three samples of size 50 were obtained by randomly sampling from these 300 observations. Each new sample of size 50 was produced by

Sensitivity analysis without regression

The regression analyses summarized in Table 3, Table 4, Table 5, Table 6 exhibit various levels of success. Some analyses are quite good, with $R^2$ values above 0.9. Other analyses are not quite so good, with $R^2$ values in the range from 0.6 to 0.8. The analyses for WAS_PRES at 10,000 yr are effectively failures, with $R^2$ values in the vicinity of 0.2.
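One way to see how a regression on rank-transformed data can succeed for some outcomes and fail for others at the same sample size is to fit it to a relationship it handles well (monotonic) and to one it cannot capture (nonmonotonic). The following sketch uses plain least squares on rank-transformed data, not the stepwise procedure used in the analysis, and reports standardized rank regression coefficients together with $R^2$. The two test functions, the three uniform inputs, the sample size, and the function names are assumptions for illustration only.

```python
import numpy as np

def rank_transform(a):
    """Replace values by their ranks 1..n along axis 0 (assumes no ties)."""
    a = np.asarray(a, dtype=float)
    return np.argsort(np.argsort(a, axis=0), axis=0) + 1.0

def rank_regression(x, y):
    """Least-squares fit of ranked y on ranked x; returns standardized coefficients and R^2."""
    rx = rank_transform(x)
    ry = rank_transform(y)
    # Standardize so coefficients are comparable and no intercept is needed.
    zx = (rx - rx.mean(axis=0)) / rx.std(axis=0)
    zy = (ry - ry.mean()) / ry.std()
    coef, *_ = np.linalg.lstsq(zx, zy, rcond=None)
    resid = zy - zx @ coef
    r2 = 1.0 - np.sum(resid ** 2) / np.sum(zy ** 2)
    return coef, r2

rng = np.random.default_rng(0)
x = rng.random((100, 3))  # hypothetical sample of size 100 with three inputs

# Monotonic but nonlinear in x[:, 0]: the rank transform linearizes it well.
y_mono = np.exp(3.0 * x[:, 0]) + 0.5 * x[:, 1]
# Nonmonotonic in x[:, 0]: ranks of y carry little linear information about ranks of x.
y_nonmono = (x[:, 0] - 0.5) ** 2

coef_mono, r2_mono = rank_regression(x, y_mono)
coef_non, r2_non = rank_regression(x, y_nonmono)

print("monotonic:    R^2 =", round(r2_mono, 3))
print("nonmonotonic: R^2 =", round(r2_non, 3))
```

Both fits use the same 100 sample elements, so any difference in $R^2$ between them reflects the form of the relationship rather than the sample size.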

An important aspect of the analyses in Table 3, Table 4, Table 5, Table 6 is that the identification of dominant variables tends to remain the same

Discussion

Uncertainty and sensitivity analysis results obtained with replicated random and LHSs are compared. In particular, uncertainty and sensitivity analyses were performed for a large model for two-phase fluid flow with three independently generated random samples of size 100 each and also three independently generated LHSs of size 100 each.

For the outcomes under consideration, analyses with random and LHSs produced similar results. Specifically, there is little difference in the uncertainty and

Acknowledgements

Work performed for Sandia National Laboratories (SNL), which is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000. Review provided at SNL by M. Chavez, J. Garner, and S. Halliday. Editorial support provided by F. Puffer, J. Ripple, and K. Best of Tech Reps, Inc.

References (157)

  • H. Rabitz et al. Efficient input–output model representations. Comput Phys Commun (1999)
  • R. Rackwitz et al. Structural reliability under combined random load sequences. Comput Struct (1978)
  • A. Saltelli et al. Non-parametric statistics in sensitivity analysis for model output. A comparison of selected techniques. Reliab Eng Syst Saf (1990)
  • J.C. Helton. Uncertainty and sensitivity analysis techniques for use in performance assessment for radioactive waste disposal. Reliab Eng Syst Saf (1993)
  • J.P.C. Kleijnen et al. Statistical analyses of scatterplots to identify important factors in large-scale simulations, 1: review and comparison of techniques. Reliab Eng Syst Saf (1999)
  • J.C. Helton et al. Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliab Eng Syst Saf (2003)
  • R.J. Breeding et al. Summary description of the methods used in the probabilistic risk assessments for NUREG-1150. Nucl Eng Des (1992)
  • J.Z. Ma et al. Parameter sensitivity of a model of viral epidemics simulated with Monte Carlo techniques. I. Illness attack rates. Int J Biomed Comput (1993)
  • J.Z. Ma et al. Parameter sensitivity of a model of viral epidemics simulated with Monte Carlo techniques. II. Durations and peaks. Int J Biomed Comput (1993)
  • J.C. Helton et al. Uncertainty and sensitivity analysis results obtained in the 1992 performance assessment for the Waste Isolation Pilot Plant. Reliab Eng Syst Saf (1996)
  • N.I. Kolev et al. Uncertainty and sensitivity analysis of a postexperiment simulation of nonexplosive melt–water interaction. Exp Therm Fluid Sci (1996)
  • E. Hofer. Sensitivity analysis in the context of uncertainty analysis for computationally intensive models. Comput Phys Commun (1999)
  • C. Cohen et al. A discrete-event computer model of feline herpes virus within cat populations. Prev Vet Med (2000)
  • A. Saltelli et al. Sensitivity analysis of model output. An investigation of new techniques. Comput Stat Data Anal (1993)
  • R.L. Wagner. Science, uncertainty and risk: the problem of complex phenomena. APS News (2003)
  • A guide for uncertainty analysis in dose and risk assessments related to environmental contamination (1996)
  • Science and judgment in risk assessment (1994)
  • Issues in risk assessment (1993)
  • An SAB report: multi-media risk assessment for radon, review of uncertainty analysis of risks associated with exposure to radon (1993)
  • O. Øvreberg et al. Putting error bars on reservoir engineering forecasts. J Pet Technol (1992)
  • Evaluating the reliability of predictions made using environmental transfer models (1989)
  • M.B. Beck. Water-quality modeling: a review of the analysis of uncertainty. Water Resour Res (1987)
  • R. Tomovic et al. General sensitivity theory (1972)
  • P.M. Frank. Introduction to system sensitivity theory (1978)
  • J.-T. Hwang et al. The Green's function method of sensitivity analysis in chemical kinetics. J Chem Phys (1978)
  • E.P. Dougherty et al. A computational algorithm for the Green's function method of sensitivity analysis in chemical kinetics. Int J Chem Kinet (1979)
  • E.P. Dougherty et al. Further developments and applications of the Green's function method of sensitivity analysis in chemical kinetics. J Chem Phys (1979)
  • D.G. Cacuci et al. Sensitivity theory for general systems of nonlinear equations. Nucl Sci Eng (1980)
  • D.G. Cacuci. Sensitivity theory for nonlinear systems. I. Nonlinear functional analysis approach. J Math Phys (1981)
  • D.G. Cacuci. Sensitivity theory for nonlinear systems. II. Extensions to additional classes of responses. J Math Phys (1981)
  • D.G. Cacuci et al. On the application of the adjoint method of sensitivity analysis to problems in the atmospheric sciences. Atmósfera (1994)
  • H. Rabitz et al. Sensitivity analysis in chemical kinetics
  • T. Turányi. Sensitivity analysis of complex kinetic systems. Tools and applications. J Math Chem (1990)
  • L. Vuilleumier et al. First- and second-order sensitivity analysis of a photochemically reactive system (a Green's function approach). Environ Sci Technol (1997)
  • D.G. Cacuci. Sensitivity and uncertainty analysis (2003)
  • W.J. Hill et al. A review of response surface methodology: a literature review. Technometrics (1966)
  • R. Mead et al. A review of response surface methodology from a biometric viewpoint. Biometrics (1975)
  • R.H. Myers. Response surface methodology (1971)
  • R.H. Morton. Response surface methodology. Math Scientist (1983)