Comparison of sensitivity analysis methods for pollutant degradation modelling: A case study from drinking water treatment

doi:10.1016/j.scitotenv.2012.06.026

Science of The Total Environment

Volume 433, 1 September 2012, Pages 530-537

https://doi.org/10.1016/j.scitotenv.2012.06.026

Abstract

Five sensitivity analysis methods based on derivatives, screening, regression, variance decomposition and entropy are introduced, applied and compared for a model predicting micropollutant degradation in drinking water treatment. The sensitivity analysis objectives considered are factors prioritisation (detecting important factors), factors fixing (detecting non-influential factors) and factors mapping (detecting which factors are responsible for causing pollutant limit exceedances). It is shown how the applicability of methods changes in view of increasing interactions between model factors and increasing non-linearity between the model output and the model factors. A high correlation is observed between the indices obtained for the objectives factors prioritisation and factors mapping due to the positive skewness of the probability distributions of the predicted residual pollutant concentrations. The entropy-based method which uses the Kullback–Leibler divergence is found to be particularly suited when assessing pollutant limit exceedances.

Highlights

► Five sensitivity analysis methods are applied to a model of micropollutant degradation.
► Factors prioritisation, factors fixing and factors mapping are considered.
► Methods are compared in view of capturing non-linearity and factor interactions.
► Entropy-based methods are well suited in a risk assessment context.

Introduction

Sensitivity analysis (SA) is used to examine how a model output is influenced by the uncertainty of the model factors, where the term factors may include both model parameters and model inputs. In many cases it is unclear a priori which sensitivity method to select. The choice of an SA method depends on the objective of the specific study, the computational cost, and the relationship between model output and model factors (linearity, additivity, monotonicity) (e.g. Saltelli et al., 2000, Saltelli et al., 2005, Yang, 2011). Whereas the computational cost of one simulation run can mostly be estimated at the outset of an analysis, it is often unclear how the uncertain model factors will interact and which simplifying assumptions will hold and thus permit the use of simpler methods. Most SA applications within the scientific literature are based on changing one factor at a time (OAT) and assessing derivatives at a single reference point in factors space (local SA). Saltelli and Annoni (2010) have recently highlighted how this may lead to erroneous conclusions, especially when modelling environmental systems where non-linearity and factor interactions are abundant. For a case study from environmental technology, the present study aims to address these points by comparing a wide range of SA methods considering the following objectives:

• Which factors are important in determining uncertainty about the model output?
• Which factors are non-influential in determining uncertainty about the model output?
• Which factors are responsible for producing model outputs in a specific region (e.g. failure region)?

Using the terminology of Saltelli et al. (2004) these three objectives are defined as follows:

• Factors prioritisation identifies the most important factors, i.e. the factors which, if known, would be expected to lead to the largest reduction of variance in the model output.
• Factors fixing identifies which factors can be fixed anywhere within their range of uncertainty without significantly affecting the variance of the model output.
• Factors mapping identifies which factors are responsible for leading to model outputs in a specified region, e.g. above a threshold value.

Saltelli et al. (2004) describe a fourth objective which is not further explored in this study: variance cutting identifies the minimal set of factors to be fixed to reduce the variance of the model output below a specified value.
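The factors prioritisation objective can be illustrated with the first-order variance index S_i = Var(E[Y|X_i]) / Var(Y), which is the expected relative reduction in output variance if factor i were known. The following is a minimal sketch under stated assumptions: the additive toy model, the uniform factor distributions and the quantile-binning estimator are all hypothetical choices made for illustration, and they are not the reactor model or the Extended-FAST estimator used in this study.

```python
import numpy as np

def first_order_index(x, y, n_bins=50):
    """Estimate S_i = Var(E[Y | X_i]) / Var(Y) by sorting on X_i and
    averaging Y within equally populated bins (a simple binning estimator)."""
    order = np.argsort(x)
    bins = np.array_split(y[order], n_bins)
    cond_means = np.array([b.mean() for b in bins])
    weights = np.array([b.size for b in bins]) / y.size
    var_cond_mean = np.sum(weights * (cond_means - y.mean()) ** 2)
    return var_cond_mean / y.var()

rng = np.random.default_rng(0)
n = 100_000
x1 = rng.uniform(0.0, 1.0, n)
x2 = rng.uniform(0.0, 1.0, n)
x3 = rng.uniform(0.0, 1.0, n)  # sampled but not used by the model

# Hypothetical additive toy model (assumption, not the ozone reactor model).
# Analytical first-order indices: S_1 = 16/17, S_2 = 1/17, S_3 = 0.
y = 4.0 * x1 + 1.0 * x2

s = [first_order_index(x, y) for x in (x1, x2, x3)]
for name, value in zip(("x1", "x2", "x3"), s):
    print(f"S({name}) = {value:.3f}")
```

In this toy case x1 would be prioritised, while x3 has an index close to zero. A first-order index near zero alone does not justify fixing a factor when interactions are present, which is why the two objectives are treated separately.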

This study applies five methods of sensitivity analysis, each of which is a representative technique from a broader class of methods. The selected methods are summarised in Table 1 and include: relative–relative sensitivity functions (based on derivatives, local SA), Morris Screening (screening method), Standardised Regression Coefficients (based on regression), Extended Fourier Amplitude Sensitivity Testing (variance decomposition) and the Kullback–Leibler divergence (based on entropy).
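As an illustration of the regression-based class, the sketch below computes Standardised Regression Coefficients for two hypothetical toy models: a linear additive one and a first-order decay Y = exp(-k t). The factor names k and t, their ranges and both models are assumptions made for this example and are not taken from the reactor model. For independent factors the sum of the squared coefficients approximates the coefficient of determination R², so a value well below one signals that the linear regression does not capture the model behaviour.

```python
import numpy as np

def standardised_regression_coefficients(X, y):
    """Fit y ~ b0 + sum_i b_i * x_i by least squares and return (beta, r2),
    where beta_i = b_i * std(x_i) / std(y)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    beta = coef[1:] * X.std(axis=0) / y.std()
    residuals = y - A @ coef
    r2 = 1.0 - residuals.var() / y.var()
    return beta, r2

rng = np.random.default_rng(1)
n = 20_000
k = rng.uniform(0.5, 1.5, n)  # hypothetical rate constant
t = rng.uniform(1.0, 5.0, n)  # hypothetical contact time
X = np.column_stack([k, t])

models = {
    "linear": 2.0 * k - 0.5 * t,  # linear and additive
    "decay": np.exp(-k * t),      # non-linear with a k-t interaction
}

results = {}
for name, y in models.items():
    beta, r2 = standardised_regression_coefficients(X, y)
    results[name] = (beta, r2)
    print(name, "beta =", beta, "sum(beta^2) =", np.sum(beta**2), "R^2 =", r2)
```

For the linear model R² is one up to rounding, whereas for the decay model a noticeable share of the variance is left unexplained by the regression.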

The SA methods are applied to a model that predicts the degradation of micropollutants for an ozone reactor in drinking water treatment. The findings of this paper can be generalised to any study of transformation processes in environmental technology when the objective is to predict the degradation or transformation of a compound or pollutant.

For the derivation of the model and the expert elicitation of uncertainty the reader is referred to Neumann et al. (2009), where the results from a sensitivity analysis obtained with the Extended-FAST method are discussed in detail from a process engineering perspective. The current study does not give a physical interpretation of the results but focuses on comparing five SA methods in view of three SA objectives.

The study is structured as follows: The theoretical framework for each of the five SA methods is introduced in detail, followed by a brief summary of the reactor model and the selected chemical compounds as well as a description of computational aspects and numerical settings. The results obtained with the five methods are then presented for the selected micropollutants. A discussion compares their performance in view of the three SA objectives and assesses the broader implications for selecting an SA method when modelling pollutant transformation.

Section snippets

Framing and terminology

The objective is to apply and compare five SA methods to estimate how the predicted relative residual pollutant concentration Y (outlet concentration/inlet concentration) is affected by the uncertainty in the model factors x_i. In this study the term factor or model factor is limited to describe model inputs and model parameters exhibiting epistemic uncertainty (uncertainty due to lack of knowledge).

Derivative-based methods: relative–relative sensitivity functions

The most frequently applied methods in SA are based on derivatives. They assess how the model

MC simulation

Fig. 1 shows the empirical cumulative distributions for the predicted relative residual concentrations Y for the three compounds (logarithmic scale). The mean relative residual concentrations Y for MTBE, bezafibrate and beta-cyclocitral are 0.71, 3.6e-2 and 3.0e-6 respectively. The cumulative distributions allow a probabilistic interpretation of the treatment performance: F(Y) on the ordinate is the probability of the relative residual concentration being below Y (abscissa).
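The probabilistic reading of the cumulative distribution can be sketched as follows. The Monte Carlo sample here is drawn from a log-normal distribution purely as a stand-in for model output, and the limit value of 0.05 is hypothetical; neither is taken from the study.

```python
import numpy as np

def ecdf(samples):
    """Return sorted sample values and the empirical CDF F(Y) at those values."""
    y_sorted = np.sort(samples)
    f = np.arange(1, y_sorted.size + 1) / y_sorted.size
    return y_sorted, f

rng = np.random.default_rng(2)
# Hypothetical stand-in for Monte Carlo model output (positively skewed).
y_samples = rng.lognormal(mean=-4.0, sigma=1.0, size=10_000)

y_sorted, f = ecdf(y_samples)

limit = 0.05  # hypothetical limit on the relative residual concentration
p_below = np.mean(y_samples <= limit)  # F(limit), read off the ordinate
p_exceed = 1.0 - p_below               # probability of exceeding the limit

print(f"P(Y <= {limit}) = {p_below:.3f}, P(Y > {limit}) = {p_exceed:.3f}")
```

Plotting f against y_sorted on a logarithmic abscissa would give a curve of the same kind as in Fig. 1.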

Sensitivity analysis

The results obtained

Factors prioritisation

For factors prioritisation, the effect of moving outside the application ranges of both the derivative-based and regression-based methods is clearly observed when the non-linearity and interactions increase: The ratio of [sum(s_rr²):sum(β²):sum(S)] changes from MTBE [0.98:0.99:0.98] to bezafibrate [0.58:0.84:0.87] to beta-cyclocitral [0.001:0.25:0.50], where sum(S) correctly identifies the contribution of first-order effects in explaining the variance of Y. Although the rankings

Conclusions

• Five sensitivity analysis methods based on derivatives, screening, regression, variance decomposition and entropy were applied to a model predicting micropollutant degradation in drinking water treatment.
• The results were compared with respect to factors prioritisation (detecting important factors), factors fixing (detecting non-influential factors) and factors mapping (detecting which factors are responsible for causing pollutant limit exceedances).
• The capability of the different methods in

References (21)

F. Campolongo et al. An effective screening design for sensitivity analysis of large models. Environ Model Software (2007)
M.B. Neumann et al. Uncertainty in prediction of disinfection performance. Water Res (2007)
M.B. Neumann et al. Global sensitivity analysis for model-based prediction of oxidative micropollutant transformation during drinking water treatment. Water Res (2009)
S.G. Reid. Specification of design criteria based on probabilistic measures of design performance. Struct Saf (2002)
A. Saltelli et al. How to avoid a perfunctory sensitivity analysis. Environ Model Software (2010)
L. Vezzaro et al. Modelling the fate of organic micropollutants in stormwater ponds. Sci Total Environ (2011)
J. Yang. Convergence and uncertainty analyses in Monte-Carlo based sensitivity analysis. Environ Model Software (2011)
R.I. Cukier et al. Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients. I Theory. J Chem Phys (1973)
G.M. Hornberger et al. An approach to the preliminary analysis of environmental systems. J Environ Manage (1981)
F.Y. Hsieh et al. A simple method of sample size calculation for linear and logistic regression. Stat Med (1998)

There are more references available in the full text version of this article.

