Optimal estuarine sediment monitoring network design with simulated annealing

https://doi.org/10.1016/j.jenvman.2005.04.024

Abstract

An objective function based on geostatistical variance reduction, constrained to the reproduction of the probability distribution functions of selected physical and chemical sediment variables, is applied to the selection of the best set of compliance monitoring stations in the Sado river estuary in Portugal. These stations were to be selected from a large set of sampling stations from a prior field campaign. Simulated annealing was chosen to solve the resulting optimisation model. Both the structure of the combinatorial problem and the resulting candidate sediment monitoring networks are discussed, and an optimal dimension and spatial distribution are proposed. An optimal network of sixty stations was obtained from an original 153-station sampling campaign.

Introduction

A well-designed, ongoing monitoring program is fundamental for the evaluation of the environmental management of natural systems (Kay and Alder, 2000). The design of an effective monitoring program depends on the management objectives, the resources (funding and staff) and the available technology. Monitoring programmes should be designed to contribute to a synthesis of information, to evaluate impacts, or to analyse the complex cross-linkages between environmental quality aspects, impacts and socio-economic driving forces (RIVM, 1994).

The technical design of monitoring networks involves the determination of: (i) monitoring sites; (ii) monitoring frequencies; (iii) variables to be sampled; and (iv) duration of sampling (the last two are not discussed here because they are case-specific). Most of the research results in this area have been obtained in the context of statistical procedures (Sanders et al., 1983; Moss, 1986; IAHS, 1986; Cochran, 1977). These rely on the principle that there are several sources of uncertainty: measurement errors, inherent heterogeneity of the variables involved and, where modelling is used, simplifications and errors in both the modelling and the numerical/analytical solution phases. McBratney et al. (1981), as well as many other authors after them, indicated that uncertainties result from a lack of information, in both quality and quantity, concerning the systems under study, or from spatial and temporal variations of the parameters.

In many monitoring programs a first sampling stage with a large number of locations is undertaken, either because there is no prior information or because it is considered necessary to collect more data. This stage is usually planned to give statistical information about the variables under study and to allow their spatial covariance to be calculated. A second stage is then needed to transform the original set of sampling stations, with high cardinality, into a lower-cardinality set of monitoring stations. Probably the most widely used methods to reduce cardinality are those based on the maximisation of spatial accuracy or, in other words, on the minimisation of the variance of the estimation error, also known as variance reduction methods. This is usually carried out in the context of geostatistical theory (Matheron, 1963, 1965) and most frequently by interpolation with an unknown mean, i.e. by ordinary kriging. Other promising methods have been proposed for optimising monitoring network design, in particular those based on information theory, as in the articles by Amorocho and Espildora (1973), Caselton and Husain (1980), Caselton and Zidek (1984), Harmancioglu and Yevjevich (1987), Husain (1989) and Harmancioglu and Alspaslan (1992). Despite their elegance, these methods are limited by the need to assume a probability distribution for the variables, which may be unknown or difficult to determine; moreover, they are particularly well suited to variables with identical probability distributions (usually normal or lognormal). When soft and other sources of information are available, the Bayesian Maximum Entropy geostatistical method, first developed by George Christakos (Christakos, 1990, 1992), has been shown to outperform ordinary kriging (D'Or et al., 2001); it also has the advantage over the latter of not requiring the specification of particular probability distributions.

Kriging variance has been used extensively for monitoring network design. Examples can be found in the work of Bras and Rodríguez-Iturbe (1976), Rouhani (1985), Loaiciga (1989), Rouhani and Hall (1988), Pardo-Igúzquiza (1998), van Groenigen et al. (1999), van Groenigen and Stein (1998), and Nunes et al. (2004a,b).

Two categories of monitoring optimisation with variance reduction have been proposed: (i) the local approach (e.g. Amorocho and Espildora, 1973); and (ii) the global approach (e.g. Ahmed et al., 1988). In the first, the influence of each additional point is analysed separately: the total variance reduction after adding one point is easily computed by considering the individual values at each initial location or at the points in the vicinity of the point being estimated. In the global approach, average estimation variances are used, so global approaches provide only average answers to monitoring design questions. The global approach is useful for analysing designs still on the drawing board or for performing extensive redesigns aimed at maintaining the efficiency of a monitoring network, which may require the removal of poorly located sites. The local approach, on the other hand, is better suited to optimally expanding an existing network. The optimality in this case relates only to the additional points, which may not be acceptable if the original points are not optimal (Markus et al., 1999).
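To make the global criterion concrete, the sketch below computes the ordinary kriging estimation error variance at a point and averages it over an estimation grid. It is a minimal illustration under assumed settings: the spherical covariance model, its parameters and the function names are illustrative and are not the variogram fitted to the Sado sediment data.

import numpy as np

def spherical_cov(h, sill=1.0, rng=2000.0, nugget=0.0):
    # Isotropic spherical covariance model (illustrative parameters only).
    h = np.asarray(h, dtype=float)
    c = np.where(h < rng,
                 sill * (1.0 - 1.5 * h / rng + 0.5 * (h / rng) ** 3),
                 0.0)
    return np.where(h == 0.0, nugget + sill, c)

def ok_variance(stations, x0, cov=spherical_cov):
    # Ordinary kriging estimation error variance at point x0, given the
    # (n, 2) array of station coordinates.
    n = len(stations)
    d = np.linalg.norm(stations[:, None, :] - stations[None, :, :], axis=-1)
    A = np.empty((n + 1, n + 1))
    A[:n, :n] = cov(d)
    A[n, :n] = 1.0                 # unbiasedness constraint
    A[:n, n] = 1.0
    A[n, n] = 0.0
    b = np.empty(n + 1)
    b[:n] = cov(np.linalg.norm(stations - x0, axis=-1))
    b[n] = 1.0
    sol = np.linalg.solve(A, b)
    lam, mu = sol[:n], sol[n]
    return cov(np.array([0.0]))[0] - lam @ b[:n] - mu

def average_ok_variance(stations, grid):
    # Global criterion: average kriging variance over the estimation grid.
    return float(np.mean([ok_variance(stations, x0) for x0 in grid]))

A local criterion would instead record the change in ok_variance at, or around, individual locations when a single station is added or removed.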

The minimisation of average kriging variance was applied here to select the number and positions of sediment monitoring stations in the Sado river estuary, on the southwest coast of Portugal (Fig. 1), in such a way that the different physically and chemically homogeneous areas identified in a prior sampling campaign were taken into account. This monitoring network will later be integrated into an environmental data management system for the Sado Estuary as a decision support tool for local authorities. The Sado Estuary is an example where environmental problems are not well managed, owing to its high natural value and the diverse pressures for development, and where the right tools to help evaluate environmental quality status still need to be developed. The objective here was the development of a monitoring network that constitutes one of the information sources of the Sado Estuary management system (physico-chemical sediment quality data).

For practical and budgetary reasons the number of monitoring stations should be reduced to a minimum. The optimisation problem can be stated in a very simple way: maximise the spatial accuracy, subject to a maximum number of stations, given the information collected in a prior sampling program (153 sampling sites). Maximisation of spatial accuracy is easily attained by minimising the variance of the estimation error, though incorporating the patchiness of homogeneous areas is a more difficult problem. One alternative would be to fix several locations inside the different homogeneous areas, but then the choice of stations would be arbitrary. Another way is to use stratification, requiring that a defined number of stations be placed inside each homogeneous area. Stratification is a well-known statistical technique used for designing monitoring (or sampling) programs with denser networks in some areas than in others. The difference in sampling density may be based, for example, on spatial autocovariances, statistical risk of contamination, plume detection probabilities or empirical judgement, among many others. Here we propose a statistically based stratification: homogeneous areas are monitored according to the frequency with which they appear in the prior sampling program. The inclusion of homogeneous areas was considered important by the manager because sediment granulometry and physical and chemical characteristics are strongly correlated with the amount of xenobiotics the sediment can retain, and because these areas were planned to be the geographic spatial units of an environmental management system. Hence, four types of sediment were established on the basis of three physical and chemical variables, and the manager required that the proportions of stations among the four sediment types in the monitoring network be similar to those of the sampling campaign (thus the constraint on the proportions).
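As an illustration of this proportionality constraint, the sketch below derives per-class station quotas from the class frequencies observed in the sampling campaign. The function name, labels and rounding rule are assumptions for illustration, not the exact procedure used in the study.

import numpy as np

def class_quotas(sediment_class, n_monitor):
    # Allocate monitoring stations to sediment classes so that the network
    # reproduces the class proportions of the prior sampling campaign.
    classes, counts = np.unique(sediment_class, return_counts=True)
    proportions = counts / counts.sum()
    quotas = np.floor(proportions * n_monitor).astype(int)
    # Give the stations lost to rounding to the classes with the largest remainders.
    remainders = proportions * n_monitor - quotas
    for i in np.argsort(remainders)[::-1][: n_monitor - quotas.sum()]:
        quotas[i] += 1
    return dict(zip(classes, quotas))

# e.g. with the 153 sampling stations labelled by one of the four sediment
# types, class_quotas(labels, 60) gives the number of stations required per type.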

Optimisation then consists of finding an optimal subset: a combination of stations taken from a larger set. Even for relatively small set cardinalities the number of combinations is too high to allow them all to be evaluated exhaustively in a reasonable amount of time. One of the best-known algorithms for solving such combinatorial problems is simulated annealing, which has been widely used in sampling/monitoring network optimisation (e.g. Meyer et al., 1994; Pardo-Igúzquiza, 1998; van Groenigen et al., 1999; Brus et al., 2000, 2002; Nunes et al., 2004a,b).
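A minimal sketch of such a search is given below, assuming an objective such as the average kriging variance sketched earlier and the class quotas from the previous sketch; the swap move, the geometric cooling schedule and all parameter values are illustrative and are not those calibrated for this study.

import math
import random

def anneal_subset(candidates, classes, quotas, objective,
                  t0=1.0, alpha=0.95, steps_per_t=100, t_min=1e-4, seed=0):
    # Simulated annealing over subsets of candidate stations. 'classes' maps a
    # station to its sediment class; swaps are made within a class so that the
    # quotas (proportion constraint) hold throughout the search.
    rng = random.Random(seed)
    by_class = {c: [s for s in candidates if classes[s] == c] for c in quotas}
    current = [s for c, q in quotas.items() for s in rng.sample(by_class[c], int(q))]
    cost = objective(current)
    best, best_cost = list(current), cost
    t = t0
    while t > t_min:
        for _ in range(steps_per_t):
            out_station = rng.choice(current)
            pool = [s for s in by_class[classes[out_station]] if s not in current]
            if not pool:
                continue
            in_station = rng.choice(pool)
            trial = [in_station if s == out_station else s for s in current]
            trial_cost = objective(trial)
            # Metropolis acceptance criterion.
            if trial_cost < cost or rng.random() < math.exp((cost - trial_cost) / t):
                current, cost = trial, trial_cost
                if cost < best_cost:
                    best, best_cost = list(current), cost
        t *= alpha  # geometric cooling
    return best, best_cost

In this setting 'objective' would wrap a routine such as average_ok_variance, evaluated over an estimation grid covering the estuary using the coordinates of the selected stations.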

The article is divided into five sections. This Introduction is followed by a second section in which the theoretical geostatistical and optimisation framework is presented. In that section the geostatistical parameter most frequently used to measure accuracy, the kriging estimation error variance, is explained and compared with another geostatistical measure of accuracy, the fictitious point estimation error variance; the simulated annealing heuristic used to solve the optimisation problem is also introduced. In the third section the case study is presented and the data transformations are explained, while in the fourth section the optimisation results are discussed. Finally, in the last section, the most important conclusions are drawn.

Section snippets

Estimation of probability distribution functions

Indicator coding consists of transforming a continuous or discrete variable, Z(x), into a discrete (0,1) variable, the indicator I(x). Considering a threshold value z_c on Z, I(x) is equal to 1 if Z(x) ≤ z_c, and 0 otherwise. The variable at each location is thereby transformed into a distribution function, i.e. the probability of not exceeding the threshold can be calculated within a region. With a sufficiently large number of thresholds the prior (and posterior) probability distribution of Z is calculated at each
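A minimal sketch of this indicator transformation, with made-up thresholds and values:

import numpy as np

def indicator_transform(z, thresholds):
    # For each threshold z_c, I(x) = 1 if Z(x) <= z_c and 0 otherwise.
    z = np.asarray(z, dtype=float)
    return np.stack([(z <= zc).astype(int) for zc in thresholds], axis=-1)

# A sediment variable coded at three thresholds (illustrative values):
z = np.array([1.2, 3.4, 0.7, 5.1])
I = indicator_transform(z, thresholds=[1.0, 3.0, 5.0])
# Column means of I estimate Prob[Z <= z_c], i.e. the distribution function
# of Z at the chosen thresholds.
print(I.mean(axis=0))   # -> [0.25 0.5 0.75]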

Study area and source data

The Sado Estuary is the second largest estuary in Portugal with an area of approximately 24,000 hectares. It is located on the west coast of Portugal, 45 km south of Lisbon (Fig. 1). Most of the estuary is classified as a nature reserve. The Sado Estuary basin is subject to intensive land-use practices and plays an important role in the local and national economy. Most of the activities in the estuary (e.g. industry, shipping, intensive farming, tourism and urban development) have negative

Feasible space

The number of combinations of Ω sampling stations with ω possible monitoring stations is given by the well-known formula $W = \frac{\Omega!}{(\Omega-\omega)!\,\omega!}$. Now, if one wants to calculate the combinations conditioned to the reproduction of the proportions, the expression becomes
$$W = \prod_{i=1}^{k} \frac{\Omega_i!}{(\Omega_i-\omega_i)!\,\omega_i!}$$

where i is the indicator number, $\Omega_i$ the number of sampling stations with indicator i, and $\omega_i$ the number of monitoring stations with indicator i imposed by conditioning. The number of combinations in each
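For a sense of scale, the feasible-space sizes can be computed directly, as below; the per-class counts are hypothetical and only illustrate how conditioning on the proportions shrinks the feasible space.

from math import comb, prod

# Unconstrained: choose omega monitoring stations out of Omega sampling stations.
Omega, omega = 153, 60
W_unconstrained = comb(Omega, omega)

# Conditioned on reproducing the proportions: product of per-class combinations.
# The split of the 153 stations among four sediment types is hypothetical.
Omega_i = [60, 45, 30, 18]   # sampling stations per indicator class
omega_i = [24, 18, 12, 6]    # monitoring stations imposed per class
W_conditioned = prod(comb(O, w) for O, w in zip(Omega_i, omega_i))

print(f"{W_unconstrained:.2e} combinations unconstrained, "
      f"{W_conditioned:.2e} after conditioning")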

Conclusions

The following conclusions can be drawn: (i) objective function conditioning is necessary to guarantee reproduction of the probability density functions of the indicator variables; (ii) the higher the conditioning, the closer the posterior (estimated) pdf is to the prior (data) pdf; (iii) conditioning with δ<0.3 leads to extremely long running times and has been shown to be unnecessary; (iv) if no conditioning is used the estimation error variance increases with the rise in the number of monitoring

References (47)

  • R.L. Bras et al., Network design for the estimation of areal mean of rainfall events, Water Resources Research (1976)
  • D.J. Brus et al., Designing efficient sampling schemes for reconnaissance surveys of contaminated bed sediments in water courses, Geologie en Mijnbouw-Netherlands Journal of Geosciences (2000)
  • D.J. Brus et al., Optimising two- and three-stage designs for spatial inventories of natural resources by simulated annealing, Environmental and Ecological Statistics (2002)
  • S. Caeiro et al., Delineation of estuarine management areas using multivariate geostatistics: the case of Sado Estuary, Environmental Science and Technology (2003)
  • W.F. Caselton et al., Hydrologic networks: information transmission, Journal of the Water Resources Planning and Management Division (1980)
  • W.G. Cochran, Sampling Techniques (1977)
  • H. Cohn et al., Simulated annealing: searching for an optimal temperature schedule, SIAM Journal on Optimization (1999)
  • G. Christakos, A Bayesian/maximum-entropy view to the spatial estimation problem, Mathematical Geology (1990)
  • G. Christakos, Random Field Models in Earth Sciences (1992)
  • C.V. Deutsch et al., GSLIB, Geostatistical Software Library and User's Guide (1992)
  • D. D'Or et al., Application of the BME approach to soil texture mapping, Stochastic Environmental Research and Risk Assessment (2001)
  • S. Geman et al., Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images, IEEE Transactions on Pattern Analysis and Machine Intelligence (1984)
  • J.J. Gruijter et al., Continuous soil maps—a fuzzy set approach to bridge the gap between aggregation levels of process and distribution models, Geoderma (1997)