Prediction of urban stormwater quality using artificial neural networks

https://doi.org/10.1016/j.envsoft.2008.07.004

Abstract

Many complex, interrelated processes influence urban stormwater quality. However, the lack of measured fundamental variables prevents the construction of process-based models. Furthermore, hybrid models such as buildup-washoff models are generally crude simplifications of reality. This has created the need for statistical models capable of making use of readily accessible data. In this paper, artificial neural networks (ANN) were used to predict stormwater quality at urbanized catchments located throughout the United States. Five constituents were analysed: chemical oxygen demand (COD), lead (Pb), suspended solids (SS), total Kjeldahl nitrogen (TKN) and total phosphorus (TP). Multiple linear regression equations were initially constructed from logarithmically transformed data. Input variables were primarily selected using a stepwise regression approach, combined with process knowledge. Variables found significant in the regression models were then used to construct ANN models. Other important network parameters such as learning rate, momentum and the number of hidden nodes were optimized using a trial and error approach. The final ANN models were then compared with the multiple linear regression models. In summary, ANN models were generally less accurate than the regression models and more time consuming to construct. This implies that ANN models are not more applicable than regression models when predicting urban stormwater quality.

Introduction

As urbanization proceeds, so does the degradation of receiving waters. In order to minimize the subsequent damage to aquatic ecosystems, the extent of the problem must be known. Unfortunately, sampling programs are very expensive to carry out (Driver and Tasker, 1990; Brezonik and Stadelmann, 2002; Sliva and Williams, 2001). Furthermore, planning level estimates are often required prior to the urbanization of natural catchments. This has created the need to predict urban stormwater quality at unmonitored catchments. The high variability of event mean concentrations (EMC) at single sites undermines the validity of applying simplistic representative estimates of site mean concentrations. Furthermore, EMC variability is also observed between sites. This suggests the need for complex models capable of predicting EMC variability at single sites and between multiple sites.

The construction of process-based models is difficult. Essential calibration data may not be readily accessible (Loke et al., 1999), resulting in large inaccuracies when calibration parameters are estimated without the use of data from the site of interest. Hybrid models are also limited, often being only crude approximations of reality. For example, buildup-washoff models ignore potentially significant processes, including the rainout and washout of nitrogen compounds, pervious area erosion and the stream scour of sediments (Corbett et al., 1997). Furthermore, most buildup-washoff models incorrectly assume that all available accumulated pollutant is washed off during a given storm (Vaze and Chiew, 2002). This limitation is compounded by the fact that pollutant accumulation cannot be directly measured. Consequently, Huber (1992) stated that the use of literature values to predict buildup could lead to model predictions being in error by more than an order of magnitude. This has created the need for statistical models capable of predicting urban stormwater quality at unmonitored sites.

Two extensive, statistically based studies have been previously undertaken to predict urban stormwater quality. The first study by Driver and Tasker (1990) used data from the Nationwide Urban Runoff Program (NURP) to construct multiple linear regression models capable of predicting EMCs at sites located throughout the United States. The second study by Brezonik and Stadelmann (2002) also used multiple linear regression models to predict EMCs at watersheds in the Twin Cities metropolitan area, Minnesota, USA. The use of logarithmically transformed data in each of these two studies allowed the simplistic representation of nonlinear relationships. However, such relationships were limited to potentially oversimplified power relationships. These relationships were deemed to be rather crude approximations of the complex assortment of nonlinear relationships present in the environmental systems under study. Unfortunately, the vast array of complex, interrelated processes influencing urban stormwater quality is difficult to define prior to model construction. This has created the demand for more complex models such as artificial neural networks.
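The restriction to power relationships follows directly from the log transformation: a model that is linear in the logarithms becomes a product of powers once it is back-transformed. The notation below (y for the EMC, x_i for the explanatory variables, β_i for the fitted coefficients) and the choice of base-10 logarithms are assumptions made here for illustration and are not taken from either study.

```latex
\log_{10} y = \beta_0 + \sum_{i=1}^{n} \beta_i \log_{10} x_i
\quad\Longleftrightarrow\quad
y = 10^{\beta_0} \prod_{i=1}^{n} x_i^{\beta_i}
```

Each explanatory variable can therefore only act through a single fixed exponent, which is why such models cannot represent thresholds or interactions between variables.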

Artificial neural networks (ANN) are information processing structures inspired by the functioning of the brain. They are composed of a vast, interconnected structure of processing elements. The computational power of these processing elements is minimal when in isolation. However, within large networks, the computational power is massive. The parallel distribution of information within ANNs provides the capacity to model complicated, nonlinear, interrelated processes. This ultimately allows ANNs to model environmental systems without prior specification of the algebraic relationships between variables (Lek et al., 1999). This has led to the use of ANNs in many water resources applications (Holmberg et al., 2006; Mazvimavi et al., 2005; Riad et al., 2004; Sarangi and Bhattacharya, 2005; Tayfur et al., 2005).

Despite their strong theoretical potential, ANNs are subject to a number of practical challenges. In particular, it is widely recognized that the generalisation ability of an ANN depends on network topology and on the selection of key network parameters, including the transfer function, the error function, the learning rate and the momentum (Goethals et al., 2007). A trial and error approach is often used to optimize ANN models, which can be extremely time consuming. Two types of automated techniques are available to select network architecture: pruning algorithms and constructive algorithms (Maier and Dandy, 1998). However, these techniques require additional network parameters of their own to be specified, which limits their practicality.
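For readers unfamiliar with the two training parameters named above, the textbook form of the gradient descent weight update with momentum is given below. The text here does not spell out the training rule, so this form and its symbols (w for a network weight, E for the error function, η for the learning rate, α for the momentum, t for the iteration) are assumptions made for illustration.

```latex
\Delta w(t) = -\eta \, \frac{\partial E}{\partial w}\bigg|_{w(t)} + \alpha \, \Delta w(t-1),
\qquad
w(t+1) = w(t) + \Delta w(t)
```

The learning rate scales the size of each step, while the momentum carries over a fraction of the previous step. Both affect whether training converges, stalls or oscillates, which is why they are usually tuned together.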

In the current study, ANN models were used to predict urban stormwater quality at unmonitored sites located throughout the United States. Model inputs were selected using multiple linear regression, and the remaining network parameters were optimized using trial and error. The final models were then compared with multiple linear regression models to determine their applicability.
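The trial and error step can be pictured as a search over candidate settings, each scored on data held back from training. The sketch below is illustrative only: the tiny one-input network, the synthetic data, the candidate values in `grid` and the function names (`train_mlp`, `mse`, `trial_and_error`) are all assumptions introduced here, not the networks, data or settings used in the study.

```python
import math
import random
from itertools import product


def train_mlp(data, hidden, lr, momentum, epochs=200, seed=0):
    """Train a 1-input, 1-output network with one tanh hidden layer using
    per-sample gradient descent with momentum; return a predict function."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input -> hidden
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden -> output
    b2 = 0.0
    # Previous weight changes, needed for the momentum term.
    v_w1, v_b1, v_w2, v_b2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
    for _ in range(epochs):
        for x, y in data:
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            out = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = out - y  # derivative of 0.5 * err^2 with respect to out
            for j in range(hidden):
                g_w2 = err * h[j]
                g_hid = err * w2[j] * (1.0 - h[j] ** 2)  # back-propagated signal
                v_w2[j] = momentum * v_w2[j] - lr * g_w2
                v_w1[j] = momentum * v_w1[j] - lr * g_hid * x
                v_b1[j] = momentum * v_b1[j] - lr * g_hid
                w2[j] += v_w2[j]
                w1[j] += v_w1[j]
                b1[j] += v_b1[j]
            v_b2 = momentum * v_b2 - lr * err
            b2 += v_b2

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(hidden)) + b2

    return predict


def mse(predict, data):
    """Mean squared error of a predict function over (x, y) pairs."""
    total = 0.0
    for x, y in data:
        e = predict(x) - y
        total += e * e
    return total / len(data)


def trial_and_error(train, valid, grid):
    """Try every combination in the grid; keep the lowest validation error."""
    best_score, best_cfg = float("inf"), None
    for hidden, lr, momentum in product(grid["hidden"], grid["lr"], grid["momentum"]):
        model = train_mlp(train, hidden, lr, momentum)
        score = mse(model, valid)
        if score < best_score:
            best_score = score
            best_cfg = {"hidden": hidden, "lr": lr, "momentum": momentum}
    return best_score, best_cfg


# Synthetic, hypothetical data: a smooth nonlinear response on [0, 1].
data_rng = random.Random(1)
points = [(x, math.sin(3.0 * x)) for x in (data_rng.random() for _ in range(40))]
train, valid = points[:30], points[30:]

# Hypothetical candidate values, chosen only to keep the example small.
grid = {"hidden": (2, 4), "lr": (0.01, 0.05), "momentum": (0.0, 0.5)}
best_score, best_cfg = trial_and_error(train, valid, grid)
print(best_cfg, best_score)
```

Even this small grid requires eight separate training runs, which hints at why a manual search over topology, learning rate and momentum becomes time consuming for real data sets.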

Methods

As part of the Nationwide Urban Runoff Program (NURP) conducted in the 1970s and 1980s, the USGS and the USEPA collected water quality, climatic and geographic data from sites located throughout the United States. The data from this program was collated by the Cahaba/Warrier Student Chapter of the American Water Resources Association (1998) and used in the current study. A total of five water quality constituents were analysed in the current study: chemical oxygen demand (COD), lead (Pb),

Results and discussion

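The regression equations used to predict chemical oxygen demand, lead, suspended solids, total Kjeldahl nitrogen and total phosphorus are presented below:

\[
\begin{aligned}
\log(\mathrm{COD}) &= 3.15 + 0.09\log(\mathrm{EFFIA}) - 0.22\log(\mathrm{P3RN}) + 0.10\log(\mathrm{QMAX}) - 0.39\log(\mathrm{MAP}) - 0.55\log(\mathrm{TRN}) \\
\log(\mathrm{Pb}) &= 1.65 - 0.16\log(\mathrm{DRF}) + 0.57\log(\mathrm{IA}) + 0.06\log(\mathrm{LUR}) - 0.16\log(\mathrm{P7RN}) + 0.41\log(\mathrm{QMAX}) + 3.26\log(\mathrm{SD}) - 0.38\log(\mathrm{TRFD}) + 0.59\log(\mathrm{VS}) \\
\log(\mathrm{SS}) &= 4.46 - 0.49\log(\mathrm{DRN}) + 0.22\log(\mathrm{LUN}) + 0.65\log(\mathrm{M5RR}) - 0.78\log(\mathrm{MAR}) - 0.29\log(\mathrm{RAI}) + 0.73\log(\mathrm{VS}) \\
\log(\mathrm{TKN}) &= 0.23 + 0.14\log(\mathrm{EFFIA}) + 0.29\log(\mathrm{NH4CR}) - 0.18\log(\mathrm{P7RN})
\end{aligned}
\]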
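As a rough illustration of how a log-log equation of this kind would be applied, the sketch below evaluates the COD equation and back-transforms the result. Several things are assumptions rather than facts from the paper: the negative signs on the P3RN, MAP and TRN terms, the use of base-10 logarithms, the helper name `predict_log_linear`, and every input value in `hypothetical_site`, which is made up and carries no particular units.

```python
import math

# COD coefficients; the negative signs are an assumption about the equation.
COD_INTERCEPT = 3.15
COD_COEFFS = {"EFFIA": 0.09, "P3RN": -0.22, "QMAX": 0.10, "MAP": -0.39, "TRN": -0.55}


def predict_log_linear(intercept, coeffs, inputs):
    """Evaluate log10(y) = intercept + sum(b * log10(x)) and return y."""
    log_y = intercept + sum(b * math.log10(inputs[name]) for name, b in coeffs.items())
    return 10.0 ** log_y


# Made-up input values for a hypothetical site; all must be positive.
hypothetical_site = {"EFFIA": 30.0, "P3RN": 0.5, "QMAX": 2.0, "MAP": 40.0, "TRN": 1.0}
cod = predict_log_linear(COD_INTERCEPT, COD_COEFFS, hypothetical_site)
print(cod)
```

Because the model is linear in the logarithms, doubling one input multiplies the prediction by 2 raised to that input's coefficient, regardless of the values of the other inputs.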

Conclusions

In the present study, the ability of ANN models to predict urban stormwater quality was investigated. Input variables used in ANN models were derived from stepwise regression analyses, and key network parameters and topology determined using trial and error. The optimum network parameters and network topology varied from one water quality constituent to another. However, the number of hidden nodes was typically found to have slightly more of an influence on network accuracy than either learning

Acknowledgements

We acknowledge Nancy Driver from USGS, USA for providing essential raw data for the study, and Pam Davy from University of Wollongong for assisting with the statistical analyses.
