Published in: Complex & Intelligent Systems 1/2022

Open Access 15.04.2021 | Original Article

Rainfall and runoff time-series trend analysis using LSTM recurrent neural network and wavelet neural network with satellite-based meteorological data: case study of Nzoia hydrologic basin

Authors: Yashon O. Ouma, Rodrick Cheruyot, Alice N. Wachera


Abstract

This study compares the LSTM recurrent neural network and the wavelet neural network (WNN) for spatio-temporal prediction of rainfall and runoff time-series trends in scarcely gauged hydrologic basins. Using long-term in situ observed data for 30 years (1980–2009) from ten rain gauge stations and three discharge measurement stations, the rainfall and runoff trends in the Nzoia River basin are predicted through satellite-based meteorological data comprising precipitation, mean temperature, relative humidity, wind speed and solar radiation. The prediction modelling was carried out in three sub-basins corresponding to the three discharge stations. LSTM and WNN were implemented with the same deep learning topological structure consisting of 4 hidden layers, each with 30 neurons. In the prediction of the basin runoff with the five meteorological parameters using LSTM and WNN, both models performed well with respective R2 values of 0.8967 and 0.8820. The MAE and RMSE measures for LSTM and WNN predictions ranged between 11 and 13 m3/s for the mean monthly runoff prediction. With the satellite-based meteorological data, LSTM predicted the mean monthly rainfall within the basin with R2 = 0.8610 as compared to R2 = 0.7825 using WNN. The MAE for mean monthly rainfall trend prediction was between 9 and 11 mm, while the RMSE varied between 15 and 21 mm. The performance of the models improved with an increase in the number of input parameters, which corresponded to the size of the sub-basin. In terms of computational time, both models converged at the lowest RMSE at nearly the same number of epochs, with WNN taking slightly longer to attain the minimum RMSE. The study shows that in hydrologic basins with scarce meteorological and hydrological monitoring networks, the use of satellite-based meteorological data in deep learning neural network models is suitable for spatial and temporal analysis of rainfall and runoff trends.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Introduction

In sustainable water resources management, the accurate modelling of hydrological processes at watershed scales is a significant contributing factor. In particular, predictions of rainfall and runoff trends are important for different water resource planning activities such as irrigation, flood control, structural design, and eco-hydrological services [1]. In most countries, catchment basins are sparsely gauged, without accurate and adequate rainfall and runoff measurements [2]. This contributes to higher uncertainty in attempts to predict hydrological responses in such areas [3]. Rainfall and runoff can be characterized as random stochastic processes related to complex physical factors within catchments. Because of the spatial and temporal variabilities within watersheds, the patterns and number of variables required for the modelling of rainfall and runoff present a complex hydrologic problem [4, 5].
Generally, the forecasting of time-series data depends on the sequence being modelled and can have different dimensional spatial dependencies. In the prediction of rainfall and runoff time-series trends, physical and conceptual models are traditionally utilized [6]. While conceptual models are considered to be suitable for daily timescale analysis, physical models can be used for daily and sub-daily timescale predictions. Because of the timescale dependencies, the physical and conceptual models are considered unsuitable for accurate prediction of rainfall and runoff, particularly where there is a lack of high-resolution spatial data [7, 9]. Furthermore, these models require physical parameters, which limits their application in the prediction of sequence data with unknown or limited quasi-periodic dynamics [10–12, 50].
Among the statistical methods, the Autoregressive Moving Average (ARMA) and its variants, such as the Autoregressive Integrated Moving Average (ARIMA) and nearest-neighbor methods, have been used for rainfall and runoff predictions. Nevertheless, the accuracy of the statistical methods depends on the quality of the input data, and they can only satisfactorily describe time-series data that exhibit stationary behaviors within and across seasons [45]. To improve on the rainfall and runoff prediction results, data-driven approaches have been proposed. This is attributed to their ability to approximate the inherent patterns and dynamics of series data without knowledge of the parameters, and to take into consideration the stochasticity in observation and system noise. In particular, ANNs have been explored and preferred for rainfall and runoff time-series trend analysis [13]. To predict rainfall and runoff, [14] proposed to use ANN and fuzzy logic, while ANN and ARIMA models were adopted in [15]. Other studies, e.g. [16], utilized a Radial Basis Function (RBF) network and empirical mode decomposition in the prediction of rainfall. The above studies reported that the approaches did not adequately capture the seasonal decomposition and the inherent cyclical fluctuations in rainfall and runoff time sequence data. Towards improving ANN performance in hydrological predictions for the Kentucky River catchment, [17] developed ANN training based on genetic algorithms for streamflow magnitude predictions. For the case study of the data-sparse Malaprabha River Basin in India, [18] developed a modular neural network to capture variable intensities in rainfall and runoff simulations. The results from [18] were superior as compared to the methods in [6, 17, 19].
Despite the aforementioned results, the main drawback of the conventional feedforward ANNs is their tendency to lose significant information on the sequential order of the input data during training. This is attributed to the vanishing gradient effect which occurs as the number of layers increases [13]. Further, the applicability of a single ANN in hydrological phenomena modelling may not reliably capture the localized temporal and seasonal dynamics of rainfall and runoff [18, 20]. Thus, because of the inherent seasonality and non-linear characteristics of rainfall and runoff time-series data, hybrid models such as wavelets and least squares Support Vector Machines [40], wavelet transform and artificial neural network hybrid models [46], wavelet-artificial neural network models compared with the adaptive neuro-fuzzy inference system [48], and singular spectral analysis and discrete wavelet transform hybrid models [55] have been recommended for their simulation and prediction.
The problem of modelling long-term dependencies in time-series data implies that the desired output at time \(t\) depends on input values that occurred at a much earlier time \(\tau \ll t\). As such, the dynamical neural system for such a task should be able to learn to store information for an arbitrary duration (memory) for the minimization of noise corruption. Because the typical feedforward network is not sufficiently powerful to discover contingencies spanning long temporal distances, it easily suffers from the vanishing gradient effect as the number of layers increases. RNNs are better suited to storing long-term time-series data with different temporal scales. However, simple RNNs that depend on the largest eigenvalue of the state-update matrix may have gradients which either increase or decrease exponentially over time. The long short-term memory (LSTM) RNN [25] was developed to improve on the conventional RNN models. LSTM-RNN uses input, output and forget gates to achieve a network that can maintain state and propagate gradients in a stable fashion over long time spans. These networks have been shown to outperform deep feedforward neural networks on a variety of tasks [57]. Due to such capability, LSTM has been applied for rainfall and runoff predictions in [13, 21].
To take into account the non-stationarity in the assimilation of rainfall and runoff time-series data, this study proposes to compare the wavelet neural network (WNN) and the LSTM recurrent neural network. The comparison is based on the fact that for data showing a persistence structure within the series, data-driven models can be considered more appropriate in accounting for their sequence dependency, non-stationarity and non-linearity. Recent investigations have demonstrated that WNN [23] and LSTM [13, 21], as data-driven models, can overcome the constraints of time-series modelling and are suitable for taking into account the quasi-periodicities in rainfall and runoff predictions.
Specifically, the wavelet transform (WT) can be used to analyze the data signal details through signal decomposition into the time–frequency domain. Adopting the discrete wavelet transform (DWT), the rainfall and runoff series data can be decomposed into independent sub-series with periodicity [24, 25, 39]. Further, in temporal sequence predictions, DWT can infer the normally hidden time–frequency information in time-series data. This study thus proposes the wavelet-coupled ANN towards improving the rainfall and runoff prediction model performance, as demonstrated in forecasting streamflows with more reliable results [40, 41]. The wavelet-neural network model has been shown to be superior to the conventional ANN and statistical regression models in rainfall and runoff prediction in different case studies [48, 51, 52].
Compared to WNN, LSTM is capable of dynamically incorporating predecessor or past learning experience due to internal recurrence. LSTM is also considered to be more powerful computationally and topologically more reasonable as compared to the conventional feedforward neural networks without internal states [25]. With this ability, LSTM can automatically project the inherent properties in time-series data for accurate simulation and approximation of the chaotic series. LSTM is also suitable where there is a long delay and accounts for time-series signals with low- and high-frequencies. Compared to the conventional ANN and statistical models such as ARMA and ARIMA, LSTM is capable of robustly learning information contained in time-series data, and can effectively capture the variability of time-series data [23, 42, 43].
To understand the significance of the two neural network models in rainfall and runoff trend analysis, this study explores the implementation of WNN and LSTM in rainfall and runoff trend characterization and prediction within a hydrologic basin with a scarce meteorological and hydrological monitoring network. From the literature, comparisons of the advantages of WNN and LSTM have not been carried out, especially in rainfall and runoff hydrological applications in data-scarce basins. Because of the lack of observed meteorological data, this study also evaluates the significance of satellite-based meteorological data, including rainfall, temperature, relative humidity, wind speed and solar radiation, as input data in rainfall and runoff trend characterization and prediction in the Nzoia River basin in Kenya.

Materials and methods

Study area characterization

The Nzoia River Basin forms part of the larger Lake Victoria basin. The basin is situated within latitudes 1°30′ N and 0°30′ S and longitudes 34°00′ E and 35°45′ E, with an approximate area of 12,700 km2. The elevation ranges between 1100 m and 4000 m AMSL (Fig. 1) [44]. The lower parts of the basin have a flat terrain with a slope mainly ranging between 2 and 6%, whereas the upper part is hilly with more rugged terrain as depicted in Fig. 1. Further details on the study area land-use and climatic characteristics can be found in [27, 44].
In the prediction modelling, the rainfall and streamflow stations are independently modelled with respect to the containing sub-basin as depicted in Fig. 1. Each discharge station was treated as a pour point and sub-basins were delineated to include all the streams draining to each discharge station. In Fig. 1, the spatial distributions of the discharge stations and the ground and satellite meteorological stations are shown.

Data

The basin was divided into upper (1BC01), middle (1DA02) and lower (1EF01) sub-basins according to the locations of the discharge measurement stations (Fig. 1). The in situ rainfall data and satellite-based meteorological data were aggregated to monthly averages for the 30 years (1980–2009) of study.
The satellite meteorological data were downloaded from the National Centers for Environmental Prediction Climate Forecast System Reanalysis (NCEP-CFSR) website (https://climatedataguide.ucar.edu/climate-data/climate-forecast-system-reanalysis-cfsr). The graphical variability of the mean monthly satellite-based and measured meteorological parameters and streamflow is presented in Fig. 2. Notably, from the meteorological data, a linear trend analysis shows an increase in temperature and solar radiation, with a corresponding decrease in relative humidity, wind speed and received precipitation within the basin (Fig. 2a–d). The trends in the precipitation from the in-situ measurements and from satellite data show a marginal difference, with the satellite data overestimating the observed precipitation (Fig. 2e).
Table 1 presents a summary of the mean monthly statistical descriptions of the satellite-based meteorological data and the observed rainfall and streamflow data. Comparing the observed and satellite-based meteorological data, it is noted that the satellite-based rainfall overestimated the measured mean monthly rainfall by approximately 36 mm, and with twice the standard deviation. The streamflow volume is seen to vary according to the location of the pour point and size of the sub-basin.
Table 1
Descriptive statistics of the mean monthly observed and satellite-based meteorological data and streamflow
Parameters                                   Min    Max    Median  Mean   SD     CV (%)  SE
Satellite-based parameters (1980–2009)
 Precipitation (mm)                          0.1    379.3  127.2   142.9  102.7  69      5.30
 Max. Temperature (°C)                       25.5   37.1   31.6    31.4   2.3    7       0.12
 Min. Temperature (°C)                       12.1   17.7   15.8    15.6   0.9    6       0.05
 Relative Humidity (%)                       0.3    0.9    0.6     0.6    0.1    19      0.01
 Wind Speed (m/s)                            1.6    3.9    2.4     2.5    0.5    19      0.02
 Solar Radiation (W/m2)                      14.0   27.2   18.6    19.1   2.7    14      0.13
Observed rainfall data (1980–2009)
 Precipitation (mm)                          0      303.0  110.4   106.6  61.0   60      3.20
Observed discharge data per station (1980–2009), in m3/s
 1BC01                                       0.5    30.4   5.4     6.1    4.6    80      0.34
 1DA01                                       9.0    836.1  59.7    82.2   82.4   100     4.30
 1EF01                                       20.13  900.3  192.1   214.7  152.1  71      7.90
SD standard deviation, CV coefficient of variation, SE standard error

Methods

This section introduces the structure, implementation and validation approach of the neural network models. Figure 3 presents a summary of the implementation strategy and processing flow for the prediction of rainfall and runoff trends with the proposed WNN and LSTM models. In Fig. 3, the missing rainfall data (RF) were interpolated using the Inverse Distance Weighting (IDW) method. The deterministic IDW method was used to interpolate the rainfall points since the missing data constituted less than 5% of the records.
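As an illustration of this gap-filling step, the minimal Python sketch below implements a basic IDW estimator; the station coordinates, rainfall values and the distance power of 2 are hypothetical choices for illustration and are not taken from the study.

```python
import numpy as np

def idw_interpolate(xy_known, values, xy_target, power=2.0):
    """Estimate a missing rainfall value at xy_target from neighbouring
    stations using Inverse Distance Weighting (IDW)."""
    xy_known = np.asarray(xy_known, dtype=float)
    values = np.asarray(values, dtype=float)
    d = np.linalg.norm(xy_known - np.asarray(xy_target, dtype=float), axis=1)
    if np.any(d == 0):                      # target coincides with a station
        return float(values[np.argmin(d)])
    w = 1.0 / d**power                      # closer stations get larger weights
    return float(np.sum(w * values) / np.sum(w))

# Example: fill a missing monthly total (mm) at a gauge from three neighbours
neighbours = [(34.20, 0.55), (34.75, 0.80), (35.10, 1.05)]   # lon, lat (hypothetical)
rainfall = [112.4, 98.7, 131.2]                              # observed totals (mm)
print(idw_interpolate(neighbours, rainfall, (34.60, 0.75)))
```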
In general, time-series data forecasting can be represented as a 2D problem, for example as a \(P \times Q\) phenomenon tensor \(Y \in R^{P \times Q \times k}\) with \(k\) measurements. The spatial observations can be represented in the time dimension with \(T\) time steps as \(Y_{1:T}\). In the forecasting modelling, with \(Y_{1:T}\) as the previous observations, the future sequence of length \(\Delta T\), spanning the interval \((T+1){:}(T+\Delta T)\), is estimated as \(\hat{Y}_{T+1:T+\Delta T}\). The forecasting task in this case is defined as \(\hat{Y}_{T+1:T+\Delta T} = \arg\max_{Y_{T+1:T+\Delta T}} p\left( Y_{T+1:T+\Delta T} \mid Y_{1:T} \right)\), which is the most likely predicted sequence.
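As a minimal illustration of how the observations \(Y_{1:T}\) can be framed as a supervised prediction task for either network, the Python sketch below builds lagged input windows and step-ahead targets; the 12-month lag, 1-month horizon and the random series are illustrative assumptions only, not the study's actual configuration.

```python
import numpy as np

def make_windows(series, n_lags, horizon=1):
    """Frame a univariate time series Y_{1:T} as (X, y) pairs, where each X row
    holds n_lags past values and y holds the next `horizon` values."""
    X, y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        y.append(series[t:t + horizon])
    return np.asarray(X), np.asarray(y)

# Example with 360 monthly values (30 years), a 12-month lag and 1-month-ahead target
monthly = np.random.default_rng(0).random(360)
X, y = make_windows(monthly, n_lags=12, horizon=1)
print(X.shape, y.shape)   # (348, 12) (348, 1)
```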

LSTM recurrent neural networks

RNNs are a special class of neural networks designed for understanding the temporal and dynamic sequences in series data [28–31]. As compared to feedforward networks that pass the data forward from input to output, RNNs have feedback loops where output data can be fed back into the input at some point before being fed forward again for further processing and final output. The advantage of the RNN model is its ability to model sequential time-series data such that each sample can be assumed to be dependent on previous ones. The feedback connections in RNNs provide the memory of previous activations, thus allowing the model to iteratively learn the dynamics of the sequential data in time-steps [42].
Though RNNs exhibit a powerful capability for modelling complex and non-linear time-series data [32], conventional RNNs suffer from the vanishing gradient problem, especially in the backpropagation iterative learning process. Thus, conventional RNNs may not adequately learn from longer time-lag data with dependencies [32]. To solve this problem, [25] proposed the LSTM algorithm (Fig. 3). LSTM has been proven to be more suitable in the simulation of sequence-based problems with long-term dependencies [34, 41].
In the LSTM-RNN model, the memory blocks contain the input gate, output gate and forget gate that replace the hidden units. The gates are responsible for controlling the internal operations of the network. Despite different LSTM variants having been proposed, a comparative analysis shows that the standard LSTM is still the most significant [35, 42], and it was thus adopted and evaluated in this study. Figure 4 shows a sample representation of a single LSTM architecture as adopted in the current study. In addition to the memory gates, the fundamental components of the LSTM network comprise the cell state, sigmoid and tanh activation functions as depicted in Fig. 4. The inclusion and removal of information to and from the cell state are regulated by the gates. The sigmoid activation functions in the gates multiply the inputs by values in the range [0, 1], and thereby determine the data to be included or removed.
With \(x_{t}\) as the input at time step \(t\), and \(h_{t - 1}\) and \(c_{t - 1}\) as the hidden state and cell state of the previous time step, the LSTM updates for time step \(t\) are given in Eqs. 1–5 [25], which describe the operations of a typical LSTM layer.
$$ f_{t} = \sigma \left( {W_{xf} \cdot x_{t} + W_{hf} \cdot h_{t - 1} + W_{cf} \cdot c_{t - 1} + b_{f} } \right) $$
(1)
$$ i_{t} = \sigma \left( {W_{xi} \cdot x_{t} + W_{hi} \cdot h_{t - 1} + W_{ci} \cdot c_{t - 1} + b_{i} } \right) $$
(2)
$$ c_{t} = i_{t} \cdot {\text{tanh}}\left( {W_{xc} \cdot x_{t} + W_{hc} \cdot h_{t - 1} + b_{c} } \right) + f_{t} \cdot c_{t - 1} $$
(3)
$$ o_{t} = \sigma \left( {W_{xo} \cdot x_{t} + W_{ho} \cdot h_{t - 1} + W_{co} \cdot c_{t} + b_{o} } \right), $$
(4)
$$ h_{t} = o_{t} \cdot {\text{tanh}}\left( {c_{t} } \right), $$
(5)
where: \(\sigma\) = non-linear sigmoid function, \(i_{t}\) = input gate, \(W\) = weight matrix, \(x_{t}\) = input at time \(t\), \(c_{t}\) = cell state, \(h_{t}\) = output at time \(t\), \(f_{t}\) = forget gate, \(o_{t}\) = output gate, \(h_{t - 1}\) = hidden state vector of the previous time step, and \(b_{i}\) = input bias vector [25].
During model training, the LSTM network is optimized using the backpropagation algorithm, and the structure of the LSTM hidden layer unit comprises addition and multiplication operations and several activation layers. This minimizes the RNN drawback of the vanishing gradient problem. The implementation of the LSTM is detailed in [13, 21].
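A minimal NumPy sketch of a single forward step through Eqs. 1–5, including the peephole terms \(W_{cf}\), \(W_{ci}\) and \(W_{co}\), is given below; the input size of 5 (the meteorological parameters), the 30 hidden units and the random weights are illustrative assumptions and do not represent the trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One forward pass of an LSTM cell following Eqs. (1)-(5):
    forget gate f_t, input gate i_t, cell state c_t, output gate o_t, output h_t."""
    f_t = sigmoid(W['xf'] @ x_t + W['hf'] @ h_prev + W['cf'] @ c_prev + b['f'])    # Eq. (1)
    i_t = sigmoid(W['xi'] @ x_t + W['hi'] @ h_prev + W['ci'] @ c_prev + b['i'])    # Eq. (2)
    c_t = i_t * np.tanh(W['xc'] @ x_t + W['hc'] @ h_prev + b['c']) + f_t * c_prev  # Eq. (3)
    o_t = sigmoid(W['xo'] @ x_t + W['ho'] @ h_prev + W['co'] @ c_t + b['o'])       # Eq. (4)
    h_t = o_t * np.tanh(c_t)                                                       # Eq. (5)
    return h_t, c_t

# Example: 5 meteorological inputs, 30 hidden units, randomly initialised weights
rng = np.random.default_rng(1)
n_in, n_hid = 5, 30
W = {k: rng.normal(0, 0.1, (n_hid, n_in if k[0] == 'x' else n_hid))
     for k in ['xf', 'hf', 'cf', 'xi', 'hi', 'ci', 'xc', 'hc', 'xo', 'ho', 'co']}
b = {k: np.zeros(n_hid) for k in ['f', 'i', 'c', 'o']}
h, c = lstm_step(rng.random(n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
print(h.shape)   # (30,)
```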

Wavelet neural networks

Wavelet transform (WT)
The WT of a signal is the representation of the data in the time–frequency domain. Through the transformation, the noise components are removed and the signal is decomposed into high- and low-frequency components through high-pass and low-pass filtering operations. In the trend analysis of time-series data, the wavelet transform is useful for the effective capture of the inherent and hidden characteristics and trends, as well as for detecting localized and non-stationary events. It is proposed in this study that the WT is capable of detecting the non-stationarity phenomena in rainfall and runoff data by representing the original signal in low- and high-frequency data components.
As detailed in [25, 36], a mother wavelet function is constructed for the wavelet function. If \(\varphi \left( t \right)\) is a square-integrable function, with \(\varphi \left( t \right) \in L^{2} \left( R \right)\), and if its Fourier transform \(\Psi \left( \omega \right)\) satisfies the admissibility condition (Eq. 6):
$$ \int_{R} {\frac{{\left| {\Psi \left( \omega \right)} \right|^{2} }}{\omega }} d\omega \, < \, \infty , $$
(6)
then \(\varphi \left( t \right)\) is the mother wavelet. Applying the translation (\(\tau\)) and scale (\(a\)) factors of the WT yields the function \(\varphi_{a,\tau } \left( t \right)\) as in Eq. 7:
$$ \varphi_{a,\tau } (t) = a^{-1/2} \Psi \left( {\frac{t - \tau }{a}} \right), \, a > 0, \, \tau \in R, $$
(7)
where \(\varphi_{a,\tau } \left( t \right)\) is the continuous wavelet. The inner product of the input signal \(x(t)\) and \(\varphi_{a,\tau } \left( t \right)\) is calculated as in Eq. 8, with its Fourier-domain equivalent in Eq. 9.
$$ f_{x} (a,\tau ) = \frac{1}{\sqrt a }\int_{ - \infty }^{\infty } {x(t)} \varphi^{*} \left( {\frac{t - \tau }{a}} \right)dt, $$
(8)
$$ f_{x} (a,\tau ) = \frac{\sqrt a }{{2\pi }}\int_{ - \infty }^{\infty } {X(\omega )} \Psi^{*} \left( {a\omega } \right)e^{j\omega \tau } d\omega . $$
(9)
The success of the WT implementation depends on the selected mother wavelet. In hydrological time-series modelling, the Daubechies (DAUB-N) wavelets have proven to be more effective due to their orthonormality properties and the balance between information quality conservation and abundance [37, 38, 46]. The DWT was used in this study to decompose the rainfall and runoff data using the Daubechies level 4 (DB4) wavelet, since DB4 is able to minimize the noise but does not oversmooth the signal [37]. More details on the development and implementation of the DWT are presented in our previous studies [25, 36, 47].
To accomplish the characterization and detection of localized phenomena of non-stationary time-series data, the first step is the decomposition of the measured discharge {Dd1(t), Dd2(t), …, Ddi(t), Da(t)} and rainfall {Rd1(t), Rd2(t), …, Rdi(t), Ra(t)} into multi-frequency data, where Ddi(t) and Rdi(t) are the resulting DWT details and Dai(t) and Rai(t) represent the approximations of the time-series discharge (D) and rainfall (R). The index i denotes the ith decomposition level of the data.
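A minimal sketch of this level-4 DB4 decomposition is given below, assuming the PyWavelets package is available; the synthetic runoff series and the reconstruction of each sub-series to the original length (so the approximation and details can be stacked as model inputs) are illustrative choices rather than the study's exact pre-processing.

```python
import numpy as np
import pywt  # PyWavelets (assumed available)

# Hypothetical 30-year monthly runoff series (360 values)
runoff = np.random.default_rng(2).gamma(shape=2.0, scale=40.0, size=360)

# Level-4 discrete wavelet decomposition with the Daubechies-4 (db4) wavelet:
# returns the level-4 approximation Da(t) and the details Dd4 ... Dd1
coeffs = pywt.wavedec(runoff, 'db4', level=4)
approx, details = coeffs[0], coeffs[1:]

# Reconstruct each sub-series back to the original length so that the
# approximation and details can be used as WNN/MLP input features
approx_ts = pywt.upcoef('a', approx, 'db4', level=4, take=len(runoff))
detail_ts = [pywt.upcoef('d', d, 'db4', level=4 - i, take=len(runoff))
             for i, d in enumerate(details)]
print(approx_ts.shape, len(detail_ts))   # (360,) and 4 detail sub-series
```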
WT-neural network (WNN) model
Inheriting the properties of wavelets and ANN, the topology of the WNN is based on a feedforward backpropagation (multilayer perceptron) network, with the mother wavelet acting as the hidden-layer transfer function. The WNN network topology is shown in Fig. 5a and the implementation strategy in Fig. 5b. The feedforward multilayer perceptron with an input layer, hidden layers and an output layer was adopted for the rainfall and runoff signals decomposed by the wavelet transform into approximation [Dai(t) and Rai(t)] and detail [Ddi(t) and Rdi(t)] coefficients.
Before selecting DB4, decomposition levels 1–10 were compared by trial-and-error, and level 4 was adopted on the basis of the size of the validation data [53–55]. From the wavelet decomposition, several sub-series from the original data were obtained as input variables to the feedforward MLP (Fig. 5a). Figure 5c, d shows sample inputs following DB4 decomposition of the original maximum temperature and rainfall for station 8,934,023. Sample station 8,934,023 is chosen as it represents the middle elevation of the catchment area. The input includes the level 4 approximation of the original signal and the 4-level details (D1–D4).
From the input vector, hidden layer output is determined as in Eq. 10:
$$ h\left( j \right) = h_{j} \left[ {\frac{{\sum\nolimits_{i = 1}^{n} {\omega_{ij} x_{i} - b_{j} } }}{{a_{j} }}} \right], $$
(10)
where \(h\left( j \right)\) = output of hidden node \(j\); \(j\) = index of the hidden layer nodes; \(h_{j}\) = mother wavelet; \(\omega_{ij}\) = weight connecting the input and hidden layers; \(b_{j}\) = translation factor, and \(a_{j}\) = scaling factor for \(h_{j}\).
Adopting the DB4 [25, 36], the output layer is determined as in Eq. 11:
$$ y\left( k \right) = \sum\nolimits_{i = 1}^{k} {\omega_{jk} } \times h_{j} \left[ {\frac{{\sum\nolimits_{i = 1}^{n} {\omega_{ij} x_{i} - b_{j} } }}{{a_{j} }}} \right] = \sum\nolimits_{i = 1}^{k} {\omega_{jk} } h\left( j \right). $$
(11)
The WNN weights and the wavelet function parameters are updated in the following steps:
Step 1: WNN prediction error \(E\left( W \right)\) computation:
$$ E\left( W \right) = e = \sum\limits_{1}^{k} {\left[ {y_{t} \left( k \right) - y\left( k \right)} \right]^{2} } , $$
(12)
where \(y(k)\) = prediction output value and \(y_{t} \left( k \right)\) = target output.
Step 2: WNN weights update and variation of wavelet according to the prediction \(e\):
$$ \begin{gathered} \omega_{n \cdot k}^{{\left( {i + 1} \right)}} = \omega_{n \cdot k}^{\left( i \right)} + \Delta \omega_{n \cdot k}^{{\left( {i + 1} \right)}} \hfill \\ a_{k}^{{\left( {i + 1} \right)}} = a_{k}^{\left( i \right)} + \Delta a_{k}^{{\left( {i + 1} \right)}} \hfill \\ b_{k}^{{\left( {i + 1} \right)}} = b_{k}^{\left( i \right)} + \Delta b_{k}^{{\left( {i + 1} \right)}} \hfill \\ \end{gathered} $$
(13)
where \(\Delta \omega_{n.k}^{{\left( {i + 1} \right)}}\), \(\Delta a_{k}^{{\left( {i + 1} \right)}}\) and \(\Delta b_{k}^{{\left( {i + 1} \right)}}\) are calculated by prediction error of the network:
$$ \begin{gathered} \Delta \omega_{n \cdot k}^{{\left( {i + 1} \right)}} = - \eta \frac{\partial e}{{\partial \omega_{n,k}^{\left( i \right)} }} \hfill \\ \Delta a_{k}^{{\left( {i + 1} \right)}} = - \eta \frac{\partial e}{{\partial a_{k}^{\left( i \right)} }} \hfill \\ \Delta b_{k}^{{\left( {i + 1} \right)}} = - \eta \frac{\partial e}{{\partial b_{k}^{\left( i \right)} }}, \hfill \\ \end{gathered} $$
(14)
with \(\eta\) as the network learning rate.
The training of the network comprises the following steps [48] (a minimal sketch of the forward pass and weight update is given after this list):
1. Data pre-processing: division of the normalized data into training (70%), testing (15%) and validation (15%) datasets. The validation data are part of training the model and updating the parameters; after each training epoch, they are used to validate the model and update its parameters.
2. Network initialization: random initialization of the weights, translation and scale factors, and the learning rate {\(\omega_{ij}\), \(\omega_{jk}\), \(b_{k}\), \(a_{k}\), \(\eta\)}.
3. Network training: training, prediction and estimation of the prediction error \(e\) between the output and the expected value.
4. Weights updating: update of the parameters and network weights depending on the magnitude of \(e\).
5. Network testing: use of the test dataset for network reliability testing; otherwise iterate from Step 3.
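A minimal sketch of the forward pass in Eqs. 10–11 and of a single output-weight update in the spirit of Eqs. 12–14 is given below; for a closed-form activation the Morlet wavelet is used in place of DB4, and the layer sizes, random weights, target value and learning rate are illustrative assumptions only.

```python
import numpy as np

def morlet(z):
    """Closed-form mother wavelet, used here only for illustration
    (the study's DB4 wavelet has no simple closed form)."""
    return np.cos(1.75 * z) * np.exp(-0.5 * z**2)

def wnn_forward(x, W_in, a, b, W_out):
    """Eq. (10): hidden node j computes h_j = psi((sum_i w_ij x_i - b_j) / a_j);
    Eq. (11): the output is a weighted sum of the hidden-node responses."""
    z = (W_in @ x - b) / a          # translation b_j and scale a_j per hidden node
    h = morlet(z)
    return W_out @ h, h

# Hypothetical network: 5 decomposed input sub-series, 30 hidden wavelet nodes, 1 output
rng = np.random.default_rng(3)
n_in, n_hid = 5, 30
W_in, W_out = rng.normal(0, 0.5, (n_hid, n_in)), rng.normal(0, 0.5, n_hid)
a, b = np.ones(n_hid), np.zeros(n_hid)

x, target = rng.random(n_in), 0.7
y, h = wnn_forward(x, W_in, a, b, W_out)

# One gradient-descent step on the output weights (output-layer part of Eqs. 12-14)
eta = 0.01
e = y - target
W_out -= eta * e * h                # gradient of the squared error (factor 2 absorbed into eta)
print(float(y), float(e))
```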

Input data quantification and normalization

The [0, 1] normalization based on the min–max predefined boundary method (Eq. 15) was used to linearly transform the original data and to maintain the inherent relationships within the respective datasets [56]. After the test output, denormalization is carried out to be able to relate the prediction output with the observed data.
$$ f:x \to y\left( {\frac{{x - x_{{{\text{min}}}} }}{{x_{{{\text{max}}}} - x_{{{\text{min}}}} }}} \right), $$
(15)
where \(x,y \in R^{n}\), \(x_{{{\text{min}}}} = {\text{min}}\left( x \right)\), \(x_{{{\text{max}}}} = {\text{max}}\left( x \right)\) and \(x\) = input data. The values are converted into the range [0, 1] through the normalization, that is, \(y_{i} \in \left[ {0,{ 1}} \right]\), \(i\) = 1, 2, ..., \(n\).
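A minimal sketch of the min–max mapping in Eq. 15 and its inverse (the denormalization step) is given below; the sample rainfall values are hypothetical.

```python
import numpy as np

def minmax_fit(x):
    """Store the min/max of the training data so the same mapping (Eq. 15)
    can be applied to unseen data and inverted after prediction."""
    return float(np.min(x)), float(np.max(x))

def minmax_transform(x, x_min, x_max):
    return (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)

def minmax_inverse(y, x_min, x_max):
    """Denormalize predictions back to physical units (mm or m3/s)."""
    return np.asarray(y, dtype=float) * (x_max - x_min) + x_min

# Example with hypothetical monthly rainfall totals (mm)
rain = np.array([0.0, 45.2, 110.4, 303.0, 176.8])
lo, hi = minmax_fit(rain)
scaled = minmax_transform(rain, lo, hi)
print(scaled.min(), scaled.max())            # 0.0 1.0
print(minmax_inverse(scaled, lo, hi)[3])     # 303.0
```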

Metrics for model performance evaluation

To compare and evaluate the models, the following statistical measures were used: \(R^{2}\), RMSE and MAE (Eqs. 16–18):
$$ R^{2} = \frac{{\left( {\sum\nolimits_{i = 1}^{n} {\left( {P_{i} - \overline{P}} \right)\left( {P^{\prime}_{i} - \overline{P}^{\prime}} \right)} } \right)^{2} }}{{\sum\nolimits_{i = 1}^{n} {\left( {P_{i} - \overline{P}} \right)^{2} \sum\nolimits_{i = 1}^{n} {\left( {P^{\prime}_{i} - \overline{P}^{\prime}} \right)^{2} } } }}, $$
(16)
$$ {\text{RMSE}} = \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left( {P - P^{\prime}_{i} } \right)^{2} } }}{n}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {e_{i}^{2} } } , $$
(17)
$$ {\text{MAE}} = \frac{{\sum\nolimits_{i = 1}^{n} {\left| {P_{i} - P^{\prime}_{i} } \right|} }}{n} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {e_{i} } \right|} , $$
(18)
where \(P_{i}\) = observed data; \(P_{i}^{\prime }\) = simulated data; \(\overline{P}\) = mean observed data; \(\overline{P}^{\prime}\) = mean simulated data, and \(e\) = model errors.
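The three measures can be computed directly from the observed and simulated series, as in the sketch below; the sample runoff values are hypothetical.

```python
import numpy as np

def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated series (Eq. 16)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    num = np.sum((obs - obs.mean()) * (sim - sim.mean())) ** 2
    den = np.sum((obs - obs.mean()) ** 2) * np.sum((sim - sim.mean()) ** 2)
    return num / den

def rmse(obs, sim):
    """Root mean square error (Eq. 17)."""
    return float(np.sqrt(np.mean((np.asarray(obs, float) - np.asarray(sim, float)) ** 2)))

def mae(obs, sim):
    """Mean absolute error (Eq. 18)."""
    return float(np.mean(np.abs(np.asarray(obs, float) - np.asarray(sim, float))))

# Example with hypothetical monthly runoff (m3/s)
obs = [6.1, 82.2, 214.7, 59.7, 192.1]
sim = [7.0, 75.0, 220.5, 64.2, 180.3]
print(r_squared(obs, sim), rmse(obs, sim), mae(obs, sim))
```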

Results and discussions

Neural network optimum architecture for rainfall and runoff prediction

The construction of the LSTM and WNN model architectures comprises the creation of the topology of the deep learning network, which is significantly determined by the hidden layer neurons and the selection of the optimal training parameters. Through trial-and-error input parameter combinations, the output performance of the hidden layers was determined using R2, MAE and RMSE (Table 2). To determine the optimal architecture in rainfall and runoff predictions, the hidden layers were varied from 1 to 5, with 30 neurons in each layer and for each sub-basin. The training and validation results from the LSTM and WNN models are summarized in Table 2, and are based on all the five input parameters in the entire basin.
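The paper does not state the software framework used. As one possible realization of the optimal topology, the hedged sketch below stacks four LSTM layers of 30 units each in Keras; the 12-month input window over the five meteorological parameters, the optimizer and the batch size are assumptions for illustration only.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

n_steps, n_features = 12, 5     # 12 monthly lags of the 5 meteorological inputs (assumed)

model = models.Sequential([
    layers.Input(shape=(n_steps, n_features)),
    layers.LSTM(30, return_sequences=True),   # hidden layer 1
    layers.LSTM(30, return_sequences=True),   # hidden layer 2
    layers.LSTM(30, return_sequences=True),   # hidden layer 3
    layers.LSTM(30),                          # hidden layer 4
    layers.Dense(1)                           # predicted runoff (or rainfall)
])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.summary()
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=50, batch_size=16)
```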
Table 2
Performance statistics of LSTM and WNN hidden layer architectures for rainfall and runoff simulations
ANN model  Hidden layers   Rainfall                           Runoff
                           R2      MAE (mm)  RMSE (mm)        R2      MAE (m3/s)  RMSE (m3/s)
LSTM       1               0.6448  13.3589   29.1334          0.6321  18.2446     15.7903
           2               0.7561  11.4248   26.3519          0.7468  12.0772     15.2249
           3               0.8534   8.3901   25.0677          0.8900  10.5237     14.5134
           4               0.8610  13.5769   20.5492          0.9249  10.2213     13.0459
           5               0.8577  13.9505   20.5833          0.8981  11.7404     13.0520
WNN        1               0.7219  33.4401   27.6993          0.8209  12.5991     16.4821
           2               0.7404  21.0580   26.4530          0.8243  13.1103     17.5491
           3               0.7187  28.2397   26.5166          0.8561  11.4854     17.1507
           4               0.7825  24.4877   25.4206          0.8579  12.3735     15.2128
           5               0.7812  25.9110   26.0488          0.8447  12.9546     15.2312
The prediction results in Table 2, evaluated using R2, show that the performance of the LSTM and WNN models improved with an increase in the number of hidden layers and corresponding neurons. Notably, after the fourth hidden layer, a consistent decrease in prediction performance was observed for both models. The LSTM prediction of rainfall as measured with R2 increased from 0.6448 with 1 hidden layer of 30 neurons to a maximum of 0.8610 with a topology of 4 hidden layers with 30 neurons each. At one hidden layer, WNN marginally outperformed LSTM by about 6%; however, with four hidden layers, LSTM performed better than WNN by 8% as measured in terms of R2. Similar patterns in the results were also observed for runoff prediction with the tested network topological structures in Table 2. The MAE and RMSE performance metrics showed that, with an increase in neurons, WNN tended to marginally minimize the prediction errors in comparison with the LSTM neural network model. It is considered that both WNN and LSTM are capable of predicting the rainfall and runoff trends; however, with the deep learning structure, LSTM marginally outperformed WNN.
Notably in Table 2, the MAE for rainfall prediction using LSTM was lowest with three hidden layers, while the opposite observation is obtained when using the RMSE evaluation and when using WNN with two hidden layers. Similarly, for runoff predictions using LSTM with four hidden layers and WNN with three hidden layers, inverse variabilities in MAE and RMSE with an increase in hidden layers are observed. These observations could be attributed in part to the fact that, for the same data, differences between the observed RMSE and MAE may arise when the error distributions in the data are biased or non-Gaussian. Further, while the MAE gives the same weight to all errors, the RMSE penalizes variance as it gives errors with larger absolute values more weight than errors with smaller absolute values. This could be the cause of the varied prediction results in Table 2 for rainfall and runoff using LSTM and WNN.
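A small numeric illustration of this effect (with hypothetical error values, not the study's residuals): the two error vectors below have the same MAE, but the one containing a single large error has a markedly higher RMSE.

```python
import numpy as np

# Two error vectors with the same MAE but different RMSE: the second contains one
# large outlier, which RMSE penalizes more heavily than MAE.
e_uniform = np.array([4.0, 4.0, 4.0, 4.0])      # evenly spread errors
e_outlier = np.array([1.0, 1.0, 1.0, 13.0])     # same mean absolute error

for e in (e_uniform, e_outlier):
    print(np.mean(np.abs(e)), np.sqrt(np.mean(e**2)))
# -> MAE = 4.0 for both; RMSE = 4.0 versus about 6.6
```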
To further assess the difference in performance between LSTM and WNN in rainfall and runoff trend prediction in terms of computing time, the variation of the RMSE (as the standard deviation of the prediction residuals) with the number of epochs is presented in Fig. 6. In the prediction of rainfall, the RMSE converged to a minimum of 14.55 mm at the 31st epoch for LSTM and to 15.17 mm at the 37th epoch using WNN (Fig. 6a). Further training of the networks towards loss minimization and possibly higher accuracy resulted in an increase in RMSE, which reached a saturation point after 43 epochs with WNN.
In the prediction of runoff, WNN is observed to consistently overestimate the mean runoff within the basin, with a minimum RMSE of 15.17 m3/s at the 34th epoch (Fig. 6b). Using LSTM, the minimum RMSE of approximately 13.12 m3/s was achieved between the 25th and 35th epochs. The results further confirm that both models are suitable for the prediction of rainfall and runoff trends within reasonable computing time.
Figure 6c, d shows the accuracy performance of the LSTM and WNN models calculated over 50 iteration epochs using the training and validation datasets for the prediction of rainfall and runoff for the entire basin. In general, the model accuracies on the training and validation datasets increase after each iteration with fluctuations, which could be attributed to some randomness in the network. As the model trains during the first pass through the data, both the training and validation accuracies increase, indicating that the model is learning the structure of the rainfall and runoff prediction data as well as the temporal correlations of the time-series. In the first and consecutive iterations, the validation accuracy did not increase significantly and was always higher than the training accuracy. This indicates that the network did not overfit the training data and generalized accurately to the unseen validation data. The optimal accuracy was obtained between 30 and 40 epochs, with LSTM being higher than WNN in predicting rainfall and runoff by approximately 10% and 5%, respectively. The prediction of runoff was consistently recorded with a higher R2 accuracy as compared to rainfall using both WNN and LSTM (Fig. 6c, d).

Runoff prediction with LSTM and WNN

Runoff prediction results using the LSTM and WNN models

Adopting the optimal four hidden-layer configuration, the runoff prediction results at the three discharge stations (1BC01, 1DA02 and 1EF01) using the LSTM model are presented in Fig. 7. Similarly, the runoff prediction results using WNN for the three discharge measurement stations are shown in Fig. 8. For both models, the input consisted of the mean monthly satellite-based meteorological datasets.
A statistical comparison of the runoff prediction results from LSTM and WNN is presented in Table 3 in terms of R2, MAE and RMSE. Except for station 1BC01, where WNN predicted the runoff with an R2 of 0.7820, both models predicted the runoff at the three stations with R2 greater than 0.80 using the five input parameters (Fig. 9). This is evidenced in the 30-year prediction accuracy, with MAE and RMSE of less than 13 m3/s for both models. Overall, for the entire basin, it is observed that LSTM marginally outperformed the WNN model, with MAE = 11.1452 m3/s and RMSE = 12.1933 m3/s at the basin outlet 1EF01. For practical applications, it can be concluded that both models can be used in the prediction of runoff in data-scarce basins.
Table 3
Performance evaluation for runoff prediction with LSTM and WNN using the meteorological parameters
Station ID  LSTM model                            WNN model
            R2      MAE (m3/s)  RMSE (m3/s)        R2      MAE (m3/s)  RMSE (m3/s)
1BC01       0.8017  12.0890     12.1450            0.7820  12.6490     12.8793
1DA02       0.8330  12.0971     13.1318            0.8812  12.5637     12.7465
1EF01       0.8967  11.1452     12.1933            0.8053  12.4917     12.3040
A comparison of the goodness-of-fit for the prediction of streamflow runoff with the two models, as presented in Fig. 9, shows that for the three stations the use of satellite data to predict streamflow is acceptable, as the R values were more than 85%. The accuracy of streamflow prediction is observed to increase with an increase in the number of prediction stations within the sub-basins.

Performance of individual meteorological factors in runoff prediction

To determine the significance and accuracy of the contributions of the satellite-based meteorological parameters in the prediction of runoff, each of the five parameters was used as an independent input with runoff as the output. The comparative output results are presented in Fig. 10, representing the runoff predictions for the three discharge stations.
The results in Fig. 10 show that rainfall is the highest contributing indicator variable in runoff prediction, with R2 > 0.8 for the three discharge stations. This confirms the fact that the amount of rainfall that remains after storage, infiltration, interception, evaporation and transpiration contributes to runoff. The least contributing meteorological factor is relative humidity, with R2 ranging between 0.6 and 0.65 for the three discharge stations using LSTM and WNN. The remaining parameters, namely average temperature, wind speed and solar radiation, estimated the runoff in the three stations with R2 ranging between 0.675 and 0.80, with temperature performing better than wind speed and solar radiation. Conclusively, as the inputs increase from rainfall alone to all five datasets and the hidden layers increase from 1 to 4, the accuracy of runoff prediction is observed to increase by up to 10% for model training, testing and validation. Further investigations explaining the basis of the individual predictions, by comparing the contribution of each feature to each prediction using a unified approach such as SHapley Additive exPlanations (SHAP) [57], are recommended.

Rainfall prediction using LSTM and WNN

Performance of individual meteorological parameters for rainfall prediction

In evaluating the significance of the satellite-based meteorological parameters in rainfall trend prediction, Table 4 presents the performance results of the basin mean monthly rainfall prediction using the different meteorological parameters. Satellite-based precipitation is observed to be the most significant predictor in estimating the measured rainfall, with R2 > 0.8 and the lowest MAE and RMSE error measures. This is attributed to the fact that for medium-sized and climatically homogeneous basins like the Nzoia Basin, the climate factors tend to be replicated throughout the catchment area with minimal variabilities. As such, the occurrence of rainfall at one station is generally an indication of rainfall also being recorded at a nearby station within the basin.
Table 4
Performance of individual meteorological parameters for rainfall prediction
ANN model  Meteorological input parameter  Predicted rainfall
                                           R2      MAE (mm)  RMSE (mm)
LSTM       Rainfall                        0.7824   9.0264   17.2264
           Average temperature             0.6938  10.9790   20.1791
           Relative humidity               0.6729  10.1826   20.2338
           Wind speed                      0.6551   9.2838   18.2254
           Solar radiation                 0.6604  11.2005   21.2700
WNN        Rainfall                        0.7013  11.9301   14.6411
           Average temperature             0.6490  10.0717   15.3029
           Relative humidity               0.6136  12.4593   15.6574
           Wind speed                      0.6050  11.8406   14.9878
           Solar radiation                 0.6038  12.4955   16.1606
Temperature is the second best predictor for rainfall prediction, and the effect of temperature on rainfall arises from the fact that increased temperature leads to increased evaporation, an accelerated rate of the hydrological cycle and more precipitation especially during the wet season. Humidity, wind speed and solar radiation are consecutively ranked as in Table 4 with nearly the same contribution effects on rainfall prediction, implying that they are highly correlated within the basin in terms of their contribution in rainfall prediction. This is also attributed to the size of the basin and the fact that the climate factors are nearly similar within the basin.

Rainfall prediction with combined satellite-based meteorological data

Results of the mean monthly predicted rainfall for the four stations using the five meteorological datasets are presented in Figs. 11 and 12 for LSTM and WNN, respectively. The results are samples of distributed gauge stations representing high (station 8,835,034), mid (stations 8,934,071 and 8,934,023) and low (station 8,934,059) elevation areas.
The coefficients of determination for the four representative stations 8,835,034, 8,934,071, 8,934,023 and 8,934,059 are presented in Fig. 13. The comparative performance between the two models shows that LSTM predicted the basin mean rainfall with a higher R2 = 0.8610 for the ten stations, while WNN's prediction was at R2 = 0.7825. The higher accuracy prediction results at individual rainfall stations could be attributed to continuous and accurate gauge data.
In addition to the statistical performance evaluations, graphical comparisons of the observed and modelled runoff and rainfall with LSTM and WNN in Figs. 7, 8 and 11, 12, respectively, show that the LSTM results matched the observed data more closely in spatial position and trend than the WNN results. Further, the regression lines for runoff and rainfall in Figs. 9 and 13, respectively, show that LSTM modelled the parameters closer to the 45° line of fit as compared to WNN. The slightly lower performance from the WNN results could be attributed to the feedforward ANN used in training on the input signals.
The study results on the prediction of rainfall and runoff trends show that LSTM and wavelet-based neural networks are able to overcome the timescale conversion problems in time-series data analysis for accurate forecasting [8, 9], as they are capable of capturing the quasi-periodic signals in long-term rainfall and runoff data, which are also characterized by cyclical fluctuations with inherent noise [10–12]. LSTM and WNN are considered to be superior to the conventional ANN models since they appear to conserve the crucial information on the input data sequence order because of the deep learning process in the hidden layers [13, 21–23, 45].
According to [24], WNN as a data-driven model is able to take into account the non-stationarity in the assimilation of rainfall and runoff time-series data, as it accounts for sequence dependency, periodicity and non-linearity in such data [24, 25, 39–41]. LSTM, on the other hand, can dynamically incorporate past learning experience due to its internal recurrence, thus presenting a powerful internal state for accurate learning and prediction in data with long delays and mixed frequencies [23, 26, 42, 43].
The distribution of the measured and predicted rainfall from the results in Figs. 11 and 12 for the year 1999 was spatially interpolated using ordinary Kriging [33], and the results are presented in Fig. 14. The year 1999 is chosen because it had the most continuous measured precipitation in all the gauge stations within the basin, and is thus suitable for comparative analysis. Figure 14a presents the observed mean monthly precipitation in 1999, and the results from LSTM and WNN are presented in Fig. 14b, c, respectively. It is observed that LSTM has the ability to accurately infer the long-term patterns in the 30-year rainfall data in most parts of the basin. As compared to LSTM, WNN tended to overestimate the higher precipitation values and underestimate the lower precipitation values. The results in Fig. 14 also illustrate that, despite the good statistical evaluation results, the spatial representation of the phenomenon gives a more insightful area-based comparison of the results.
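As one possible realization of this interpolation step, the sketch below uses ordinary Kriging from the PyKrige package; the gauge coordinates, rainfall values, variogram model and grid resolution are hypothetical and are not the study's data or settings.

```python
import numpy as np
from pykrige.ok import OrdinaryKriging   # assumes the PyKrige package is installed

# Hypothetical gauge locations (lon, lat) and mean monthly rainfall for one year (mm)
lon = np.array([34.2, 34.5, 34.8, 35.1, 35.4])
lat = np.array([0.2, 0.5, 0.9, 1.1, 0.7])
rain = np.array([120.3, 98.6, 143.1, 155.0, 110.7])

# Interpolation grid covering the basin extent (illustrative resolution)
grid_lon = np.arange(34.0, 35.8, 0.05)
grid_lat = np.arange(0.0, 1.5, 0.05)

ok = OrdinaryKriging(lon, lat, rain, variogram_model='spherical')
z, variance = ok.execute('grid', grid_lon, grid_lat)   # kriged surface + kriging variance
print(z.shape)
```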
In the prediction of rainfall and runoff in hydrologic basins with scarce data, the LSTM model performed marginally better than the wavelet-based neural network model. Both models displayed the capability to learn the inherent temporal dynamics in the time-series data, and also to capture the seasonality in the quasi-periodic rainfall and runoff data. The results show that, when optimized effectively, LSTM and WNN can resolve the non-stationarity and non-linearity problems associated with trend analysis of rainfall and runoff data.

Conclusions

This study presented a performance evaluation of a recurrent neural network implementation based on the LSTM architecture and a wavelet neural network for rainfall and runoff trend analysis in a sparsely gauged river basin. With 30 years (1980–2009) of measured rainfall and streamflow data, and satellite-based meteorological data comprising precipitation, average temperature, relative humidity, wind speed and solar radiation from 10 satellite stations, the two neural network models were compared for rainfall and runoff prediction in the Nzoia hydrologic river basin in Kenya.
With the same optimal neural network topological structure of 4 hidden layers, each consisting of 30 neurons, both the LSTM and WNN models predicted runoff with an average R2 value of 0.8 for all the 3 stations, except station 1BC01 using WNN. The RMSE and MAE metrics from both models in runoff prediction were less than 13 m3/s for the 30-year study period. The evaluation of the significance of each meteorological parameter in the prediction of runoff showed rainfall as the most significant input parameter, followed by temperature, and solar radiation as the least contributing factor. The best results were obtained by including all the parameters in the prediction model. In the forecasting of rainfall, LSTM gave the best predictive results with R2 = 0.8610 for the average monthly basin rainfall from the ten stations, with satellite-based precipitation being the best rainfall predictor. WNN estimated the mean basin rainfall with R2 = 0.7825 using the five satellite datasets. At the sub-basin scale, it was observed that the performance of the models improved with an increase in the number of input parameters and the number of data stations.
The study results show that for catchments with scarce and low-quality hydrological and meteorological data observations, the use of satellite data in optimized LSTM and wavelet neural network models can be relied upon for the prediction of rainfall and runoff trends. It is recommended that similar studies be carried out with the inclusion of basin physical characteristics such as elevation, slope and flow accumulation as training inputs to determine the significance of the physical watershed characteristics in rainfall and runoff predictions.

Acknowledgements

The Lake Victoria North Water Services Board (LVNWSB) is highly acknowledged for providing the in-situ data sets used in this study.

Declarations

Conflict of interest

The authors declare no competing interests in the work reported in this paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References
5.
15. Somvanshi VK, Pandey OP, Agrawal PK, Kalanker NV, Prakash MR, Ramesh C (2006) Modelling and prediction of rainfall using artificial neural network and ARIMA techniques. J Indian Geophys Union 10:141–151
16. Liu X, Zhang A, Shi C, Wang H (2009) Filtering and multi-scale RBF prediction model of rainfall based on EMD method. In: Proceedings of the 2009 1st International Conference on Information Science and Engineering (ICISE), Nanjing, China, December 26–28, pp 3785–3788. https://doi.org/10.1109/ICISE.2009.592
20. Nelson M, Hill T, Remus T, O'Connor M (1999) Time-series forecasting using neural networks: should the data be deseasonalized first? J Forecast 18:359–367
22. Cui Z, Ke R, Wang Y (2018) Deep bidirectional and unidirectional LSTM recurrent neural network for network-wide traffic speed prediction. arXiv:1801.02143
28. Rumelhart DE, Hinton GE, Williams RJ (1985) Learning internal representations by error propagation. California University San Diego La Jolla Inst for Cognitive Science, No. ICS-8506
29. Carriere P, Mohaghegh S, Gaskar R (1996) Performance of a virtual runoff hydrographic system. Water Resourc Plann Manag 122:120–125
31. Hsu K, Gupta HV, Soroochian S (1997) Application of a recurrent neural network to rainfall-runoff modelling. In: Proc. Aesthetics in the Constructed Environment, ASCE, New York, pp 68–73
34. Chang FJ, Lo YC, Chen PA, Chang LC, Shieh MC (2015) Multi-step-ahead reservoir inflow forecasting by artificial intelligence techniques. Springer International Publishing, Cham, pp 235–249
40. Shabri A (2015) A hybrid model for stream flow forecasting using wavelet and least squares support vector machines. Jurnal Teknologi 73:89–96
48.
51. Komasi M (2007) Modelling rainfall-runoff model using a combination of wavelet-ANN. Tabriz University, Tabriz
58. Lundberg SM et al (2020) From local explanations to global understanding with explainable AI for trees. Nat Mach Intell 2:2522–5839
Metadata
Title
Rainfall and runoff time-series trend analysis using LSTM recurrent neural network and wavelet neural network with satellite-based meteorological data: case study of Nzoia hydrologic basin
Authors
Yashon O. Ouma
Rodrick Cheruyot
Alice N. Wachera
Publication date
15.04.2021
Publisher
Springer International Publishing
Published in
Complex & Intelligent Systems / Issue 1/2022
Print ISSN: 2199-4536
Electronic ISSN: 2198-6053
DOI
https://doi.org/10.1007/s40747-021-00365-2
