Multiobjective analysis of chaotic dynamic systems with sparse learning machines
Introduction
Chaotic systems are nonlinear, dynamic, fully deterministic, highly sensitive to initial conditions, and can be modeled using state-space reconstruction via the time-delay embedding theorem [40]. Though Poincaré described chaotic behavior in nonlinear systems in the late 1880s, a resurgence in chaos theory arose with Lorenz [50], who, in his work with weather prediction models, discovered that nonlinear models could be exponentially divergent (i.e., sensitive to small differences in initial conditions) [71], [70], [68], [24]. Thus, unlike in many other processes, the erratic data produced by chaos arise from the complex outcome of a nonlinear system and from initial conditions identified from uncertain data, rather than from intrinsic randomness alone. The behaviors of many water resources systems have been observed to be chaotic, and chaos has therefore received significant attention in hydrology.
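The sensitivity to initial conditions described above can be illustrated with a minimal numerical sketch. The example below uses the textbook logistic map at r = 4 (a standard chaotic example, not a model from this paper): two orbits started 1e-9 apart diverge to order-one differences within a few dozen iterations.

```python
def logistic_orbit(x0, r=4.0, steps=40):
    """Iterate the logistic map x -> r * x * (1 - x), which is chaotic at r = 4."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_orbit(0.2)
b = logistic_orbit(0.2 + 1e-9)          # perturb the initial condition by 1e-9
gap = [abs(u - v) for u, v in zip(a, b)]  # separation grows roughly exponentially
```

The gap roughly doubles per iteration (the map's Lyapunov exponent is ln 2), so a perturbation of 1e-9 saturates the attractor's size within about 30 steps.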
Chaos theory states that the time series itself carries enough information about the behavior of the system to carry out forecasting [50]; a deterministic chaotic system therefore behaves in the future in a manner similar to its past. The embedding theorem, which emerged in the light of chaos theory, states that given a suitable state-space representation of a chaotic time series, obtained by estimating the time delay and the embedding dimension (i.e., state-space reconstruction), full knowledge of the system behavior is guaranteed [72]. Nonetheless, to avoid the biases of many state-space reconstruction techniques, the time series must be sampled at sufficient resolution, must not be corrupted by noise, and must be measured over a long period of time [59]. In addition, the state evolution of a chaotic system is dynamic and constitutes an inverse problem for which there is no unique solution, and possibly no stable solution either.
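The state-space reconstruction step itself can be sketched in a few lines: stack delayed copies of the scalar series into vectors [x(t), x(t + tau), ..., x(t + (dim - 1) tau)]. The series, delay, and dimension below are illustrative placeholders, not values estimated in the paper.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Time-delay embedding of a scalar series x: each row is one
    reconstructed state vector of length dim with lag tau."""
    n = len(x) - (dim - 1) * tau          # number of complete delay vectors
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

# Illustrative: embed a sine series with dim = 3, tau = 5
t = np.linspace(0, 20 * np.pi, 2000)
x = np.sin(t)
X = delay_embed(x, dim=3, tau=5)          # shape (1990, 3)
```

In practice tau is chosen from, e.g., the first minimum of the mutual information, and dim from a false-nearest-neighbors criterion; here they are fixed by hand for brevity.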
Capturing the behavior of a chaotic time series becomes more complicated in the presence of noise (i.e., background noise, or inaccuracy in the measurements of system behavior) [67]. The process of measuring system states with physical sensors, together with the lack or neglect of exogenous stresses, introduces some amount of noise [25]. This noise causes uncertainty in the model structure and, accordingly, in predictions of the future performance of the system. Contamination with noise is almost inherent in any hydrological time series. This, in essence, runs counter to many widely used methodologies based on theories that assume infinite, noise-free time series [67]. Moreover, the structure of hydrological processes exhibits temporal, spatial, and scale variability. A failure to account for the underlying system structure limits the ability of modeling approaches to identify a unique mathematical representation of the hydrological processes [67]. This impedes the ability of both traditional state-space forecasting methodologies and learning machines to predict future system behaviors with confidence. In light of these modeling issues, a principal objective of this paper is to quantify the amount of uncertainty introduced into the analysis of complex hydrological processes by the specification of model structure.
From a pragmatic engineering point of view, state-space reconstruction techniques share a shortcoming: their prediction accuracy is often inadequate because the state-space parameters are derived not to minimize the prediction error but to characterize the nonlinear dynamic process in question [12], [85], [59].
In this sense, the other objective of this paper is to link the powerful state-space reconstruction methodology with the appealing regularization concepts of both support vector machines (SVMs) and relevance vector machines (RVMs) within a multiobjective optimization framework. The parameters of chaos theory and the unintuitive parameters of the learning machines will be optimized with the assistance of the Multiobjective Shuffled Complex Evolution Metropolis (MOSCEM) algorithm. The chosen objective functions will be optimized both independently and simultaneously. This will yield multiple feasible solutions that account for trade-offs (e.g., the bias-variance trade-off; trade-offs among seepage-, precipitation-, and evaporation-induced signals) and, moreover, capture the uncertainty in the model structure.
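The idea of retaining multiple feasible solutions along a trade-off surface can be illustrated with a simple nondominated (Pareto-front) filter over candidate parameter sets. This sketch is not the MOSCEM algorithm; it only shows how two competing objective values yield a set of nondominated solutions rather than a single optimum. The objective values below are synthetic.

```python
import numpy as np

def pareto_front(costs):
    """Boolean mask of nondominated rows, both objectives minimized.
    A point is dominated if some other point is <= in every objective
    and strictly < in at least one."""
    n = costs.shape[0]
    nondominated = np.ones(n, dtype=bool)
    for i in range(n):
        if nondominated[i]:
            dominates_i = (np.all(costs <= costs[i], axis=1)
                           & np.any(costs < costs[i], axis=1))
            if dominates_i.any():
                nondominated[i] = False
    return nondominated

# Synthetic candidates: columns could stand for, e.g., bias and variance
costs = np.array([[1.0, 2.0], [2.0, 1.0], [2.0, 2.0], [0.5, 3.0]])
mask = pareto_front(costs)   # [2, 2] is dominated by [1, 2]; the rest survive
```

All surviving candidates are equally defensible trade-offs, which is exactly why a multiobjective calibration returns an ensemble of parameter sets instead of one "best" model.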
In this manuscript, efforts will be made to assess the uncertainty and robustness of the machines in learning and forecasting as a function of model structure and bootstrap samples. The proposed framework, using sparse learning techniques, allows for compact representations of system dynamics. In other words, models developed from the learning machines used here typically have a structure, including a parameterization, that suits the information content of the available data, and they can be used to develop time series forecasts for multiple lead times. The goal of this paper is to introduce new learning machines in a multiobjective framework that identifies a suite of model parameters and consequently enables an ensemble forecast of the time series. A theoretical background is described first. The utility of the framework is then demonstrated by applying it to a biweekly volume dataset of the Great Salt Lake (GSL).
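The bootstrap component of the assessment can be sketched as follows: refit a predictor on resampled training pairs and collect the spread of the resulting forecasts. The paper's learners are SVMs and RVMs; an ordinary least-squares fit stands in here purely to keep the sketch self-contained, and the synthetic series is an assumption.

```python
import numpy as np

def bootstrap_ensemble_forecast(X, y, x_new, n_boot=200, seed=0):
    """Refit a simple linear predictor on bootstrap resamples of (X, y)
    and return the mean and spread of the ensemble forecasts at x_new."""
    rng = np.random.default_rng(seed)
    n = len(y)
    A = np.column_stack([X, np.ones(n)])     # design matrix with intercept
    a_new = np.append(x_new, 1.0)
    preds = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)          # resample with replacement
        coef, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
        preds[b] = a_new @ coef
    return preds.mean(), preds.std()         # ensemble mean and spread

# Synthetic noisy linear series; forecast at x = 0.5
rng = np.random.default_rng(1)
X = np.linspace(0.0, 1.0, 100)[:, None]
y = 2.0 * X[:, 0] + 1.0 + 0.05 * rng.standard_normal(100)
mean, spread = bootstrap_ensemble_forecast(X, y, np.array([0.5]))
```

The spread of the bootstrap ensemble is one simple measure of forecast uncertainty attributable to sampling; the paper combines this with the structural uncertainty captured by the multiobjective parameter sets.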
Section snippets
Chaotic and nonlinear time series
Chaos occurs as a feature of orbits x(t) arising from systems of differential equations dx(t)/dt = F(x(t)) with three or more degrees of freedom, or from invertible maps x(t + 1) = F(x(t)). As a class of observable signals, x(t), chaos lies logically between the well-studied domain of predictable, regular, or quasi-periodic signals and the totally irregular stochastic signals [5]. In many systems the interaction between the underlying physical processes that are responsible for the evolution of system
Description of the study area
The Great Salt Lake (GSL) of Utah is the fourth largest terminal lake (i.e., it has no outlet) in the world (http://ut.water.usgs.gov/greatsaltlake/). The GSL basin encompasses a drainage area of 89,000 km2 including much of Utah, parts of southeastern Idaho, and southwestern Wyoming (Fig. 2) [32]. The three rivers that drain into the GSL are the Bear, the Weber, and the Jordan, which together contribute about 66% of the average annual water inflow to the GSL; precipitation contributes about 31%,
Results and discussion
Experience with the forecasting of complex dynamical processes has shown that the resulting predictions always suffer from different sources of error. It is reasonable to speculate that any hydrological model can fall victim to errors resulting from missing processes and parameters, limited knowledge of the governing equations and laws underlying the processes (i.e., heuristic assumptions), errors in the measured data, approximations in the computations (e.g., numerical discretization),
Conclusions
The ability of both SVMs and RVMs to capture the behavior of a chaotic dynamical system from a single observable time series has been demonstrated. Both the SVM and RVM models provide an accurate forecast methodology that could be exploited for the planning and management of the GSL. The sparse machines are theoretically elegant and well-regularized. While SVMs rely on structural risk minimization to reach a sparse structure, RVMs integrate over the uncertainty of state estimates and capture
Acknowledgments
Portions of this work were supported by the Utah Water Research Laboratory, College of Engineering, Utah State University, and Utah Center for Water Resources Research. The authors would like to thank Dr. Wallace Gwynn of the Utah Geological Survey, Utah Department of Natural Resources, for providing the biweekly GSL stage and volume data used in the study. Thanks are also due to anonymous reviewers for their insightful comments.
References (87)
Multi-criteria validation of precipitation-runoff model. J Hydrol (2002).
Using a multiobjective approach to retrieve information on surface properties used in a SVAT model. J Hydrol (2004).
Optimal use of the SCE-UA global optimization method for calibrating watershed models. J Hydrol (1994).
Characterization and prediction of runoff dynamics: a nonlinear dynamical view. Adv Water Resour (2002).
Noise reduction and prediction of hydrometeorological time series: dynamical system approach vs. stochastic approach. J Hydrol (2000).
Chaos theory in hydrology: important issues and interpretations. J Hydrol (2000).
A systematic approach to noise reduction in chaotic hydrological time series. J Hydrol (1999).
Multiobjective global optimization for hydrologic models. J Hydrol (1998).
Nonlinear dynamics of the Great Salt Lake: system identification and prediction. Climate Dyn (1996).
Arnow T. Water level and water quality changes in Great Salt Lake, Utah, 1847–1983. Geological Survey Circular 913, US...
Support vectors-based groundwater head observation networks design. Water Resour Res.
Statistical decision theory and Bayesian analysis.
Prophecy, reality, and uncertainty in distributed hydrologic modeling. Adv Water Resour.
Reconstruction of time-delay systems from chaotic time series. Phys Rev E.
Determining the minimum embedding dimension of nonlinear time series based on prediction method. Chin Phys.
Support vector networks. Mach Learn.
Evaluating the use of "goodness-of-fit" measures in hydrologic and hydroclimatic model validation. Water Resour Res.
Effective and efficient global optimization for conceptual rainfall-runoff models. Water Resour Res.
Shuffled complex evolution approach for effective and efficient global minimization. J Optim Theory Appl.
Predicting chaotic time series. Phys Rev Lett.
Independent coordinates for strange attractors from mutual information. Phys Rev A.
Multivariate adaptive regression splines. Ann Stat.
Automatic calibration of conceptual rainfall-runoff models: optimization algorithms, catchment conditions, and model structure. Water Resour Res.
Neural networks and the bias/variance dilemma. Neural Comput.
Kalman filtering: theory and practice using MATLAB.
Parameter estimation of land surface scheme using multi-criteria methods. J Geophys Res.
Multiple criteria global optimization for watershed model calibration.
Toward improved calibration of hydrologic models: multiple and noncommensurable measures of information. Water Resour Res.
Calibration strategy for urban catchment parameters. J Hydraul Eng.
Neural networks.
Learning and soft computing: support vector machines, neural networks, and fuzzy logic models.
Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys Rev A.
Use of soft information to describe the relative uncertainty of calibration data in hydrologic models. Water Resour Res.
Applicability of statistical learning algorithms in ground water quality modeling. Water Resour Res.
Basin-scale water management and forecasting using multisensor data and neural networks. J Am Water Resour Assoc.
Deterministic chaos versus stochasticity in analysis and modeling of point rainfall series. J Geophys Res.