Comparison of direct and iterative artificial neural network forecast approaches in multi-periodic time series forecasting

https://doi.org/10.1016/j.eswa.2008.02.042

Abstract

Artificial neural networks are a valuable tool for time series forecasting. When performing multi-periodic forecasting with artificial neural networks, two methods can be used: iterative and direct. In the iterative method, the first subsequent period is predicted from past observations; the estimated value is then used as an input to predict the next period, and the process is carried on until the end of the forecast horizon. In the direct method, all successive periods are predicted at once. Hence, the direct method is expected to yield better results, as only observed data are utilized to predict future periods. In this study, forecasting was performed using the direct and iterative methods, and the results of the two methods were compared using grey relational analysis to determine which method gives better results.

Introduction

There are a few methods commonly used for time series forecasting. The Box–Jenkins method, which gives quite good results with linear time series, is one of them. The Box–Jenkins technique is especially effective with linear and stationary data sets, and with series that are non-stationary but can be made stationary via transformations. However, series drawn from the real world are generally not linear. Therefore, modeling nonlinear time series requires different techniques. Over the last 25 years, a number of nonlinear time series models have been developed, such as the bilinear model, the threshold autoregressive (TAR) model, and the autoregressive conditional heteroscedastic (ARCH) model. These models are applicable when information concerning the correlation between data is available and clear (Zhang, Patuwo, & Hu, 1998). Moreover, none of these models is effective in elucidating the whole nonlinear structure hidden in the data set (Zhang, Patuwo, & Hu, 2001). Although these models are better than the linear models, their application is difficult, problem specific and not prone to generalization (Ghiassi, Saidane, & Zimbra, 2004).

One of the techniques that has been used for time series forecasting since the late 1980s is the artificial neural network (ANN). The ANN, an approach applicable to both linear and nonlinear time series, is widely used to forecast the future. It provides linear and nonlinear modeling without requiring prior information or assumptions about the relation between input and output variables. Therefore, the ANN is more flexible and applicable than other methods (Zhang et al., 1998).

There have been many studies on the use of ANNs for time series forecasting. Tang, Almeida, and Fishwick (1991) compared ANN and Box–Jenkins forecasting for time series. They remarked that the ANN structure and training parameters such as the learning rate and momentum coefficient affect ANN performance. Hill, O’Connor, and Remus (1996) discussed the data period (annual, quarterly, monthly), the functional form of the series (linear, nonlinear), the forecasting horizon, and the number of observations as factors that affect ANN performance. Faraway and Chatfield (1998) worked with a seasonal time series known as the airline data; their study showed how forecasting performance depends on the ANN structure. Detailed literature reviews were presented by Zhang et al. (1998) and Adya and Collopy (1998). Nelson, Hill, Remus, and O’Connor (1999) focused on seasonality in the time series structure and discussed the necessity of deseasonalizing the data. Chu and Zhang (2003) compared the accuracy of various linear and nonlinear models for forecasting aggregate retail sales.

Time series forecasting can be performed for either a single period or multiple periods. Two different approaches can be used for forecasting multiple periods. One is the iterative single-period forecast, as in the Box–Jenkins models. The other is the direct method, in which multiple periods are estimated simultaneously.
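To make the two approaches concrete, the following sketch contrasts them on a synthetic seasonal series. It is only an illustration, not the implementation used in this study: scikit-learn's MLPRegressor stands in for the trained network, and the lag order p = 12, forecast horizon H = 6 and hidden layer of 8 neurons are arbitrary assumptions.

    # Illustrative sketch (not the authors' code): iterative vs. direct
    # multi-step forecasting with an MLP; scikit-learn's MLPRegressor is a
    # stand-in, and p, H and the hidden layer size are arbitrary choices.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def make_lag_matrix(y, p, h):
        """Build samples of p past values and the following h future values."""
        X, Y = [], []
        for t in range(p, len(y) - h + 1):
            X.append(y[t - p:t])
            Y.append(y[t:t + h])
        return np.array(X), np.array(Y)

    rng = np.random.default_rng(0)
    y = np.sin(np.arange(300) * 2 * np.pi / 12) + 0.1 * rng.standard_normal(300)
    p, H = 12, 6                           # lag order and forecast horizon
    train, history = y[:-H], y[-H - p:-H]  # hold out the last H points

    # Iterative method: a one-step-ahead network whose estimate is fed back
    # as an input for the next step, repeated until the end of the horizon.
    X1, Y1 = make_lag_matrix(train, p, 1)
    net1 = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
    net1.fit(X1, Y1.ravel())
    window, iterative = list(history), []
    for _ in range(H):
        yhat = net1.predict(np.array(window[-p:]).reshape(1, -1))[0]
        iterative.append(yhat)
        window.append(yhat)

    # Direct method: one network with H output neurons predicts all periods
    # at once, using only observed values as inputs.
    XH, YH = make_lag_matrix(train, p, H)
    netH = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
    netH.fit(XH, YH)
    direct = netH.predict(np.array(history).reshape(1, -1))[0]

    print("iterative:", np.round(iterative, 3))
    print("direct   :", np.round(direct, 3))

Both variants use the same lagged inputs; they differ only in the output layer and in how the forecasts are produced, which is precisely the design choice compared in this paper.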

Using autoregressive methods, Findley (1985) showed that the direct forecasting approach is better than the iterative forecasting approach for a covariance stationary autoregressive process that is not degenerate to a finite order. The results found by Bhansali (1996, 1997) support Findley’s (1985) conclusion. Kang (2003) studied monthly data on various US economic time series and found that the direct method may or may not improve forecast accuracy over the iterative method. Kang stated that the forecast performance of the direct method relative to the iterative method appears to depend on the optimal order selection criteria, the forecast periods, the forecast horizons and the time series to be forecasted.

On the other hand, there are contrary findings when forecasting time series with ANNs. Weigend, Huberman, and Rumelhart (1992) showed in their sunspot data analysis that the iterative forecast yields better results than the direct forecast method. This conclusion is supported by Hill, Marquez, O’Connor, and Remus (1994) in their study of 111 M-competition time series. According to Zhang (1994), the direct method gives better results. Lachtermacher and Fuller (1995) emphasize that the direct method requires a longer time series than the iterative method in order to generalize well. Kline (2004) used three methods (iterative, independent and joint) for multi-step forecasting. In Kline’s (2004) study, the independent method forecasts each period with a separate network, and the joint method is the same as the direct method used in this study. According to Kline (2004), the independent method is better than the joint method, although the training sample size and forecast horizon may affect this superiority.

The above studies show that it is not clear which method gives better results. The direct method is expected to give better results than the iterative method because it relies only on observed past data. In this paper, a comparison of the iterative and direct forecasting methods, which are considered to be influential on the multi-periodic forecast performance of ANNs, is presented. The rest of the paper is organized as follows: the second section introduces time series; important parameters of the ANN models are given in the third section; the fourth section discusses an example application, in which grey relational analysis (GRA) is used to assess and compare the performance of the methods; the last section gives the main conclusions.
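Since GRA is used here only as a comparison tool, a brief sketch of its usual form may help. The sketch below follows the common formulation attributed to Deng (1989), with the conventional distinguishing coefficient of 0.5; the error values are hypothetical placeholders, not results from this study.

    # Illustrative sketch of grey relational analysis (GRA); the error values
    # are hypothetical placeholders, not the paper's results.
    import numpy as np

    def grey_relational_grades(errors, zeta=0.5):
        """errors: rows = methods, columns = series; smaller error is better."""
        errors = np.asarray(errors, dtype=float)
        # Smaller-is-better normalization, so the ideal reference sequence is all ones.
        col_max, col_min = errors.max(axis=0), errors.min(axis=0)
        norm = (col_max - errors) / (col_max - col_min)
        delta = np.abs(1.0 - norm)                 # distance to the reference
        d_min, d_max = delta.min(), delta.max()
        coeff = (d_min + zeta * d_max) / (delta + zeta * d_max)
        return coeff.mean(axis=1)                  # one grade per method

    # Hypothetical error measures of two methods over six series (A..F).
    errors = [[0.42, 0.31, 0.55, 0.47, 0.29, 0.38],   # direct
              [0.45, 0.36, 0.52, 0.50, 0.33, 0.41]]   # iterative
    print(grey_relational_grades(errors))             # higher grade = better

The method with the larger grey relational grade is closer to the ideal sequence and is therefore judged to perform better across the series.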

Section snippets

Time series

A time series can be defined as a set of values of a variable observed at consecutive time increments. The time increments vary from series to series; that is, a time series may be recorded as hourly, daily, weekly, monthly, quarterly or annual data, or on another time scale. In a time series, the value observed at any given time t is represented by Yt.

The most comprehensive of all popular and widely known statistical methods used for time series forecasting are the Box–Jenkins models.

Artificial neural networks

The ANN, which imitates the functioning of the human brain, is a tool of great importance in classification, pattern recognition and forecasting. A typical ANN model is a combination of layers made of neurons. The most widely used type of ANN for forecasting is the multilayer perceptron (MLP).
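As a concrete picture of this layered structure, the following sketch performs a single forward pass through an MLP with one hidden layer. The weights are random placeholders rather than a trained network, and the layer sizes (12 inputs, 8 hidden neurons, 6 outputs) are arbitrary assumptions.

    # Illustrative forward pass through a one-hidden-layer MLP; the weights
    # are random placeholders, not a trained forecasting network.
    import numpy as np

    def mlp_forward(x, W_h, b_h, W_o, b_o):
        """x: vector of lagged observations, one per input neuron."""
        hidden = 1.0 / (1.0 + np.exp(-(W_h @ x + b_h)))   # sigmoid hidden units
        return W_o @ hidden + b_o                         # linear output units

    rng = np.random.default_rng(0)
    n_in, n_hid, n_out = 12, 8, 6        # e.g. 12 lags, 8 hidden neurons, horizon of 6
    W_h = rng.standard_normal((n_hid, n_in))
    b_h = rng.standard_normal(n_hid)
    W_o = rng.standard_normal((n_out, n_hid))
    b_o = rng.standard_normal(n_out)

    x = rng.standard_normal(n_in)               # stand-in for the last 12 observations
    print(mlp_forward(x, W_h, b_h, W_o, b_o))   # one value per forecast period

In a forecasting MLP, the number of input neurons corresponds to the number of lagged observations used, and the number of output neurons corresponds to the forecast horizon (one for the iterative method, H for the direct method).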

In an MLP designed for time series forecasting, determining variables such as the number of input, hidden and output neurons is highly important. However, these parameters depend on the problem. In other words,

An application of the methods

In this study, the series employed by Hansen, McDonald, and Nelson (1999) are utilized in an application aimed at handling real-life time series. They worked on six different series (Series A, B, C, D, E, F) that Box and Jenkins (1976) had used previously and which are shown in Fig. 1.

Hansen et al. (1999) compared the results of their two ANN models with the techniques suggested by McDonald and Xu (1994). Partially adaptive estimation techniques described by McDonald and Xu (1994)

Conclusions

In this paper, comparisons of the iterative and direct forecasting methods, which are considered to be influential on the multi-periodic forecast performance of ANNs, were presented. The performance of these approaches was also compared with that of standard ARIMA models and partially adaptive estimation techniques. GRA was utilized to compare the performance of the investigated methods over different time series. Based on the performed comparison, superiority of the direct

References (34)

  • D. Baily et al., Developing neural network applications, AI Expert (1990)
  • R.J. Bhansali, Asymptotically efficient autoregressive model selection for multistep prediction, Annals of the Institute of Statistical Mathematics (1996)
  • R.J. Bhansali, Direct autoregressive predictors for multistep prediction: order selection and performance relative to the plug in predictors, Statistica Sinica (1997)
  • G.E.P. Box et al., Time series analysis forecasting and control (1976)
  • G. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals and Systems (1989)
  • J.L. Deng, Introduction to grey system, Journal of Grey System (1989)
  • J. Faraway et al., Time series forecasting with neural networks: a comparative study using the airline data, Applied Statistics (1998)