A new class of hybrid models for time series forecasting

https://doi.org/10.1016/j.eswa.2011.09.157

Abstract

Applying quantitative models for forecasting and assisting investment decision making has become more indispensable in business practice than ever before. Improving forecasting accuracy, especially in time series forecasting, is an important yet often difficult task facing forecasters. Both theoretical and empirical findings indicate that integrating different models can be an effective way of improving their predictive performance, especially when the models in the ensemble are quite different. In the literature, several hybrid techniques have been proposed that combine different time series models in order to overcome the deficiencies of single models and yield more accurate hybrid models. In this paper, in contrast to traditional hybrid models, a new methodology is proposed for constructing a new class of hybrid models from a basis time series model and a classifier. Since classifiers cannot by themselves serve as forecasting models for continuous-valued problems, the first stage of the proposed model uses a forecasting model as the basis model. The estimated values of the basis model are then modified in the second stage, based on the distinguished trend of the residuals of the basis model and the optimum step length, which are calculated by a classifier model and a mathematical programming model, respectively. Empirical results with three well-known real data sets indicate that the proposed model can be an effective way to construct a hybrid model that is more accurate than its basis time series model. It can therefore be used as an appropriate alternative for forecasting tasks, especially when higher forecasting accuracy is needed.

Introduction

Applying quantitative methods for forecasting and assisting investment decision making has become more indispensable in business practices than ever before. Time series forecasting is one of the most important quantitative models in which historical observations of the same variable are collected and analyzed to develop a model that captures the underlying data generating process. Then the model is used to predict the future. This modeling approach is particularly useful when little knowledge is available on the underlying data generating process or when there is no satisfactory explanatory model that relates the prediction variable to other explanatory variables. Over the past several decades, much effort has been devoted to the development and improvement of time series forecasting models (Zhang, Patuwo, & Hu, 1998).

Combining several models, or using hybrid models, can be an effective way to overcome the limitations of each component model and improve forecasting performance. Theoretical as well as empirical evidence in the literature suggests that by using dissimilar models, or models that strongly disagree with each other, the hybrid model will have lower generalization variance or error. In combined models, the aim is to reduce the risk of failure from relying on a single inappropriate model and to obtain results that are more accurate (Hibon & Evgeniou, 2005). Typically, this is done because the underlying process cannot easily be determined. The motivation for using hybrid models comes from the assumption that either one cannot identify the true data generating process or that a single model may not be sufficient to capture all the characteristics of the time series (Terui & van Dijk, 2002).

In the literature, different combination techniques have been proposed in order to overcome the deficiencies of single models and to improve forecasting performance. The differences between these combination techniques can be described using terminology developed in the classification and neural network literature (Sharkey, 2002). Hybrid models can be homogeneous, such as ensembles of differently configured neural networks, or heterogeneous, such as combinations of both linear and nonlinear models (Taskaya & Casey, 2005). In a competitive architecture, the aim is to build appropriate modules to represent different parts of the time series and to switch control to the most appropriate module. For example, a time series may generally exhibit nonlinear behavior, but this may change to linearity depending on the input conditions. Early work on threshold autoregressive (TAR) models used two different linear AR processes, with control switching between them according to the input values (Tong, 1990). An alternative is the mixture density model, also known as the nonlinear gated expert, which comprises neural networks integrated with a feedforward gating network (Taskaya & Casey, 2005).
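The regime-switching idea behind a TAR model can be sketched as follows. The coefficients, threshold, and delay here are illustrative values for a two-regime TAR(1), not parameters from the paper:

```python
def tar_forecast(history, threshold=0.0, delay=1,
                 ar_low=(0.1, 0.8), ar_high=(-0.2, 0.5)):
    """One-step forecast from a two-regime TAR(1) model (sketch).

    The delayed observation y[t - delay] decides which linear AR(1)
    regime (intercept, coefficient) generates the forecast.
    """
    c, phi = ar_low if history[-delay] <= threshold else ar_high
    return c + phi * history[-1]
```

A series below the threshold is forecast by the first AR process, and one above it by the second, which is exactly the "switch control according to the input values" behavior described above.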

In a cooperative modular combination, the aim is to combine models to build a complete picture from a number of partial solutions (Sharkey, 2002). The assumption is that a single model may not be sufficient to represent the complete behavior of a time series; for example, if a time series exhibits both linear and nonlinear patterns during the same time interval, neither a linear model nor a nonlinear model alone is able to capture both components simultaneously. A good example is the family of models that fuse autoregressive integrated moving average (ARIMA) models with artificial neural networks. In such hybrids, the neural network deals with the nonlinearity, while the ARIMA model deals with the non-stationary linear component (Tseng et al., 2002, Valenzuela et al., 2008).
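The cooperative scheme can be sketched as an additive decomposition: a linear model captures the linear component, a second model is fit to its residuals, and the two forecasts are summed. The `linear_fit` and `nonlinear_fit` callables below are hypothetical stand-ins for the ARIMA and neural network stages:

```python
def hybrid_forecast(y, linear_fit, nonlinear_fit):
    """Additive cooperative hybrid (sketch).

    linear_fit(y)      -> in-sample predictions of the linear model
    nonlinear_fit(res) -> predictions of a second model fit on the
                          linear model's residuals
    The combined forecast is the sum of the two components.
    """
    linear_pred = linear_fit(y)
    residuals = [a - b for a, b in zip(y, linear_pred)]
    resid_pred = nonlinear_fit(residuals)
    return [l + n for l, n in zip(linear_pred, resid_pred)]
```

With a perfect residual model, the combined output recovers the original series, which illustrates why the second stage only has to explain what the linear stage leaves behind.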

Much effort has been devoted to developing and improving hybrid time series forecasting models since the early work of Reid (1968) and Bates and Granger (1969). In pioneering work on combined forecasts, Bates and Granger showed that a linear combination of forecasts would give a smaller error variance than any of the individual methods. Since then, studies on this topic have expanded dramatically. Makridakis et al. (1982) observed that using a hybrid model, or combining several models, has become common practice since the well-known M-competition, in which a combination of forecasts from more than one model often led to improved forecasting performance. Likewise, Pelikan et al. (1992) and Ginzburg and Horn (1994) proposed combining several feedforward neural networks to improve time series forecasting accuracy. Clemen (1989) provided a comprehensive review and annotated bibliography of this area.

In recent years, more hybrid forecasting models have been proposed and applied in many areas with good prediction performance. Pai and Lin (2005) proposed a hybrid methodology that exploits the unique strengths of autoregressive integrated moving average models and support vector machines (SVMs) for stock price forecasting. Chen and Wang (2007) constructed a combination model incorporating the seasonal autoregressive integrated moving average (SARIMA) model and support vector machines for seasonal time series forecasting. Zhou and Hu (2008) proposed a hybrid modeling and forecasting approach based on grey models and the Box–Jenkins autoregressive moving average models. Armano, Marchesi, and Murru (2005) presented a new hybrid approach that integrates artificial neural networks (ANNs) with genetic algorithms (GAs) for stock market forecasting. Yu, Wang, and Lai (2005) proposed a novel nonlinear ensemble forecasting model integrating generalized linear autoregression (GLAR) with artificial neural networks in order to obtain accurate predictions in the foreign exchange market. Khashei, Hejazi, and Bijari (2008) proposed a new hybrid model in order to overcome the data limitations of artificial neural networks and yield more accurate results than traditional neural networks in financial market forecasting. Lin and Cobourn (2007) combined the Takagi–Sugeno fuzzy system with a nonlinear regression (NLR) model for time series forecasting. Pai (2006) proposed the hybrid ellipsoidal fuzzy system for time series forecasting (HEFST) model to forecast regional electricity loads in Taiwan.

Kim and Shin (2007) investigated the effectiveness of a hybrid approach that couples neural networks suited to temporal data, such as adaptive time delay neural networks (ATNNs) and time delay neural networks (TDNNs), with genetic algorithms for detecting temporal patterns in stock market prediction tasks. Zhang (2003) presented a hybrid autoregressive integrated moving average and artificial neural networks approach for time series forecasting. Tseng, Tzeng, Yu, and Yuan (2001) proposed a hybrid model called FARIMA in order to exploit the advantages and overcome the limitations of the fuzzy regression and ARIMA models for time series forecasting. Ince and Trafalis (2006) proposed a two-stage hybrid model which combines parametric techniques, such as autoregressive integrated moving average, vector autoregressive (VAR) and co-integration techniques, with nonparametric techniques, such as support vector regression (SVR) and artificial neural networks, for exchange rate prediction. Chang, Liu, and Wang (2006) developed a hybrid model integrating a self-organizing map (SOM) neural network, genetic algorithms (GAs) and a fuzzy rule base (FRB) to forecast the future sales of a printed circuit board factory. Huarng and Yu (2006) described a combining methodology that uses neural networks to forecast fuzzy time series.

In this paper, classifier methods are applied to construct a new hybrid model on top of a basis time series model in order to yield more accurate results. In the proposed model, the residuals of the basis time series model are examined by a classifier in order to distinguish their trend. In the next stage, the optimum step length is calculated by a mathematical programming model using the trend distinguished in the previous stage. Then, the estimated values of the basis time series model are modified according to the optimum step length and the distinguished trend. In this paper, probabilistic neural networks (PNNs) are used as the classifier. Technically, a probabilistic neural network is a classifier that is able to deduce the class/group of a given input vector after the training process is completed. A number of appealing features justify the adoption of this type of neural network in this study. First, training of probabilistic neural networks is rapid, enabling a frequently updated training scheme: the network is re-trained each time the data set is updated, so the most current information can be reflected in estimation. Second, the logic of the probabilistic neural network attenuates the effects of outliers and questionable data points, thereby reducing the effort spent scrutinizing training data. Third, and most important, probabilistic neural networks are conceptually built on the Bayesian method of classification, which, given enough data, is capable of classifying a sample with the maximum probability of success (Wasserman, 1993).
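The second-stage correction described above can be sketched as follows. The encoding of the trend classes as {-1, 0, +1} and the constant `step_length` are our illustrative assumptions; in the paper the step length is the output of a mathematical programming model:

```python
def corrected_forecast(basis_pred, trend_class, step_length):
    """Second stage of the proposed scheme (sketch).

    basis_pred  -- forecasts from the basis time series model
    trend_class -- classifier output per point: -1 (down), 0 (flat),
                   +1 (up), the distinguished trend of the residuals
    step_length -- the optimum step length (here a given constant)

    Each basis forecast is shifted by the step length in the
    direction of the classified residual trend.
    """
    return [p + c * step_length for p, c in zip(basis_pred, trend_class)]
```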

Given the advantages of probabilistic neural networks, it is not surprising that this methodology has attracted overwhelming attention in prediction (Kim and Chun, 1998, Yang et al., 1999), identification (Gaganis et al., 2007, Sun et al., 2006), and especially classification tasks (Karthikeyan et al., 2005, Xue et al., 2005) in various areas. Chen, Leung, and Daouk (2003) used probabilistic neural networks to model and predict the direction of return on the market index of the Taiwan stock exchange. Axinte (2006) applied probabilistic neural networks to the automated classification of tool malfunctions in broaching. Hajmeer and Basheer (2002) proposed using probabilistic neural networks (PNNs) for classifying bacterial growth/no-growth data and modeling the probability of growth. Tam, Tong, Lau, and Chan (2004) used probabilistic neural networks for the diagnosis of prestressed concrete pile defects. Shan, Zhao, Xu, Liebich, and Zhang (2002) presented an application of the probabilistic neural network to the clinical diagnosis of cancers based on clinical chemistry data. Al-Omari and Al-Jarrah (2004) presented a system for recognizing handwritten Indian numerals using probabilistic neural networks. Kim, Kim, and Chang (2008) presented an application of the probabilistic neural network to the design of breakwater armor blocks. Srinivasan, Jin, and Cheu (2005) proposed and applied a constructive probabilistic neural network (CPNN) model for automatic incident detection on freeways. Shang, Huang, Du, and Zheng (2006) investigated palm print recognition using the fast ICA algorithm and a radial basis probabilistic neural network.

The rest of the paper is organized as follows. In the next section, the basic concepts of the autoregressive integrated moving average (ARIMA) model and artificial neural networks (ANNs), which are chosen as basis models for constructing the hybrid model, are briefly reviewed. In Section 3, probabilistic neural networks (PNNs), which are selected as the classifier method, are reviewed. In Section 4, the formulation of the proposed model is introduced. In Section 5, the proposed model is applied to forecasting three well-known real data sets (the Wolf's sunspot data, the Canadian lynx data, and the British pound/US dollar exchange rate data), and its performance is compared with those of other forecasting models in order to show the appropriateness and effectiveness of the proposed method. Section 6 contains the concluding remarks.

Section snippets

Time series forecasting models

There are several different approaches to time series forecasting, which are generally categorized as follows. Traditional statistical models, including moving average, exponential smoothing, and autoregressive integrated moving average models, are linear in that predictions of future values are constrained to be linear functions of past observations. The second category of time series models comprises nonlinear models. Several classes of nonlinear models have been proposed in the literature in order to
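As a minimal illustration of the linear family, an AR(1) model can be fit by ordinary least squares on the lagged series; this toy fit is our sketch, not a method from the paper:

```python
def fit_ar1(y):
    """Least-squares fit of y_t = c + phi * y_{t-1} + e_t (sketch).

    Regresses each observation on its predecessor and returns the
    intercept c and coefficient phi, the simplest member of the
    linear model family described above.
    """
    x, t = y[:-1], y[1:]
    n = len(x)
    mx, mt = sum(x) / n, sum(t) / n
    phi = (sum((a - mx) * (b - mt) for a, b in zip(x, t))
           / sum((a - mx) ** 2 for a in x))
    c = mt - phi * mx
    return c, phi
```

On a noiseless series generated by y_t = 1 + 0.5 y_{t-1}, the fit recovers the parameters exactly, which makes the "linear function of past observations" constraint concrete.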

Probabilistic neural networks (PNNs)

The probabilistic neural network (PNN) is a Bayes–Parzen classifier (Masters, 1995) that is often an excellent pattern classifier in practice. The foundations of the approach have been known since the 1960s; however, the method was not in widespread use until recently because of the lack of sufficient computational power. Probabilistic neural networks were first introduced by Donald Specht in 1990, who demonstrated how the Bayes–Parzen classifier could be broken up into a large number of
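The Bayes–Parzen idea can be sketched in a few lines: estimate each class density with Gaussian kernels centered at that class's training points, then pick the class with the highest density at the query point. The dictionary interface and the smoothing parameter `sigma` are our assumptions for the sketch:

```python
import math

def pnn_classify(train, x, sigma=0.5):
    """Minimal probabilistic neural network / Bayes-Parzen sketch.

    train -- {class_label: [training vectors]}
    x     -- query vector
    For each class, average Gaussian kernels centered at its
    training points (a Parzen density estimate) and return the
    class with the highest estimated density at x.
    """
    scores = {}
    for label, points in train.items():
        scores[label] = sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(p, x))
                     / (2 * sigma ** 2))
            for p in points
        ) / len(points)
    return max(scores, key=scores.get)
```

Because the "training" step only stores the patterns, re-training when new data arrives is essentially free, which matches the rapid-update property cited in the paper.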

Formulation of the proposed model

Despite the numerous time series models available, the accuracy of time series forecasting remains fundamental to many decision processes, and hence research into ways of improving the effectiveness of forecasting models has never stopped. Many studies in time series forecasting have argued that predictive performance improves in combined models (Taskaya & Casey, 2005). In the literature, different combination techniques have been proposed in order to overcome the deficiencies
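Based on the description of the two stages, the correction applied to the basis forecast can be written as follows; the notation is ours, not the paper's:

```latex
\hat{y}^{\,\mathrm{new}}_t = \hat{y}_t + d_t\,\ell, \qquad d_t \in \{-1,\, 0,\, +1\},
```

where $\hat{y}_t$ is the basis model's estimate, $d_t$ is the residual-trend class produced by the classifier, and $\ell$ is the optimum step length obtained from the mathematical programming model.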

Application of the proposed model to time series forecasting

In this section, the proposed model is applied to time series forecasting using the three well-known real data sets in order to demonstrate the appropriateness and effectiveness of the proposed model and its performance is compared with those of other forecasting models.

Conclusions

Improving forecasting accuracy, especially in time series forecasting, is an important yet often difficult task facing forecasters. Despite the numerous time series models available, research into improving the effectiveness of forecasting models has never stopped. Several large-scale forecasting competitions covering a large number of commonly used time series forecasting models conclude that combining forecasts from more than one model often leads to improved performance, especially when the models

Acknowledgements

The authors wish to express their gratitude to Seyed Reza Hejazi, assistant professor of industrial engineering, Isfahan University of Technology, for his insightful and constructive comments, which helped to improve the paper greatly.

References (86)

  • C. Gaganis et al. Probabilistic neural networks for the identification of qualified audit opinions. Expert Systems with Applications (2007)
  • M. Hajmeer et al. A probabilistic neural network approach for modeling and classification of bacterial growth/no-growth data. Journal of Microbiological Methods (2002)
  • M. Haseyama et al. An ARMA order selection method with fuzzy reasoning. Signal Processing (2001)
  • M. Hibon et al. To combine or not to combine: Selecting among forecasts and their combinations. International Journal of Forecasting (2005)
  • H. Hosseini et al. The comparison of different feed forward neural network architectures for ECG signal diagnosis. Medical Engineering & Physics (2006)
  • K. Huarng et al. The application of neural networks to forecast fuzzy time series. Physica A (2006)
  • H. Ince et al. A hybrid model for exchange rate prediction. Decision Support Systems (2006)
  • X. Jiang et al. Constructing and training feed-forward neural networks for pattern classification. Pattern Recognition (2003)
  • B. Karthikeyan et al. Conception of complex probabilistic neural network system for classification of partial discharge patterns using multifarious inputs. Expert Systems with Applications (2005)
  • M. Khashei et al. A new hybrid artificial neural networks and fuzzy regression model for time series forecasting. Fuzzy Sets and Systems (2008)
  • S.H. Kim et al. Graded forecasting using an array of bipolar predictions: Application of probabilistic neural networks to a stock market index. International Journal of Forecasting (1998)
  • D. Kim et al. Application of probabilistic neural network to design breakwater armor blocks. Ocean Engineering (2008)
  • H. Kim et al. A hybrid approach based on neural networks and genetic algorithms for detecting temporal patterns in stock markets. Applied Soft Computing (2007)
  • J. Lee et al. GA based meta-modeling of BPN architecture for constrained approximate optimization. International Journal of Solids and Structures (2007)
  • J. Leski et al. A new artificial neural network based fuzzy inference system with moving consequents in if-then rules and selected applications. Fuzzy Sets and Systems (1999)
  • Y. Lin et al. Fuzzy system models combined with nonlinear regression for daily ground-level ozone predictions. Atmospheric Environment (2007)
  • L. Ma et al. A new strategy for adaptively constructing multilayer feed-forward neural networks. Neurocomputing (2003)
  • R.A. Meese et al. Empirical exchange rate models of the seventies: Do they fit out of sample? Journal of International Economics (1983)
  • C. Ong et al. Model identification of ARIMA family using genetic algorithms. Applied Mathematics and Computation (2005)
  • P.F. Pai. Hybrid ellipsoidal fuzzy systems in forecasting regional electricity loads. Energy Conversion and Management (2006)
  • P.F. Pai et al. A hybrid ARIMA and support vector machines model in stock price forecasting. Omega (2005)
  • Y. Shan et al. Application of probabilistic neural network in the clinical diagnosis of cancers based on clinical chemistry data. Analytica Chimica Acta (2002)
  • L. Shang et al. Palm print recognition using fast ICA algorithm and radial basis probabilistic neural network. Neurocomputing (2006)
  • D. Specht. Probabilistic neural networks. Neural Networks (1990)
  • D. Srinivasan et al. Adaptive neural network models for automatic incident detection on freeways. Neurocomputing (2005)
  • L. Stone et al. Chaotic oscillations and cycles in multi-trophic ecological systems. Journal of Theoretical Biology (2007)
  • G. Sun et al. Tumor tissue identification based on gene expression data using DWT feature extraction and PNN classifier. Neurocomputing (2006)
  • C.M. Tam et al. Diagnosis of prestressed concrete pile defects using probabilistic neural networks. Engineering Structures (2004)
  • Y. Tang et al. A consistent nonparametric Bayesian procedure for estimating autoregressive conditional densities. Computational Statistics & Data Analysis (2007)
  • N. Terui et al. Combined forecasts from linear and nonlinear time series models. International Journal of Forecasting (2002)
  • F.M. Tseng et al. Fuzzy ARIMA model for forecasting the foreign exchange market. Fuzzy Sets and Systems (2001)
  • F.M. Tseng et al. Combining neural network model with seasonal time series ARIMA model. Technological Forecasting & Social Change (2002)
  • C. Xue et al. Study of probabilistic neural networks to classify the active compounds in medicinal plants. Journal of Pharmaceutical and Biomedical Analysis (2005)