Elsevier

Applied Soft Computing

Volume 42, May 2016, Pages 368-376

A hybrid ANFIS model based on empirical mode decomposition for stock time series forecasting

https://doi.org/10.1016/j.asoc.2016.01.027

Highlights

  • This paper proposes a hybrid time-series ANFIS model based on EMD to forecast stock price.

  • In order to evaluate the forecasting performances, the proposed model is compared with other models.

  • The experimental results show that the proposed model is superior to the listed models.

Abstract

Time series forecasting is an important and widely studied topic in system modeling research, and stock index forecasting is an important issue in time series forecasting. Accurate stock price forecasting is a challenging task in predicting financial time series. Time series methods have been applied successfully to forecasting models in many domains, including the stock market. Unfortunately, there are 3 major drawbacks of using time series methods for the stock market: (1) some models cannot be applied to datasets that do not follow statistical assumptions; (2) most time series models that use stock data containing a significant amount of intricate noise (caused by changes in market conditions and environments) have worse forecasting performance; and (3) the rules that are mined from artificial neural networks (ANNs) are not easily understandable.

To address these problems and improve the forecasting performance of time series models, this paper proposes a hybrid time series adaptive network-based fuzzy inference system (ANFIS) model that is centered around empirical mode decomposition (EMD) to forecast stock prices in the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) and Hang Seng Stock Index (HSI). To measure its forecasting performance, the proposed model is compared with Chen's model, Yu's model, the autoregressive (AR) model, the ANFIS model, and the support vector regression (SVR) model. The results show that our model is superior to the other models, based on root mean squared error (RMSE) values.

Introduction

System modeling is an intelligent modeling approach for establishing a target system for function approximation, optimization, and forecasting and has been examined extensively for many years. Time series forecasting is an important application in system modeling. In a time series, time is usually an important variable in making decisions or forecasts. Managers usually use historical data to forecast various types of variables, such as changes in stock price and product sales. Using a time series to forecast future tendencies requires detailed data generated in the past in order to understand changes in trends.

Forecasting stock prices is an important topic in finance and a good application for demonstrating the value of time series forecasting. Successful forecasting of stock prices presumably results in substantial monetary rewards. Modern business and economic activities are, in essence, dynamic and change frequently, and financial time series show nonlinear and nonstationary behavior. Thus, it is difficult to accurately forecast the movements of stock prices, because stock markets are affected by many intricately related economic and political factors.

Many traditional time series models have been proposed and applied to economic forecasting. Engle [7] developed the ARCH (p) (autoregressive conditional heteroscedasticity) model, which has been used by many financial analysts; the GARCH [1] (generalized ARCH) model is the generalized form of ARCH. Box and Jenkins [2] established the autoregressive moving average (ARMA) model, which combines a moving average with a linear difference equation and makes forecasts under linear stationary conditions.
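For reference, the ARMA(p, q) model mentioned above is commonly written in the following standard textbook form. The symbols (c, φ, θ, ε) are conventional notation assumed here, not taken from this paper:

```latex
x_t = c + \sum_{i=1}^{p} \phi_i \, x_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
```

Here x_t is the observed value at time t, ε_t is a white-noise error term, the φ_i are the autoregressive coefficients, and the θ_j are the moving average coefficients. Setting q = 0 gives a pure AR(p) model.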

Models that describe such homogeneous nonstationary behavior can be obtained by assuming that some suitable difference of the process is stationary. To this end, the autoregressive integrated moving average (ARIMA) model [2], which assumes linearity between variables, was proposed to handle nonstationary datasets. However, conventional time series methods require more historical data, and the data must follow a normal distribution to achieve better forecasting. Further, linguistic expressions are often used to describe daily observations.

Thus, Song and Chissom [29] proposed the original fuzzy time series model, and subsequent researchers focused on the 2 major processes of the fuzzy time series model: (1) fuzzification and (2) establishment of fuzzy relationships and forecasting. In the fuzzification process, the length of intervals for the universe of discourse can affect the forecast, and Huarng proposed a distribution-based and average-based length to address this issue [11]. In addition, Chen proposed a method [5] in which the length of linguistic intervals is tuned by genetic algorithms. In establishing fuzzy relationships and forecasting, Yu [35] argued that recurrent fuzzy relationships should be considered in forecasting and recommended that different weights be assigned to various fuzzy relationships. Thus, Yu [35] introduced a weighted fuzzy time series method to forecast the TAIEX. To take advantage of neural networks (nonlinear capabilities), Huarng and Yu [15] chose a neural network to establish fuzzy relationships in fuzzy time series, which are also nonlinear, but the process of mining fuzzy logical relationships is not easily understood [3].

Artificial intelligence (AI) is a rapidly growing technology with regard to information processing applications. It has been applied to various disciplines, such as business, engineering, management, science, and the military. In the financial domain, AI can be used to predict stock prices, credit scores, and potential bankruptcy problems. Stock prices change dramatically, which impedes the accurate prediction of stock price volatility.

In the past decade, many studies have been conducted to mine financial time series data, including traditional statistical approaches and data-mining techniques. Many researchers have applied data-mining techniques to financial analysis [12], [19], [20], [21], [22], [23], [25], [27]. Further, many hybrid forecasting techniques have been published recently [9], [10], [24], [26], [28], [32]. ANFIS (an AI method) is a hybrid technique that integrates the learning ability of an ANN with a set of fuzzy if-then rules and appropriate membership functions to generate input-output pairs with a high degree of accuracy [17]. In recent years, ANFIS has been used widely to generate nonlinear models of processes to determine input-output relationships. Thus, ANFIS is appropriate for forecasting nonlinear financial time series and generating meaningful rules for strategizing investment tactics.
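To make the idea of fuzzy if-then rules with membership functions concrete, the following is a minimal sketch of the forward pass of a one-input, first-order Sugeno fuzzy system, the kind of rule base that ANFIS tunes. The Gaussian membership functions, the two rules, and all parameter values are assumptions chosen for illustration; the training step (ANFIS's hybrid learning) is deliberately omitted, and nothing here is taken from the paper's configuration.

```python
import numpy as np

def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set (assumed MF shape)."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def sugeno_forward(x, centers, sigmas, consequents):
    """Forward pass of a one-input, first-order Sugeno fuzzy system.

    Rule i reads: IF x is A_i THEN y_i = p_i * x + r_i,
    where consequents[i] = (p_i, r_i).
    """
    x = np.asarray(x, dtype=float)
    # Firing strength of each rule for each sample (one membership per rule)
    w = gaussian_mf(x[:, None], centers[None, :], sigmas[None, :])
    # Normalise firing strengths so they sum to 1 for each sample
    w_norm = w / w.sum(axis=1, keepdims=True)
    # Linear consequent of each rule
    y_rule = consequents[:, 0][None, :] * x[:, None] + consequents[:, 1][None, :]
    # Weighted sum of rule outputs gives the crisp output
    return (w_norm * y_rule).sum(axis=1)

# Hypothetical rule base: "IF x is LOW THEN y = -1", "IF x is HIGH THEN y = +1"
centers = np.array([-1.0, 1.0])
sigmas = np.array([1.0, 1.0])
consequents = np.array([[0.0, -1.0], [0.0, 1.0]])
y = sugeno_forward([-1.0, 0.0, 1.0], centers, sigmas, consequents)
```

Because each rule has a readable antecedent and a simple linear consequent, the fitted rule base can be inspected directly, which is the interpretability advantage the text attributes to ANFIS over a plain ANN.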

The research topic of this paper is financial time series, whose characteristics, such as their nonlinear or nonstationary nature and the presence of noise in the raw data, should be considered. However, traditional time series models have linear limitations and are unsuitable for financial forecasting. Thus, hybrid models are widely used to circumvent the limitations in financial time series forecasting. Empirical mode decomposition (EMD) is well suited to nonlinear and nonstationary time series signal analysis [14] and identifies tendencies in financial time series. Based on EMD, any complicated signal can be decomposed into a finite and often small number of intrinsic mode functions (IMFs) [14], which have simpler frequency components and stronger correlations, making them easier to forecast accurately. EMD has been used widely in many fields, such as the analysis of earthquake signals, structure analysis, bridge and construction monitoring [34], sea wave data [13], and the diagnosis of faults in machines [36].
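The sifting idea behind EMD can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the paper: it uses a fixed number of sifting iterations instead of a convergence criterion, pins the spline envelopes to the end points of the signal as a crude boundary treatment, and the function names and default parameters (`emd`, `max_imfs`, `sift_iters`) are hypothetical. It relies on SciPy's `CubicSpline` and `argrelextrema`.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def _envelope(t, h, idx):
    """Cubic-spline envelope through the extrema at idx, pinned to both end points."""
    knots = np.concatenate(([0], idx, [len(h) - 1]))
    return CubicSpline(t[knots], h[knots])(t)

def _mean_envelope(t, h):
    """Mean of upper and lower envelopes, or None if h has too few extrema."""
    max_idx = argrelextrema(h, np.greater)[0]
    min_idx = argrelextrema(h, np.less)[0]
    if len(max_idx) < 2 or len(min_idx) < 2:
        return None
    return 0.5 * (_envelope(t, h, max_idx) + _envelope(t, h, min_idx))

def emd(x, max_imfs=8, sift_iters=10):
    """Decompose x into a list of IMF candidates and a residue (simplified sifting)."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x), dtype=float)
    imfs = []
    residue = x.copy()
    for _ in range(max_imfs):
        if _mean_envelope(t, residue) is None:
            break  # residue has too few oscillations left to extract another IMF
        h = residue.copy()
        for _ in range(sift_iters):
            m = _mean_envelope(t, h)
            if m is None:
                break
            h = h - m  # remove the local mean to isolate the fastest oscillation
        imfs.append(h)
        residue = residue - h
    return imfs, residue

# Synthetic example: fast cycle + slow cycle + linear trend (not real index data)
t = np.arange(400)
x = np.sin(2 * np.pi * t / 20) + 0.5 * np.sin(2 * np.pi * t / 100) + 0.01 * t
imfs, residue = emd(x)
reconstructed = np.sum(imfs, axis=0) + residue
```

By construction the extracted components and the residue add back up to the original signal, which is what allows per-component forecasts to be recombined into a forecast of the original series.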

Based on the findings above, these models have 3 major drawbacks: (1) Certain statistical models require specific assumptions about the observations and cannot be applied to datasets that do not satisfy them. (2) Most conventional time series models use late-day stock price as the input variable in forecasting. However, raw stock data contain a significant amount of intricate noise that is caused by changes in market conditions and environments, and traditional time series models that use such complicated raw data have reduced forecasting performance. For this reason, forecasting models should decompose complicated raw data into simpler frequency components and highly correlated variables to improve their forecasting accuracy. (3) ANN is a black box method, and the rules mined from ANNs are not easily understandable [3]. Nevertheless, forecasting rules are useful for investors in buying and selling stocks.

To overcome the drawbacks above, this paper hypothesizes that EMD can decompose complicated raw data (stock index) into simpler frequency components and highly correlated variables, and it incorporates EMD into an AR model to build the primary model. Then, the results of the primary model are refined and optimized by an adaptive network-based fuzzy inference system (ANFIS) that uses fuzzy if-then rules to model the qualitative aspects of human knowledge for applicability to human activities.

Based on the concepts that have been discussed, this paper proposes a hybrid ANFIS model that uses the stock index at time t to forecast the stock index value at time t + 1. First, this approach uses EMD to decompose the original stock index (t) data into IMFs (intrinsic mode functions) and a residue (R). Then, the tendencies of these IMFs and the residue are modeled and forecasted using ANFIS, which can overcome the limitation of statistical methods that the data need to obey some mathematical distribution. Finally, the prediction results are integrated to obtain a final forecasting value. Thus, this study expects that the proposed model can generate significant profits for investors by more accurately forecasting the stock market.
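The decompose, forecast per component, then integrate structure described above can be sketched as follows. This is only an outline under loud assumptions: the components are supplied directly (a known trend and cycle stand in for the IMFs and residue that EMD would produce), and a least-squares linear autoregression stands in for the per-component ANFIS model. All function names and the `lags` parameter are hypothetical.

```python
import numpy as np

def lagged_matrix(series, lags):
    """Rows [x(k), ..., x(k+lags-1)] paired with targets x(k+lags)."""
    n = len(series)
    X = np.column_stack([series[i:n - lags + i] for i in range(lags)])
    y = series[lags:]
    return X, y

def fit_linear(X, y):
    """Least-squares fit with intercept (stand-in for training one ANFIS model)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_next(series, coef, lags):
    """One-step-ahead forecast from the last `lags` observations."""
    return float(np.dot(series[-lags:], coef[:-1]) + coef[-1])

def hybrid_forecast(components, lags=3):
    """Forecast each component separately, then sum the component forecasts."""
    total = 0.0
    for comp in components:
        X, y = lagged_matrix(comp, lags)
        coef = fit_linear(X, y)
        total += predict_next(comp, coef, lags)
    return total

# Synthetic series built from two known components (not real index data)
t = np.arange(200.0)
trend = 0.05 * t + 10.0
cycle = np.sin(2 * np.pi * t / 25)
series = trend + cycle

# Hold out the last point, forecast it from the components of the earlier data
forecast = hybrid_forecast([trend[:-1], cycle[:-1]], lags=3)
actual = series[-1]
```

The summation in the last step only makes sense because the decomposition is additive: the components sum to the original series, so their one-step forecasts sum to a forecast of the series.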

The remainder of the paper is organized as follows. Section 2 describes the related studies. Section 3 briefly presents the proposed model. Section 4 discusses the experiments and makes comparisons. Section 5 presents the findings of the experiment results. Finally, conclusions are made in Section 6.

Related works

This section reviews the relevant studies on various forecasting models for the stock market, empirical mode decomposition, subtractive clustering (Subclust), and ANFIS.

Proposed model

Based on our literature review, there are three major drawbacks of stock forecasting models. To overcome these drawbacks, this paper hypothesizes that EMD can decompose noisy raw data (stock index) into simpler frequency components and highly correlated variables and adapts it to the AR model to build the primary model. Then, the results of the primary model will be refined by ANFIS, which can overcome the limitations of statistical methods (the data need to obey some mathematical distribution) and

Experiments and comparisons

In this section, we perform evaluations and make comparisons, using the RMSE as the evaluation criterion. To verify the proposed model, TAIEX datasets from 2000 to 2006 and Hang Seng index (HSI) datasets from 2000 to 2004 are used as the experiment datasets; each year of the TAIEX and HSI dataset is a subdataset.
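For reference, the RMSE criterion is the square root of the mean squared difference between actual and forecast values. A minimal sketch follows; the function name `rmse` and the sample numbers are illustrative and are not values from the TAIEX or HSI experiments.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(actual))

# Hypothetical values for illustration only
error = rmse([0.0, 0.0], [3.0, 4.0])  # sqrt((9 + 16) / 2)
```

A lower RMSE means the forecasts are closer to the actual values on average, with large errors penalized more heavily than small ones because the differences are squared.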

Each subdataset for the previous 10 months is used for training, and those from November to December are selected for testing. Further, this paper compares the performance of the

Findings

Based on our verification and comparison, the proposed method outperforms the other methods, except for the 2005 TAIEX and HSI datasets. Examining the performance of these models yields 2 important findings:

  • (1) The advantage of the hybrid model:

    According to Tables 1 and 2, the proposed model is superior to the other methods in terms of RMSE, primarily because it takes into account the EMD method with ANFIS learning for stock index forecasting, integrating the

Conclusions

This paper has developed a stock forecasting model by integrating EMD and ANFIS. The main contribution of the paper is that it proposes a novel method and a simple approach for making stable predictions of fluctuating data. The proposed method preprocesses stock index (t) and decomposes the index into more stationary and regular components (IMF or residue) using the EMD technique. Further, the corresponding ANFIS model for each divided component is easier to build. After the IMF components and
