A hybrid ANFIS model based on empirical mode decomposition for stock time series forecasting
Introduction
System modeling is an approach for building a model of a target system for function approximation, optimization, and forecasting, and it has been studied extensively for many years. Time series forecasting is an important application of system modeling. In a time series, time is usually a key variable in making decisions or forecasts. Managers routinely use historical data to forecast variables such as changes in stock prices and product sales. Forecasting future tendencies from a time series therefore relies on detailed data generated in the past to understand how trends change.
Forecasting stock prices is an important topic in finance and a good application for demonstrating the value of time series forecasting. Successful forecasting of stock prices presumably results in substantial monetary rewards. However, modern business and economic activities are inherently dynamic and change frequently, and financial time series are nonlinear and nonstationary. Moreover, stock markets are affected by many intricately related economic and political factors, so it is difficult to forecast the movements of stock prices accurately.
Many traditional time series models have been proposed and applied to economic forecasting. Engle [7] developed the autoregressive conditional heteroscedasticity model, ARCH(p), which has been used by many financial analysts; the GARCH model [1] generalizes ARCH. Box and Jenkins [2] established the autoregressive moving average (ARMA) model, which combines a moving average with a linear difference equation and makes forecasts under linear stationary conditions.
Models that describe homogeneous nonstationary behavior can be obtained by assuming that a suitable difference of the process is stationary. To this end, the autoregressive integrated moving average (ARIMA) model [2], which assumes linearity between variables, was proposed to handle nonstationary datasets. However, conventional time series methods require a large amount of historical data, and the data must follow a normal distribution for the forecasts to be reliable. Further, daily observations are often described with linguistic expressions, which these methods do not handle directly.
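The differencing idea behind ARIMA can be illustrated with a minimal sketch. The helper name `difference` and the toy quadratic series are assumptions made for illustration only; they are not taken from the paper.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times: y'_t = y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


# Hypothetical toy series with a quadratic trend (not data from the paper).
series = [t * t for t in range(6)]   # 0, 1, 4, 9, 16, 25
first = difference(series, d=1)      # still trending: 1, 3, 5, 7, 9
second = difference(series, d=2)     # constant level:  2, 2, 2, 2
```

In this toy case one difference is not enough to remove the trend, whereas the second difference is constant, which is the sense in which a "suitable difference" of a nonstationary process can be treated as stationary.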
Thus, Song and Chissom [29] proposed the original fuzzy time series model, and subsequent researchers focused on its two major processes: (1) fuzzification and (2) establishment of fuzzy relationships and forecasting. In the fuzzification process, the length of the intervals that partition the universe of discourse can affect the forecast, and Huarng proposed distribution-based and average-based lengths to address this issue [11]. In addition, Chen proposed a method [5] in which the length of linguistic intervals is tuned by genetic algorithms. In establishing fuzzy relationships and forecasting, Yu [35] argued that recurrent fuzzy relationships should be considered in forecasting and recommended that different weights be assigned to various fuzzy relationships. Thus, Yu [35] introduced a weighted fuzzy time series method to forecast the TAIEX. To take advantage of the nonlinear capabilities of neural networks, Huarng and Yu [15] chose a neural network to establish the fuzzy relationships in fuzzy time series, which are also nonlinear, but the process of mining fuzzy logical relationships is not easily understood [3].
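The two processes of a fuzzy time series model can be sketched as follows. This is a simplified first-order illustration under stated assumptions: equal-length intervals, crisp assignment of each value to one interval, and a forecast taken as the mean midpoint of the consequent intervals. It is not the exact procedure of any of the cited methods, and the function names and index levels are hypothetical.

```python
from collections import defaultdict


def fuzzify(values, lo, hi, n_intervals):
    """Map each value to the index of the equal-length interval containing it."""
    width = (hi - lo) / n_intervals
    labels = []
    for v in values:
        idx = int((v - lo) / width)
        labels.append(min(max(idx, 0), n_intervals - 1))  # clamp to valid range
    return labels


def relationship_groups(labels):
    """Group first-order relationships A_i -> A_j by their left-hand side."""
    groups = defaultdict(list)
    for cur, nxt in zip(labels, labels[1:]):
        groups[cur].append(nxt)  # repeats are kept, so recurrent rules weigh more
    return dict(groups)


def forecast(label, groups, lo, hi, n_intervals):
    """Forecast as the mean midpoint of the right-hand-side intervals."""
    width = (hi - lo) / n_intervals
    targets = groups.get(label, [label])  # no rule known: stay in the interval
    mids = [lo + (j + 0.5) * width for j in targets]
    return sum(mids) / len(mids)


values = [5010, 5120, 5230, 5150, 5260, 5340]  # made-up index levels
labels = fuzzify(values, lo=5000, hi=5400, n_intervals=4)
groups = relationship_groups(labels)
next_value = forecast(labels[-1], groups, 5000, 5400, 4)
```

Changing `n_intervals` changes both the labels and the forecast, which is why the interval length matters in the fuzzification step.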
Artificial intelligence (AI) is a rapidly growing technology with regard to information processing applications. It has been applied to various disciplines, such as business, engineering, management, science, and the military. In the financial domain, AI can be used to predict stock prices, credit scores, and potential bankruptcy problems. Stock prices change dramatically, which impedes the accurate prediction of stock price volatility.
In the past decade, many studies have mined financial time series data using traditional statistical approaches and data-mining techniques. Many researchers have applied data-mining techniques to financial analysis [12], [19], [20], [21], [22], [23], [25], [27]. Further, many hybrid forecasting techniques have been published recently [9], [10], [24], [26], [28], [32]. ANFIS [17], an AI method, is a hybrid technique that combines the learning ability of an artificial neural network (ANN) with a set of fuzzy if-then rules and appropriate membership functions to map inputs to outputs with a high degree of accuracy [17]. In recent years, ANFIS has been used widely to build nonlinear models of processes and determine input-output relationships. Thus, ANFIS is appropriate for forecasting nonlinear financial time series and generating meaningful rules for devising investment strategies.
This paper studies financial time series, whose characteristics, such as nonlinearity, nonstationarity, and noise in the raw data, should be considered. However, traditional time series models are limited by their linearity and are unsuitable for financial forecasting. Thus, hybrid models are widely used to circumvent these limitations in financial time series forecasting. Empirical mode decomposition (EMD) [14] is well suited to the analysis of nonlinear and nonstationary time series signals [14] and identifies tendencies in financial time series. With EMD, any complicated signal can be decomposed into a finite and often small number of intrinsic mode functions (IMFs) [14], which have simpler frequency components and stronger correlations and are therefore easier to forecast accurately. EMD has been used widely in many fields, such as the analysis of earthquake signals, structure analysis, bridge and construction monitoring [34], sea wave data [13], and the diagnosis of faults in machines [36].
Based on the findings above, these models have 3 major drawbacks: (1) Certain statistical models require specific assumptions about the observations and cannot be applied to datasets that do not satisfy them. (2) Most conventional time series models use late-day stock price as the input variable in forecasting. However, raw stock data contain a significant amount of intricate noise caused by changes in market conditions and environments, so traditional time series models that use complicated raw data have reduced forecasting performance. For this reason, forecasting models should decompose complicated raw data into simpler frequency components and highly correlated variables to improve their forecasting accuracy. (3) ANN is a black-box method, and the rules mined from ANNs are not easily understandable [3], yet forecasting rules are useful to investors when buying and selling stocks.
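To make the combination of fuzzy if-then rules and membership functions concrete, the following is a minimal sketch of the forward pass of a one-input, first-order Sugeno fuzzy system, the kind of rule base that ANFIS tunes. The Gaussian membership functions, the two rules, and all parameter values are assumptions for illustration; the training step of ANFIS is omitted entirely.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_forward(x, rules):
    """Forward pass of a one-input, first-order Sugeno fuzzy system.

    Each rule is (center, sigma, p, r), read as:
    IF x is Gaussian(center, sigma) THEN y = p * x + r.
    """
    strengths = [gaussian(x, c, s) for c, s, _, _ in rules]  # firing strengths
    total = sum(strengths)
    normalized = [w / total for w in strengths]              # normalization
    outputs = [p * x + r for _, _, p, r in rules]            # rule consequents
    return sum(w * y for w, y in zip(normalized, outputs))   # weighted sum


# Hypothetical rule base (parameters chosen by hand, not learned).
rules = [
    (0.0, 2.0, 1.0, 0.0),   # IF x is "low"  THEN y = x
    (10.0, 2.0, 0.0, 1.0),  # IF x is "high" THEN y = 1
]
y_mid = sugeno_forward(5.0, rules)  # both rules fire equally at x = 5
```

Because each rule remains a readable if-then statement, a trained system of this form can be inspected, which is the property contrasted with a black-box ANN.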
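The decomposition into IMFs and a residue can be sketched with a simplified sifting procedure. This is an illustration under loud assumptions, not the reference EMD algorithm: envelopes are piecewise linear rather than cubic splines, end points are pinned to the signal, and a fixed number of sifting passes replaces a proper stopping criterion. All function names and the synthetic signal are hypothetical.

```python
import math


def local_extrema(x):
    """Return indices of interior local maxima and minima."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] <= x[i + 1]]
    return maxima, minima


def envelope(idx, x):
    """Piecewise-linear envelope through the points (i, x[i]) for i in idx."""
    knots = [0] + idx + [len(x) - 1]  # pin both end points (simplification)
    env = []
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b):
            env.append(x[a] + (x[b] - x[a]) * (i - a) / (b - a))
    env.append(x[-1])
    return env


def sift(x, n_sift=10):
    """Extract one IMF candidate by repeatedly removing the mean envelope."""
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = envelope(maxima, h)
        lower = envelope(minima, h)
        h = [v - (u + l) / 2 for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_imfs=5):
    """Decompose x into IMF candidates plus a residue that sum back to x."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema left to build envelopes
        imf = sift(residue)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue


# Synthetic signal: fast oscillation + slow oscillation + linear trend.
signal = [math.sin(0.9 * t) + 2 * math.sin(0.1 * t) + 0.02 * t for t in range(200)]
imfs, residue = emd(signal)
rebuilt = [sum(parts) + r for *parts, r in zip(*imfs, residue)]
```

By construction the components add back to the original signal, so forecasts made per component can later be recombined by summation.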
To overcome the drawbacks above, this paper hypothesizes that EMD can decompose complicated raw data (stock index) into simpler frequency components and highly correlated variables, and it feeds these components into an AR model to build the primary model. Then, the results of the primary model are refined and optimized by an adaptive network-based fuzzy inference system (ANFIS) that uses fuzzy if-then rules to model the qualitative aspects of human knowledge for applicability to human activities.
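As a minimal sketch of an AR model applied to a single component, the following fits an AR(1) model by ordinary least squares and produces a one-step forecast. The order 1, the helper names, and the toy component are assumptions for illustration; the paper's AR order and estimation method are not stated in this section.

```python
def fit_ar1(series):
    """Least-squares fit of y_t = c + phi * y_{t-1}; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def forecast_next(series, c, phi):
    """One-step-ahead forecast for t + 1 from the last observed value."""
    return c + phi * series[-1]


# Toy component generated exactly by y_t = 1 + 0.5 * y_{t-1} (hypothetical).
component = [0.0, 1.0, 1.5, 1.75, 1.875]
c, phi = fit_ar1(component)
prediction = forecast_next(component, c, phi)
```

On this noise-free toy component the fit recovers the generating coefficients, so the forecast continues the recursion exactly; real components would leave a residual error for a later refinement stage to address.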
Based on the concepts that have been discussed, this paper proposes a hybrid ANFIS model that uses the stock index at time t to forecast the stock index at time t + 1. First, this approach uses EMD to decompose the original stock index (t) data into IMFs (intrinsic mode functions) and a residue (R). Then, the tendencies of these IMFs and the residue are modeled and forecasted using ANFIS, which can overcome the limitation of statistical methods that the data must follow a particular mathematical distribution. Finally, the prediction results are integrated to obtain a final forecasting value. Thus, this study expects that the proposed model can generate significant profits for investors by more accurately forecasting the stock market.
The remainder of the paper is organized as follows. Section 2 describes the related studies. Section 3 briefly presents the proposed model. Section 4 discusses the experiments and makes comparisons. Section 5 presents the findings from the experimental results. Finally, Section 6 concludes the paper.
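The decompose, forecast, and integrate structure of the proposed approach can be sketched as follows. Both building blocks are deliberately replaced by simple stand-ins, and this is an assumption of the sketch rather than the paper's method: a trailing moving average stands in for EMD, and a drift forecaster stands in for ANFIS. The names and the made-up linear index are hypothetical.

```python
def moving_average(x, window):
    """Trailing moving average; early points average over what is available."""
    out = []
    for i in range(len(x)):
        chunk = x[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def decompose(x, window=5):
    """Stand-in for EMD: split x into one detail component and a smooth residue."""
    residue = moving_average(x, window)
    detail = [v - r for v, r in zip(x, residue)]
    return [detail], residue


def drift_forecast(component):
    """Stand-in for ANFIS: last value plus the average historical change."""
    step = (component[-1] - component[0]) / (len(component) - 1)
    return component[-1] + step


def hybrid_forecast(x):
    """Decompose, forecast every component for t + 1, then sum the forecasts."""
    components, residue = decompose(x)
    return sum(drift_forecast(c) for c in components + [residue])


index = [100.0 + 2.0 * t for t in range(20)]  # made-up linear "index"
prediction = hybrid_forecast(index)            # next value of the trend
```

Summation is a natural integration step here because the components add back to the original series; with a real decomposition and a nonlinear forecaster per component, the same three-stage structure applies.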
Related works
This section reviews the relevant studies on various forecasting models for the stock market, empirical mode decomposition, subtractive clustering (Subclust), and ANFIS.
Proposed model
Based on our literature review, there are three major drawbacks of stock forecasting models. To overcome these drawbacks, this paper hypothesizes that EMD can decompose noisy raw data (stock index) into simpler frequency components and highly correlated variables and adapts it to the AR model to build the primary model. Then, the results of the primary model will be refined by ANFIS, which can overcome the limitations of statistical methods (data need to obey some mathematical distribution) and
Experiments and comparisons
In this section, we perform evaluations and make comparisons, using the RMSE as the evaluation criterion. To verify the proposed model, TAIEX datasets from 2000 to 2006 and Hang Seng index (HSI) datasets from 2000 to 2004 are used as the experiment datasets; each year of the TAIEX and HSI dataset is a subdataset.
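For reference, the RMSE criterion can be computed as in the following minimal sketch. The function name and the sample values are assumptions for illustration and are not results from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up index values and forecasts (errors of -1, 2, and 0 points).
score = rmse([100.0, 102.0, 101.0], [101.0, 100.0, 101.0])
```

A lower RMSE indicates forecasts that are closer to the observed index on average, with large errors penalized more heavily than small ones.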
For each subdataset, the first 10 months are used for training, and the data from November to December are used for testing. Further, this paper compares the performance of the
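The yearly split described in this section, with January to October for training and November to December for testing, can be sketched as follows. The record layout of (date, value) pairs, the function name, and the sample values are assumptions for illustration.

```python
from datetime import date


def split_by_month(records, last_train_month=10):
    """Split (date, value) records of one calendar year into train and test."""
    train = [(d, v) for d, v in records if d.month <= last_train_month]
    test = [(d, v) for d, v in records if d.month > last_train_month]
    return train, test


# One made-up observation per month of a single year (not real index data).
records = [(date(2004, m, 15), 6000.0 + 10 * m) for m in range(1, 13)]
train, test = split_by_month(records)
```

Splitting by calendar position rather than at random keeps the test period strictly after the training period, which matters for time series evaluation.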
Findings
Based on our verification and comparison, the proposed method outperforms the other methods, except for the 2005 TAIEX and HSI datasets. Examining the performance of these models yields 2 important findings:
(1) The advantage of the hybrid model:
According to Tables 1 and 2, the proposed model is superior to the other methods in terms of RMSE, primarily because it takes into account the EMD method with ANFIS learning for stock index forecasting, integrating the
Conclusions
This paper has developed a stock forecasting model by integrating EMD and ANFIS. The main contribution of the paper is a novel and simple approach for making stable predictions of fluctuating data. The proposed method preprocesses the stock index (t) and decomposes it into more stationary and regular components (IMFs and a residue) using the EMD technique. Further, the corresponding ANFIS model for each decomposed component is easier to build. After the IMF components and
References (36)
Generalized autoregressive conditional heteroscedasticity. J. Econom. (1986)
High-order fuzzy time-series based on multi-period adaptation model for forecasting stock markets. Phys. A (2008)
Forecasting enrollments based on fuzzy time-series. Fuzzy Sets Syst. (1996)
Developing a hybrid artificial intelligence model for outpatient visits forecasting in hospitals. Appl. Soft Comput. (2012)
SVR with hybrid chaotic genetic algorithms for tourism demand forecasting. Appl. Soft Comput. (2011)
Chaos-based support vector regressions for exchange rate forecasting. Expert Syst. Appl. (2010)
The application of neural networks to forecast fuzzy time series. Phys. A (2006)
Financial time series forecasting using support vector machines. Neurocomputing (2003)
Genetic algorithms approach to feature discretization in artificial neural networks for prediction of stock index. Expert Syst. Appl. (2000)
Maximum and minimum stock price forecasting of Brazilian power distribution companies based on artificial neural networks. Appl. Soft Comput. (2015)
Financial time series forecasting using independent component analysis and support vector machine. Decis. Support Syst.
Multiobjective optimization based adaptive models with fuzzy decision making for stock market forecasting. Neurocomputing
Short-term load forecasting using Bayesian neural networks learned by Hybrid Monte Carlo algorithm. Appl. Soft Comput.
Hybrid PSO-SVM method for short-term load forecasting during periods with significant temperature variations in city of Burbank. Appl. Soft Comput.
Derivation of fuzzy control rules from human operator's control actions
The adaptive selection of financial and economic variables for use with artificial neural networks. Neurocomputing
A hybrid artificial intelligence model for river flow forecasting. Appl. Soft Comput.
Weighted fuzzy time-series models for TAIEX forecasting. Phys. A