Predicting the Direction of Stock Market Index Movement Using an Optimized Artificial Neural Network Model

Abstract

In the business sector, it has always been a difficult task to predict the exact daily price of the stock market index; hence, there is a great deal of research being conducted regarding the prediction of the direction of stock price index movement. Many factors such as political events, general economic conditions, and traders’ expectations may have an influence on the stock market index. There are numerous research studies that use similar indicators to forecast the direction of the stock market index. In this study, we compare two basic types of input variables to predict the direction of the daily stock market index. The main contribution of this study is the ability to predict the direction of the next day’s price of the Japanese stock market index by using an optimized artificial neural network (ANN) model. To improve the prediction accuracy of the trend of the stock market index in the future, we optimize the ANN model using genetic algorithms (GA). We demonstrate and verify the predictability of stock price direction by using the hybrid GA-ANN model and then compare the performance with prior studies. Empirical results show that the Type 2 input variables can generate a higher forecast accuracy and that it is possible to enhance the performance of the optimized ANN model by selecting input variables appropriately.

Introduction

The direction of the stock market index refers to the movement of the price index, or the trend of its fluctuation, in the future. Predicting the direction is a practical issue that heavily influences a financial trader’s decision to buy or sell an instrument. Accurate forecasts of the trend of the stock price index can help investors find opportunities for profit in the stock exchange and can therefore be extremely advantageous [1]. Leung et al. [2] hold the view that trading could be made profitable by an accurate prediction of the direction of movement of the stock index. Their work suggested that financial forecasters and traders should focus on accurately predicting the direction of movement so as to minimize the estimates’ deviations from the actual observed values. Mostafa [3] also believes that accurate predictions of the direction of stock price indices are very important for investors. However, the behavior of stock markets depends on many qualitative factors such as political, economic, and natural factors, among many others. The stock markets are dynamic and exhibit wide variation, so predicting them is a highly challenging task because of their highly non-linear nature and complex dimensionality [4, 5]. Forecasting of the financial index is characterized by data intensity, noise, non-stationarity, unstructured nature, high degree of uncertainty, and hidden relationships [6–8].

Previous studies have applied various models to forecast the direction of stock market index movement. Huang et al. [9] forecasted stock market movement using support vector machines (SVM) and concluded that the model was good at predicting the direction. Kara et al. [10] applied an artificial neural network (ANN) and SVM to predict the direction of the Istanbul stock exchange. Their study shows that both models are useful prediction tools and that ANN is significantly better than the SVM model. Şenol and Özturan [11] applied seven different prediction system models to predict the direction of the stock market index in Turkey, concluding that ANN could be one of the most robust techniques for forecasting. The ANN model has been popularly claimed to be a useful technique for stock index prediction because of its ability to capture subtle functional relationships among the empirical data even though the underlying relationships are unknown or hard to describe [12, 13]. ANN has become one of the most popular machine learning methods, and such approaches have been reported to outperform most conventional methods [14–20]. In this study, we attempt to apply an ANN model to forecast the direction of the Japanese stock market index.

The most popular neural network training algorithm for financial forecasting is the back propagation (BP) algorithm, which is also a widely applied classical learning algorithm for neural networks [21–24]. The BP network has been widely used in the area of financial time series forecasting because of its broad applicability to many business problems and its preeminent learning ability [25]. However, many papers have reported that the ANN model trained by the BP algorithm has some limitations in forecasting: it can easily converge to a local minimum because of the tremendous noise and complex dimensionality of the stock market data. In view of these limitations, genetic algorithms (GA) have been proposed to overcome the local convergence issue for nonlinear optimization problems. We attempt to determine the optimal set of weights and biases to enhance the accuracy of the ANN model by using GA.

The main objective of this study is to improve the prediction accuracy of the direction of stock price index movement by using the ANN model. First, we focus on the selection of the input variables for forecasting the future trend of the stock market index. Selecting effective indicators for forecasting the output variable of an ANN model is an important step prior to modeling. In this study, we compare two basic types of input variables that have been widely used in previous studies to predict the direction of the daily stock market index. To evaluate the performance of these two sets of input variables, the Japanese stock market index is used as an illustrative example. In addition, we improve the prediction accuracy by optimizing the learning algorithm of the ANN model. The BP algorithm is a widely applied classical learning algorithm for neural networks. However, it has significant drawbacks that need to be addressed using other training algorithms. In this study, a genetic algorithm (GA) is employed to improve the prediction accuracy of the ANN model and overcome the local convergence problem of the BP algorithm. The empirical results suggest that the proposed method further improves the accuracy of predicting stock market direction in comparison with previous studies.

The remainder of this paper is organized as follows. Section 2 describes the ANN model trained by the BP algorithm and the improvement using the GA. Section 3 plots the data, presents the two basic types of input variables that are used to forecast the direction, and describes the procedure for predicting the stock market direction. Section 4 provides the experimental results and compares them with similar studies. Finally, Section 5 presents the discussion and conclusion. Formulas and a summary of the statistics for each feature of the input variables are given in S1 Appendix.

Prediction Model

Artificial neural network (ANN) model

Funahashi [26] and Hornik et al. [27] have shown that neural networks with sufficient complexity can approximate any unknown function to any desired degree of accuracy with only one hidden layer. Therefore, the ANN model in this study consists of an input layer, a hidden layer, and an output layer, connected in that order. The architecture of the ANN is shown in Fig 1. The input layer corresponds to the input variables. We analyze two basic types of input variables to compare the forecasting accuracy. The hidden layer is used for capturing the nonlinear relationships among variables. In this study, the output layer consists of only one neuron that represents the predicted direction of the daily stock market index.

Fig 1. The architecture of the back propagation neural network.

https://doi.org/10.1371/journal.pone.0155133.g001

Back propagation neural network

The BP algorithm is a widely applied classical learning algorithm for neural networks [22, 23]. As shown in Fig 1, the BP process determines the weights for the connections among the nodes (e.g., W11 denotes the weight between Node 1 of the input layer and Node 1 of the hidden layer) and their biases (e.g., θ1 denotes the bias of Node 1 in the hidden layer) on the basis of the training data. The network weights and biases are assigned initial values first, and the error between the predicted and actual output values is back-propagated via the network for updating the weights and biases repeatedly [28]. When the error is less than a specified value or when the termination criterion is satisfied, training is considered to be completed and the weights and bias values of the network are stored. Detailed descriptions of using the BP algorithm for training the ANN model can be found in Ref. [29].
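The training loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the sigmoid activation, the learning rate, the per-sample (online) update, the hidden-layer size, and the names `train_bp` and `forward` are all assumptions chosen for illustration. The sketch initializes the weights and biases, back-propagates the output error to update them, and stops when the mean squared error falls below a tolerance or the epoch limit is reached.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W, theta, v, b):
    """Return the hidden activations and the single output for input vector x."""
    h = [sigmoid(sum(W[i][j] * x[i] for i in range(len(x))) + theta[j])
         for j in range(len(theta))]
    out = sigmoid(sum(v[j] * h[j] for j in range(len(h))) + b)
    return h, out


def train_bp(X, y, n_hidden=3, lr=0.5, max_epochs=5000, tol=0.01, seed=0):
    """Train a one-hidden-layer network by back propagation (illustrative only)."""
    rng = random.Random(seed)
    n_in = len(X[0])
    # W[i][j]: weight from input node i to hidden node j; theta[j]: hidden bias.
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    theta = [0.0] * n_hidden
    v = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]  # hidden-to-output weights
    b = 0.0                                                # output bias
    history = []                                           # MSE per epoch
    for _ in range(max_epochs):
        sq_err = 0.0
        for x, target in zip(X, y):
            h, out = forward(x, W, theta, v, b)
            err = out - target
            sq_err += err * err
            # Propagate the output error back through the sigmoid derivatives.
            delta_out = err * out * (1.0 - out)
            delta_h = [delta_out * v[j] * h[j] * (1.0 - h[j])
                       for j in range(n_hidden)]
            for j in range(n_hidden):
                v[j] -= lr * delta_out * h[j]
                theta[j] -= lr * delta_h[j]
                for i in range(n_in):
                    W[i][j] -= lr * delta_h[j] * x[i]
            b -= lr * delta_out
        mse = sq_err / len(X)
        history.append(mse)
        if mse < tol:  # termination criterion: error below the specified value
            break
    return (W, theta, v, b), history
```

Calling `train_bp(X, y)` on a small labelled data set returns the stored weights and biases together with the error history, which can be inspected to check that training actually reduced the error.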

Although researchers have commonly trained the ANN model by using the gradient technique of the BP algorithm, limitations of gradient search techniques are more apparent when ANNs are applied to complex nonlinear optimization problems [30]. The BP algorithm has two significant drawbacks, i.e., slowness in convergence and an inability to escape local optima [31]. In view of these limitations, global search techniques, such as GA, are proposed to overcome the local convergence problem for nonlinear optimization problems. In this study, we propose to apply the GA technique to optimize the weights and biases of the ANN model, and then predict the direction of the daily closing price movement of the stock market index.

Improvement using genetic algorithms (GA)

Many studies have used GA-based hybrid models to overcome the drawbacks of the BP approach [32–34]. The results of these studies support the notion that GA can enhance the accuracy of ANN models and can decrease the time required for experimentation [35]. In this study, the GA is used to optimize the initial weights and biases of the ANN model. Subsequently, the ANN model is trained by the BP algorithm using the determined weights and bias values.

Fig 2 shows the procedural flow of the proposed hybrid GA and BP algorithm.

The algorithm consists of the following steps:

  1. Step 1: Considering the wide range of the data, we normalize it so that the values of all the variables are scaled to lie between zero and one. The normalization is carried out as follows: $$R_N = \frac{R - R_{\min}}{R_{\max} - R_{\min}} \qquad (1)$$ where $R$ is a sample value, $R_N$ is the normalized value of $R$, $R_{\min}$ is the minimum value of $R$, and $R_{\max}$ is the maximum value of $R$.
  2. Step 2: Encode all the weights and biases in a string and generate the initial population. Each solution generated from the GA is called a chromosome (or an individual). The collection of chromosomes is called a population. Here each chromosome describes the ANN with a certain set of weights and bias values.
  3. Step 3: Train the ANN model using the BP algorithm and then evaluate each chromosome of the current population using a fitness function based on the MSE (mean squared error) value:
    $$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2 \qquad (2)$$ where $y_t$ denotes the actual value, $\hat{y}_t$ is the predicted value, and $n$ is the number of samples. The value of the fitness function is inversely proportional to the error.
  4. Step 4: Rank all the individuals using the fitness proportion method and select the individuals with a higher fitness value to pass on to the next generation directly.
  5. Step 5: Apply genetic operators (e.g., crossover, mutation) to the current population and create new chromosomes. Evaluate the fitness value of the new chromosomes and insert them into the population to replace the worse individuals of the current population. This yields the new population.
  6. Step 6: Repeat Steps 3–5 until the stop criterion is satisfied.
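The steps above can be sketched in a few lines of code. This is a simplified illustration under stated assumptions, not the authors' implementation: the BP training inside Step 3 is omitted, fitness is taken as the reciprocal of the MSE, selection is fitness-proportional with two elites copied unchanged, crossover is single-point, mutation adds Gaussian noise, and each generation replaces the whole non-elite population rather than only its worst members. All parameter values and the names `predict`, `mse`, and `evolve` are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(chrom, x, n_hidden):
    """Decode a flat chromosome into a one-hidden-layer network and run it on x."""
    n_in = len(x)
    pos = 0
    hidden = []
    for _ in range(n_hidden):
        weights = chrom[pos:pos + n_in]
        bias = chrom[pos + n_in]
        pos += n_in + 1
        hidden.append(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias))
    out_weights = chrom[pos:pos + n_hidden]
    out_bias = chrom[pos + n_hidden]
    return sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) + out_bias)


def mse(chrom, X, y, n_hidden):
    return sum((t - predict(chrom, x, n_hidden)) ** 2
               for x, t in zip(X, y)) / len(X)


def evolve(X, y, n_hidden=3, pop_size=30, generations=60, n_elite=2,
           p_mut=0.1, seed=0):
    rng = random.Random(seed)
    # Step 2: one gene per weight and bias of the network.
    length = n_hidden * (len(X[0]) + 1) + n_hidden + 1
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(length)]
           for _ in range(pop_size)]
    best_per_gen = []
    for _ in range(generations):
        # Step 3 (without BP): rank chromosomes by MSE, lowest error first.
        ranked = sorted(((mse(c, X, y, n_hidden), c) for c in pop),
                        key=lambda pair: pair[0])
        best_per_gen.append(ranked[0][0])
        fits = [1.0 / (err + 1e-12) for err, _ in ranked]  # inverse of the error
        parents = [c for _, c in ranked]
        # Step 4: the fittest individuals pass to the next generation directly.
        new_pop = [c[:] for c in parents[:n_elite]]
        # Step 5: fitness-proportional selection, crossover, and mutation.
        while len(new_pop) < pop_size:
            p1, p2 = rng.choices(parents, weights=fits, k=2)
            cut = rng.randrange(1, length)
            child = p1[:cut] + p2[cut:]
            child = [g + rng.gauss(0.0, 0.5) if rng.random() < p_mut else g
                     for g in child]
            new_pop.append(child)
        pop = new_pop  # Step 6: repeat until the generation limit is reached
    best = min(pop, key=lambda c: mse(c, X, y, n_hidden))
    return best, best_per_gen
```

Because the elites are copied unchanged, the best error recorded per generation never increases. In the hybrid procedure described in the text, the chromosome returned here would serve as the starting weights and biases for BP training.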

Experimental Design

Data

The Nikkei 225 index is the most widely used market index for the Tokyo stock exchange. It includes 225 equally weighted stocks and has been calculated daily since 1950. In this study, we attempt to predict the direction of the daily Nikkei 225 index. The research data used in this study are technical indicators that are calculated from the daily price of the Nikkei 225 index. The total number of samples is 1,707 trading days, from January 2007 to December 2013. The 1,707 daily Nikkei 225 closing cash prices in the data set are plotted in Fig 3. We divide the entire data set into two parts: 78.6% of the data (January 23rd, 2007 to October 18th, 2012) is used for in-sample training and 21.4% (October 19th, 2012 to December 30th, 2013) is treated as out-of-sample data. The in-sample data are used to determine the specifications of the model and its parameters, whereas the out-of-sample data are reserved for the evaluation of the model. The financial data used in this study were obtained from Yahoo Finance.
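A split of this kind keeps the time order intact, so that the model is never evaluated on days that precede its training data. The following is a minimal sketch with toy data; the function name `chronological_split` and the rounding rule for the cut point are assumptions, not details given in the text.

```python
def chronological_split(samples, train_fraction):
    """Split a time-ordered sequence into in-sample and out-of-sample parts.

    No shuffling is done: the first part is used for training and the
    remainder is held out for evaluation.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = round(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Toy example: ten consecutive "trading days" with an 80/20 split.
days = list(range(10))
in_sample, out_of_sample = chronological_split(days, 0.8)
```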

Fig 3. Plots showing the daily Nikkei 225 closing prices from January 23, 2007 to December 30, 2013.

https://doi.org/10.1371/journal.pone.0155133.g003

The original data are normalized before being subjected to the ANN algorithm routine. The goal of linear scaling is to independently normalize each feature component to a specified range. It also ensures that the larger value input attributes do not overwhelm smaller value inputs, which in turn helps decrease prediction errors.
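As a small illustration of scaling each feature independently to the range zero to one, consider the sketch below. The function names are hypothetical, and mapping a constant column to zero is an assumption made only to avoid division by zero.

```python
def min_max_scale(column):
    """Linearly rescale one feature column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: the scaling is undefined, so map everything to 0.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


def scale_features(rows):
    """Scale every feature (column) of a list of samples (rows) independently."""
    columns = list(zip(*rows))
    scaled_columns = [min_max_scale(col) for col in columns]
    return [list(row) for row in zip(*scaled_columns)]


# Two features with very different magnitudes end up on the same scale.
samples = [[1, 100], [2, 300], [3, 200]]
scaled = scale_features(samples)
```

After scaling, the second feature no longer dominates the first merely because its raw values are a hundred times larger.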

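The prediction performance is evaluated using the hit ratio, defined as

$$\text{Hit ratio} = \frac{1}{n}\sum_{i=1}^{n} P_i \qquad (3)$$

where $P_i$ is the prediction result for the $i$th trading day, which is defined by Eq 4, and $n$ denotes the number of test samples.

$$P_i = \begin{cases} 1, & \text{if } (y_{t+1} - y_t)(\hat{y}_{t+1} - y_t) > 0 \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$

Here $y_t$ denotes the actual closing value of the stock index and $\hat{y}_t$ the predicted value on trading day $t$, with $t$ indexing the $i$th test sample.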
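The hit ratio can be computed with a short routine such as the sketch below. It assumes that a prediction counts as a hit when the predicted move and the actual move from the current day's actual close have the same sign; the function name `hit_ratio` and the sample values are purely illustrative.

```python
def hit_ratio(actual, predicted):
    """Fraction of days on which the predicted direction matches the actual one.

    actual[t] is the actual closing value on day t and predicted[t] is the
    forecast for day t. Both moves are measured from the actual close on day t.
    """
    n = len(actual) - 1
    if n < 1 or len(predicted) != len(actual):
        raise ValueError("need two equally long series with at least two days")
    hits = 0
    for t in range(n):
        actual_move = actual[t + 1] - actual[t]
        predicted_move = predicted[t + 1] - actual[t]
        if actual_move * predicted_move > 0:  # same sign: direction was right
            hits += 1
    return hits / n


# Illustrative values only: two of the three directions are predicted correctly.
actual = [100, 102, 101, 105]
predicted = [100, 101, 103, 104]
ratio = hit_ratio(actual, predicted)
```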

Input variables

In light of previous studies, it is hypothesized that various technical indicators may be used as input variables in the construction of prediction models to forecast the direction of movement of the stock price index [36]. Most financial managers and investors agree on the efficiency of technical indicators and exploit them as a signal for forecasting future trends. On the basis of reviews by domain experts and prior studies, we note that most researchers choose the input variables shown in Table 1 (Type 1), whereas a few others use the Type 2 input variables shown in Table 2. Tables 1 and 2 list the selected features and their formulas, and we select these technical indicators as the feature subsets based on reviews of prior research studies [11, 37, 38]. One of the aims of this study is to conduct experiments using the ANN model with the two types of input variables, and then compare the performance of these two experiments with prior studies.

By reviewing previously published studies [7, 10, 11, 25, 39], we set 13 technical indicators as the Type 1 feature subset and 9 technical indicators as the Type 2 feature subset. Tables 1 and 2 show these indicators and their formulas, and a summary of the statistics for each feature of the two types is presented in S1 Appendix. The technical indicators of both types are commonly used to predict future trends, and they are derived from the real stock composite index.

Table 1. Selected technical indicators and their formulas (Type 1).

https://doi.org/10.1371/journal.pone.0155133.t001

Table 2. Selected technical indicators and their formulas (Type 2).

https://doi.org/10.1371/journal.pone.0155133.t002

Prediction process

Once the real stock composite index data have been collected, we feed the two types of input variables into the optimized ANN model to forecast the future direction of the stock market. The prediction process is as follows. First, we calculate all the indicators for the two types of input variables. Then, we normalize the data to decrease the experimental errors. Before entering the data into the ANN model, we optimize all the weights and biases of the ANN model using the GA. After that, we apply the two types of indicators to predict the direction of the next day’s movement with the GA-ANN hybrid model. Once all the experiments are finished, the performance of the two types of input variable sets is compared with prior reports.

Experimental Results

Comparison of the performances between the two types of input variables

For training the GA-ANN hybrid model, we used the in-sample data. In this section, we test the performance of the two sets of input variables by using out-of-sample data, which includes 300 data points. The hybrid model requires a number of parameters that can influence the performance of the model, and these parameters are described here in Table 3.

Table 3. Description of parameters that are used in the hybrid model.

https://doi.org/10.1371/journal.pone.0155133.t003

First, we conducted experiments based on the initial parameter settings listed in Table 3. Then, we tested the performance of the two types of indicators under different parameter combinations of the GA-ANN hybrid model. Table 4 shows the best performance for each type of input variables. The hit ratio denotes the percentage of trials in which the predicted direction was correct. From Table 4, we observe that the best hit ratio is 60.87% for Type 1 input variables and 81.27% for Type 2 input variables. We conclude that Type 2 input variables are more effective than Type 1 input variables in predicting the direction of the daily closing price of the Nikkei 225 index. We infer that the prediction performance of the ANN model with Type 2 input variables is useful for investors and makes the model a good candidate for predicting the direction of the next day’s closing price.

Table 4. Comparison of the hit ratio between the two types of input variables.

https://doi.org/10.1371/journal.pone.0155133.t004

Comparison with similar studies

Predicting the direction of the stock market index is an important topic for most investors, and many recent studies focus on the prediction of these movements. Table 5 lists some of these prior studies, which aim to predict the direction of stock market indices using various methods, and compares the results of this study with them.

Table 5. Comparison of our study with prior research reports.

https://doi.org/10.1371/journal.pone.0155133.t005

According to Table 5, the prediction accuracy varies considerably across studies, and our model achieves the highest accuracy among the listed models. Thus, we consider the set of input indicators and the GA adopted in this study to be more appropriate for prediction.

Many researchers in previous studies have compared ANN with SVM. For example, Kim [25] applied SVM to predict the stock price index and compared it with back-propagation neural networks. That study shows that SVM outperforms BP neural networks in financial time-series forecasting. We suppose that researchers usually focus on the parameter selection of BP neural networks when they compare them with other models. If the selection of input variables is combined with the optimal adjustment of the weights and biases of the ANN model, the optimized ANN model may still provide a promising alternative for stock market prediction.

Conclusion

In this study, we applied two types of technical indicators to predict the direction of next day’s Nikkei 225 index movement. We adjusted the weights and biases of the ANN model using the GA algorithm and then tested the performance of the GA-ANN hybrid model by applying these two types of input variables and comparing the predictions with actual data. The experiments revealed that Type 2 input variables can provide better performance and the hit ratio for predicting the direction is 81.27%. We also compared the performance of the GA-ANN hybrid model with similar studies and the results showed that our method was more effective and resulted in higher prediction accuracy.

However, the prediction performance of this study may be improved further in three ways. First, the two types of input indicators could be combined, or a subset of these variables could be tested. In addition, a few other variables that may affect the prediction performance could be included. Second, optimization methods other than the GA may also be used to adjust the parameters of the ANN model. We may even use models based on probabilistic neural networks for predicting the movement of the stock index. Lastly, we could propose an investment strategy (portfolio) based on the prediction outcomes of this study for future research, practical use, and further validation.

Author Contributions

Conceived and designed the experiments: MYQ YS. Performed the experiments: MYQ. Analyzed the data: MYQ YS. Contributed reagents/materials/analysis tools: MYQ. Wrote the paper: MYQ YS.

References

  1. Gholamiangonabadi D, Mohseni Taheri SD, Mohammadi A, Menhaj MB, editors. Investigating the performance of technical indicators in electrical industry in Tehran's Stock Exchange using hybrid methods of SRA, PCA and Neural Networks. Therm Power Plants IEEE 2014;2014:75–82.
  2. Leung MT, Daouk H, Chen A. Forecasting stock indices: a comparison of classification and level estimation models. Int J Forecast. 2000;16(2):173–190.
  3. Mostafa MM. Forecasting stock exchange movements using neural networks: Empirical evidence from Kuwait. Expert Syst Appl. 2010;37(9):6302–6309.
  4. Guresen E, Kayakutlu G, Daim TU. Using artificial neural network models in stock market index prediction. Expert Syst Appl. 2011;38(8):89–97.
  5. Lee T, Chiu C. Neural network forecasting of an opening cash price index. Int J Syst Sci. 2002;33(3):29–37.
  6. Khan MAI. Financial volatility forecasting by nonlinear support vector machine heterogeneous autoregressive model: evidence from Nikkei 225 stock index. Int J Econ Finance. 2014;3(4):138–150.
  7. Tay FE, Cao L. Application of support vector machines in financial time series forecasting. Omega. 2001;29(4):309–317.
  8. Hall J. Adaptive selection of US stocks with neural nets, trading on the edge: Neural, genetic, and fuzzy systems for chaotic financial markets. 1st ed. Wiley; 1994.
  9. Huang W, Nakamori Y, Wang S. Forecasting stock market movement direction with support vector machine. Comput Oper Res. 2005;32(10):13–22.
  10. Kara Y, Boyacioglu MA, Baykan ÖK. Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange. Expert Syst Appl. 2011;38(5):11–19.
  11. Şenol D, Özturan M. Stock price direction prediction using artificial neural network approach: the case of Turkey. J Artif Intell. 2008;1(2):70–77.
  12. Vellido A, Lisboa PJ, Vaughan J. Neural networks in business: a survey of applications (1992–1998). Expert Syst Appl. 1999;17(1):51–70.
  13. Zhang G, Patuwo BE, Hu MY. Forecasting with artificial neural networks: The state of the art. Int J Forecast. 1998;14(1):35–62.
  14. Fernando FR, Christian GM, Simon SR. On the profitability of technical trading rules based on artificial neural networks: Evidence from the Madrid stock market. Econ Lett. 2000;69(1):89–94.
  15. Lu C. Integrating independent component analysis-based denoising scheme with neural network for stock price prediction. Expert Syst Appl. 2010;37(10):56–64.
  16. Versace M, Bhatt R, Hinds O, Shiffer M. Predicting the exchange traded fund DIA with a combination of genetic algorithms and neural networks. Expert Syst Appl. 2004;27(3):17–25.
  17. Specht DF. A general regression neural network. IEEE Trans Neural Netw. 1991;2(6):68–76.
  18. Wang Z, Wang L, Szolnoki A, Perc M. Evolutionary games on multilayer networks: a colloquium. Eur Phys J B. 2015;88(5):1–15.
  19. Wang Z, Kokubo S, Jusup M, Tanimoto J. Universal scaling for the dilemma strength in evolutionary games. Phys Life Rev. 2015;14:1–30. pmid:25979121
  20. Wang Z, Zhao DW, Wang L, Sun GQ, Jin Z. Immunity of multiplex networks via acquaintance vaccination. Europhys Lett. 2015;112(4):48002–48007.
  21. Jo T. VTG schemes for using back propagation for multivariate time series prediction. Appl Soft Comput. 2013;13(5):692–702.
  22. Sexton RS, Gupta JN. Comparative evaluation of genetic algorithm and backpropagation for training neural networks. Inf Sci. 2000;129(1):45–59.
  23. Werbos PJ. The roots of backpropagation: from ordered derivatives to neural networks and political forecasting. 1st ed. John Wiley & Sons; 1994.
  24. Wang Z, Andrews MA, Wu ZX, Wang L, Bauch CT. Coupled disease–behavior dynamics on complex networks: A review. Phys Life Rev. 2015;15:1–29. pmid:26211717
  25. Kim KJ. Financial time series forecasting using support vector machines. Neurocomputing. 2003;55(1):307–319.
  26. Funahashi K. On the approximate realization of continuous mappings by neural networks. Neural Netw. 1989;2(3):183–192.
  27. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989;2(5):359–366.
  28. Wang YH. Nonlinear neural network forecasting model for stock index option price: Hybrid GJR–GARCH approach. Expert Syst Appl. 2009;36(1):64–70.
  29. Chauvin Y, Rumelhart DE. Backpropagation: theory, architectures, and applications. 1st ed. Psychology Press; 1995.
  30. Salchenberger LM, Cinar E, Lash NA. Neural networks: A new tool for predicting thrift failures. Decis Sci. 1992;23(4):899–916.
  31. Lee Y, Oh S, Kim M. The effect of initial weights on premature saturation in back propagation learning. Int Jt Conf Neural Netw. 1991;1:65–70.
  32. Montana DJ, Davis L. Training feedforward neural networks using genetic algorithms. Int Jt Conf Artif Intell. 1989;89:762–767.
  33. Kim K, Han I. Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index. Expert Syst Appl. 2000;19(2):125–132.
  34. Nair BB, Sai SG, Naveen A, Lakshmi A, Venkatesh G, Mohandas V. A GA-artificial neural network hybrid system for financial time series forecasting. Inf Techno Mob Commun. 2011;147:499–506.
  35. Jadav K, Panchal M. Optimizing weights of artificial neural networks using genetic algorithms. Int J Adv Res Comput Sci Electron Eng. 2012;1(10):47–51.
  36. Saad EW, Prokhorov DV, Wunsch DC. Comparative study of stock trend prediction using time delay, recurrent and probabilistic neural networks. IEEE Trans Neural Netw. 1998;9(6):56–70.
  37. Huang C, Tsai C. A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting. Expert Syst Appl. 2009;36(2):29–39.
  38. Armano G, Marchesi M, Murru A. A hybrid genetic-neural architecture for stock indexes forecasting. Inf Sci. 2005;170(1):3–33.
  39. Yixin Z, Zhang J. Stock data analysis based on BP neural network. Int Conf Commun Softw Netw. 2010;39:6–9.