Published in: Soft Computing 5/2015

01-05-2015 | Methodologies and Application

A critical feature extraction by kernel PCA in stock trading model

Authors: Pei-Chann Chang, Jheng-Long Wu


Abstract

This paper presents a kernel-based principal component analysis (kernel PCA) that extracts critical features to improve the performance of a stock trading model. Feature extraction is one technique for solving dimensionality reduction problems (DRP). Kernel PCA is a feature extraction approach that transforms known variables to capture critical information; as a kernel-based data mapping tool, it combines the characteristics of principal component analysis and non-linear mapping. Feature selection is another DRP technique: it selects only a small set of features from the known variables, but the selected features may still be collinear and therefore fail to convey clear information. Most feature extraction methods, by contrast, use a variable mapping to eliminate noisy and collinear variables. In this research, we use kernel PCA in a stock trading model to transform stock technical indices (TI) into features of smaller dimension. The method is applied to various stocks and evaluated with sliding-window testing under both half-year and 1-year testing strategies. The experimental results show that the proposed method generates more profit than other DRP methods on the American stock market. The stock trading model is practical for real-world application and can be implemented in a real-time environment.
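The transformation described in the abstract can be illustrated with a minimal kernel PCA sketch in Python with NumPy. It is not the authors' implementation: the RBF kernel, the value of `gamma`, the number of components, the random matrix standing in for standardized technical indices, and the 40/20 train/test split standing in for one sliding-window step are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, Y, gamma):
    """RBF kernel matrix between the rows of X and the rows of Y (assumed kernel)."""
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kernel_pca_fit(X, n_components, gamma):
    """Fit kernel PCA on a training window X (rows = days, columns = indices)."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    # Centre the kernel matrix, i.e. centre the data in feature space.
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, V = eigvals[order], eigvecs[:, order]
    # Scale coefficients so each principal axis has unit norm in feature space.
    alphas = V / np.sqrt(lam)
    return {"X": X, "K": K, "alphas": alphas, "gamma": gamma}


def kernel_pca_transform(model, X_new):
    """Project new rows onto the components learned from the training window."""
    K = model["K"]
    K_new = rbf_kernel(X_new, model["X"], model["gamma"])
    # Centre the new kernel rows with the training-window statistics.
    Kc = (K_new
          - K_new.mean(axis=1, keepdims=True)
          - K.mean(axis=0)[None, :]
          + K.mean())
    return Kc @ model["alphas"]


# Synthetic stand-in for 60 days of 8 standardized technical indices.
rng = np.random.default_rng(0)
indices = rng.normal(size=(60, 8))
train, test = indices[:40], indices[40:]   # one hypothetical sliding-window step

model = kernel_pca_fit(train, n_components=3, gamma=1.0 / 8)
Z_train = kernel_pca_transform(model, train)   # 3 extracted features per day
Z_test = kernel_pca_transform(model, test)
print(Z_train.shape, Z_test.shape)
```

The extracted training features are mutually uncorrelated by construction, which is the property the abstract contrasts with feature selection, where the retained variables may remain collinear. In a sliding-window evaluation the fit would be repeated on each training window before transforming the following test window.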


Metadata
Title
A critical feature extraction by kernel PCA in stock trading model
Authors
Pei-Chann Chang
Jheng-Long Wu
Publication date
01-05-2015
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 5/2015
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-014-1350-5
