Published in: International Journal of Data Science and Analytics 4/2019

31 August 2018 | Regular Paper

Batch and incremental dynamic factor machine learning for multivariate and multi-step-ahead forecasting

Authors: Jacopo De Stefani, Yann-Aël Le Borgne, Olivier Caelen, Dalila Hattab, Gianluca Bontempi


Abstract

Most multivariate forecasting methods in the literature are restricted to low-dimensional vector time series, linear methods and short horizons. The big data revolution is instead shifting the focus to problems (e.g., arising from IoT technology) characterized by very large dimension, nonlinearity and long forecasting horizons. This paper discusses and compares a set of state-of-the-art methods that could be promising in tackling such challenges. It also proposes DFML, a machine-learning version of the dynamic factor model, a successful forecasting methodology well known in econometrics. The DFML strategy is based on an out-of-sample selection of the nonlinear forecaster, the number of latent components and the multi-step-ahead strategy. We discuss both a batch and an incremental version of DFML, and we show that it can consistently outperform state-of-the-art methods in a number of synthetic and real forecasting tasks.
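
To make the dynamic factor idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes PCA (computed by power iteration) as the dimensionality reduction, a least-squares AR(1) applied iteratively as a linear stand-in for the nonlinear forecaster, and a single hold-out window for the out-of-sample choice of the number of latent components. All function names (`center`, `top_components`, `dfm_forecast`, `select_k`, `simulate`) and the simulated data are hypothetical. DFML as described above also selects the forecaster and the multi-step-ahead strategy out of sample; this sketch selects only the number of components.

```python
import math
import random


def center(X):
    """Column means of X (a list of rows) and the centred copy."""
    n = len(X[0])
    mu = [sum(row[j] for row in X) / len(X) for j in range(n)]
    return mu, [[row[j] - mu[j] for j in range(n)] for row in X]


def top_components(Xc, k, iters=200):
    """First k principal directions by power iteration with deflation."""
    n = len(Xc[0])
    C = [[sum(r[i] * r[j] for r in Xc) / len(Xc) for j in range(n)]
         for i in range(n)]  # sample covariance matrix
    rng = random.Random(0)
    W = []
    for _ in range(k):
        w = [rng.gauss(0, 1) for _ in range(n)]
        for _ in range(iters):
            v = [sum(C[i][j] * w[j] for j in range(n)) for i in range(n)]
            for u in W:  # keep v orthogonal to the directions already found
                d = sum(a * b for a, b in zip(v, u))
                v = [a - d * b for a, b in zip(v, u)]
            norm = math.sqrt(sum(a * a for a in v)) or 1.0
            w = [a / norm for a in v]
        W.append(w)
    return W


def ar1_forecast(z, h):
    """Iterated h-step forecast from a least-squares AR(1) without intercept.
    Assumption: a linear stand-in for the nonlinear forecaster of the paper."""
    den = sum(a * a for a in z[:-1]) or 1.0
    phi = sum(a * b for a, b in zip(z[:-1], z[1:])) / den
    out, last = [], z[-1]
    for _ in range(h):
        last = phi * last
        out.append(last)
    return out


def dfm_forecast(X, k, h):
    """Project onto k factors, forecast each factor, map back to n series."""
    mu, Xc = center(X)
    W = top_components(Xc, k)
    Z = [[sum(a * b for a, b in zip(r, w)) for r in Xc] for w in W]
    Zhat = [ar1_forecast(z, h) for z in Z]  # k factor forecasts of length h
    return [[mu[j] + sum(Zhat[c][s] * W[c][j] for c in range(k))
             for j in range(len(mu))] for s in range(h)]


def select_k(X, candidates, h):
    """Out-of-sample selection: hold out the last h rows and keep the k
    whose h-step forecast has the lowest mean squared error on them."""
    train, test = X[:-h], X[-h:]

    def mse(k):
        P = dfm_forecast(train, k, h)
        return sum((p - t) ** 2 for pr, tr in zip(P, test)
                   for p, t in zip(pr, tr)) / (h * len(test[0]))

    return min(candidates, key=mse)


def simulate(T, seed=0):
    """Hypothetical data: five series driven by one latent AR(1) factor."""
    rng = random.Random(seed)
    loadings = [1.0, -0.5, 0.8, 0.3, -1.2]
    f, X = 0.0, []
    for _ in range(T):
        f = 0.8 * f + rng.gauss(0, 1)
        X.append([l * f + rng.gauss(0, 0.1) for l in loadings])
    return X
```

For example, `dfm_forecast(simulate(120), k=2, h=4)` returns four rows of five forecasts each, and `select_k(simulate(120), [1, 2, 3], h=4)` returns the candidate with the smallest hold-out error. A rolling-origin evaluation over several windows would give a less noisy selection than the single window used here.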


Footnotes
1
In the rest of the paper, we assume \(d=0\) for the sake of simplicity.
 
2
In batch PCA, the components are recomputed at each instant t, while in incremental PCA they are updated sequentially (after initialization with a small data subset).
 
3
The experimental sessions can be reproduced by means of the R code available in the GitHub repository https://github.com/gbonte/panel.
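
The distinction drawn in footnote 2 between batch and incremental PCA can be illustrated with a small sketch. It is written in Python rather than the R of the repository above, and it is an assumption-laden illustration: the batch variant recomputes the leading principal direction from all data by power iteration, while the incremental variant uses Oja's rule, chosen here only as one well-known sequential update and not necessarily the one used in the paper. The data are assumed to be zero-mean, and all function names and the simulated stream are hypothetical.

```python
import math
import random


def normalize(w):
    """Scale w to unit Euclidean norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(a * a for a in w)) or 1.0
    return [a / norm for a in w]


def batch_first_pc(X, iters=100):
    """Leading principal direction of zero-mean rows X, recomputed from
    scratch with power iteration on X^T X."""
    n = len(X[0])
    w = normalize([1.0] * n)  # fails only if this start is orthogonal to the answer
    for _ in range(iters):
        scores = [sum(r[j] * w[j] for j in range(n)) for r in X]
        w = normalize([sum(s * r[j] for s, r in zip(scores, X))
                       for j in range(n)])
    return w


def oja_update(w, x, lr=0.01):
    """One sequential update of the direction w from a single new sample x
    (Oja's rule followed by renormalisation)."""
    y = sum(a * b for a, b in zip(w, x))
    return normalize([a + lr * y * (b - y * a) for a, b in zip(w, x)])


def incremental_first_pc(X, n_init=20):
    """Initialise on a small subset, then update once per new observation."""
    w = batch_first_pc(X[:n_init])
    for x in X[n_init:]:
        w = oja_update(w, x)
    return w


def make_stream(T, seed=0):
    """Hypothetical zero-mean 2-D stream dominated by the direction (2, 1)."""
    rng = random.Random(seed)
    out = []
    for _ in range(T):
        g, e = rng.gauss(0, 1), rng.gauss(0, 0.1)
        out.append([2 * g - e, g + 2 * e])
    return out
```

Calling `batch_first_pc(X)` at every instant costs a full pass over all data seen so far, whereas `oja_update` touches only the newest sample. On a stream such as `make_stream(500)`, the two estimates should point in nearly the same direction, up to sign.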
 
Metadata
Title
Batch and incremental dynamic factor machine learning for multivariate and multi-step-ahead forecasting
Authors
Jacopo De Stefani
Yann-Aël Le Borgne
Olivier Caelen
Dalila Hattab
Gianluca Bontempi
Publication date
31 August 2018
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics / Issue 4/2019
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-018-0150-x
