Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks

ABSTRACT
Multivariate time series forecasting is an important machine learning problem across many domains, including prediction of solar plant energy output, electricity consumption, and traffic conditions. Temporal data arising in these real-world applications often involve a mixture of long-term and short-term patterns, for which traditional approaches such as autoregressive models and Gaussian processes may fail. In this paper, we propose a novel deep learning framework, the Long- and Short-term Time-series network (LSTNet), to address this open challenge. LSTNet uses a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) to extract short-term local dependency patterns among variables and to discover long-term patterns in time series trends. Furthermore, we leverage a traditional autoregressive model to tackle the scale-insensitivity problem of neural network models. In our evaluation on real-world data with complex mixtures of repetitive patterns, LSTNet achieved significant performance improvements over several state-of-the-art baseline methods. All the data and experiment code are available online.
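The abstract describes LSTNet's additive structure: a nonlinear neural prediction (CNN features fed into a recurrent summary) combined with a linear autoregressive "highway" over the raw inputs, which restores sensitivity to the input scale. The following is a minimal numpy-only sketch of that composition, not the authors' implementation: the convolution kernel, EMA stand-in for the GRU, and the uniform AR weights are all illustrative placeholders that would be learned in the real model.

```python
import numpy as np

rng = np.random.default_rng(0)

T, n_vars = 100, 3                      # time steps, number of series
q = 5                                   # AR highway window
X = rng.normal(size=(T, n_vars)).cumsum(axis=0)  # toy multivariate series


def conv1d_time(x, kernel):
    """CNN-like short-term feature extraction: a 1-D 'valid' convolution
    over time, applied independently to each variable."""
    return np.stack(
        [np.convolve(x[:, j], kernel, mode="valid") for j in range(x.shape[1])],
        axis=1,
    )


def ema(x, alpha=0.1):
    """Recurrent-like long-term summary: exponential moving average used
    here as a simple stand-in for a GRU hidden state."""
    h = np.zeros(x.shape[1])
    for t in range(x.shape[0]):
        h = (1 - alpha) * h + alpha * x[t]
    return h


# Nonlinear neural part: local smoothing + squashing, then recurrent summary.
short_feat = np.tanh(conv1d_time(X, np.ones(3) / 3))
neural_pred = ema(short_feat)

# Linear AR part: weighted sum of the last q raw observations per variable.
w_ar = np.ones(q) / q                   # placeholder AR weights
ar_pred = X[-q:].T @ w_ar

# LSTNet-style additive combination of the two components.
y_hat = neural_pred + ar_pred
print(y_hat.shape)                      # one forecast per variable: (3,)
```

Because the AR term is linear in the raw inputs, scaling the input series scales this component of the forecast proportionally, which is the intuition behind using it to counter the scale insensitivity of the saturating nonlinear part.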