research-article
DOI: 10.1145/3209978.3210006

Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks

Published: 27 June 2018

ABSTRACT

Multivariate time series forecasting is an important machine learning problem across many domains, including the prediction of solar plant energy output, electricity consumption, and traffic congestion. Temporal data arising in these real-world applications often involve a mixture of long-term and short-term patterns, for which traditional approaches such as autoregressive models and Gaussian processes may fail. In this paper, we propose a novel deep learning framework, the Long- and Short-term Time-series network (LSTNet), to address this open challenge. LSTNet uses a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) to extract short-term local dependency patterns among variables and to discover long-term patterns in time series trends. Furthermore, we leverage a traditional autoregressive model to tackle the scale-insensitivity problem of the neural network components. In our evaluation on real-world data with complex mixtures of repetitive patterns, LSTNet achieved significant performance improvements over several state-of-the-art baseline methods. All the data and experiment code are available online.
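The scale-insensitivity fix mentioned in the abstract can be illustrated with a minimal sketch: the idea is to add a linear autoregressive "highway" path to the nonlinear (CNN/RNN) output, so the final forecast tracks the raw scale of the inputs. The function name, toy data, and moving-average weights below are illustrative assumptions for this sketch, not the paper's trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def ar_forecast(X, w, b):
    """Linear autoregressive head: predict each variable from its own
    last len(w) observations, with weights shared across variables."""
    window = len(w)
    return w @ X[-window:, :] + b  # shape: (n_vars,)

# Toy multivariate series: 50 time steps, 2 variables.
X = rng.normal(size=(50, 2))

# Hypothetical AR parameters (in LSTNet these would be learned jointly
# with the neural part); here: a simple moving average over a window of 3.
w = np.full(3, 1.0 / 3.0)
b = 0.0

# Stand-in for the nonlinear CNN/RNN output; the final forecast is the
# sum of the nonlinear part and the linear AR part.
nonlinear_part = np.zeros(2)
y_hat = nonlinear_part + ar_forecast(X, w, b)
print(y_hat.shape)  # (2,)
```

Because the AR path is purely linear in the recent observations, rescaling the input series rescales this component of the forecast by the same factor, which is the scale sensitivity the neural part alone lacks.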


Published in

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
June 2018, 1509 pages
ISBN: 9781450356572
DOI: 10.1145/3209978

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

SIGIR '18 paper acceptance rate: 86 of 409 submissions (21%). Overall acceptance rate: 792 of 3,983 submissions (20%).
