research-article
DOI: 10.1145/3209978.3210006

Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks

Published: 27 June 2018

ABSTRACT

Multivariate time series forecasting is an important machine learning problem across many domains, including the prediction of solar plant energy output, electricity consumption, and traffic congestion. Temporal data arising in these real-world applications often involve a mixture of long-term and short-term patterns, for which traditional approaches such as autoregressive models and Gaussian processes may fail. In this paper, we propose a novel deep learning framework, the Long- and Short-term Time-series network (LSTNet), to address this open challenge. LSTNet uses a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) to extract short-term local dependency patterns among variables and to discover long-term patterns in time series trends. Furthermore, we leverage a traditional autoregressive model to tackle the scale-insensitivity problem of the neural network components. In our evaluation on real-world data with complex mixtures of repetitive patterns, LSTNet achieved significant performance improvements over several state-of-the-art baseline methods. All the data and experiment code are available online.
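The scale-insensitivity fix mentioned in the abstract can be illustrated with a minimal sketch: the idea is to add a linear autoregressive "highway" path to the nonlinear (CNN/RNN) output, so the final forecast tracks the raw scale of the inputs. The function name, toy data, and moving-average weights below are illustrative assumptions for this sketch, not the paper's trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def ar_forecast(X, w, b):
    """Linear autoregressive head: predict each variable from its own
    last len(w) observations, with weights shared across variables."""
    window = len(w)
    return w @ X[-window:, :] + b  # shape: (n_vars,)

# Toy multivariate series: 50 time steps, 2 variables.
X = rng.normal(size=(50, 2))

# Hypothetical AR parameters (in LSTNet these would be learned jointly
# with the neural part); here: a simple moving average over a window of 3.
w = np.full(3, 1.0 / 3.0)
b = 0.0

# Stand-in for the nonlinear CNN/RNN output; the final forecast is the
# sum of the nonlinear part and the linear AR part.
nonlinear_part = np.zeros(2)
y_hat = nonlinear_part + ar_forecast(X, w, b)
print(y_hat.shape)  # (2,)
```

Because the AR path is purely linear in the recent observations, rescaling the input series rescales this component of the forecast by the same factor, which is the scale sensitivity the neural part alone lacks.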


Published in

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
June 2018, 1509 pages
ISBN: 9781450356572
DOI: 10.1145/3209978

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

SIGIR '18 paper acceptance rate: 86 of 409 submissions (21%). Overall acceptance rate: 792 of 3,983 submissions (20%).
