
Explainable Multivariate Time Series Classification: A Deep Neural Network Which Learns to Attend to Important Variables As Well As Time Intervals

Authors Info & Claims
Published:08 March 2021Publication History

ABSTRACT

Many real-world applications, e.g., healthcare, present multivariate time series prediction problems. In such settings, model transparency and explainability are paramount in addition to predictive accuracy. We consider the problem of building explainable classifiers from multivariate time series data. A key criterion for understanding such predictive models is the ability to elucidate and quantify the contribution of time-varying input variables to the classification. Hence, we introduce a novel, modular, convolution-based feature extraction and attention mechanism that simultaneously identifies the variables as well as the time intervals that determine the classifier output. We present results of extensive experiments on several benchmark data sets showing that the proposed method outperforms state-of-the-art baseline methods on the multivariate time series classification task. The results of our case studies demonstrate that the variables and time intervals identified by the proposed method are consistent with available domain knowledge.
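
The abstract describes the method only at a high level: per-variable feature extraction by convolutions, combined with attention over both variables and time intervals, so that the attention weights themselves indicate which inputs drive the prediction. The following sketch is a minimal, hypothetical PyTorch illustration of that general idea, not the authors' actual architecture; every module, layer size, and parameter name here is an assumption made for illustration.

```python
import torch
import torch.nn as nn

class VariableTimeAttentionSketch(nn.Module):
    """Hypothetical sketch: per-variable 1-D convolutions, attention over
    time steps, attention over variables, then a linear classifier.
    Illustrates the general idea only; not the paper's architecture."""

    def __init__(self, n_vars, n_classes, hidden=32, kernel=3):
        super().__init__()
        self.hidden = hidden
        # Grouped convolution extracts features for each variable independently.
        self.conv = nn.Conv1d(n_vars, n_vars * hidden, kernel_size=kernel,
                              padding=kernel // 2, groups=n_vars)
        self.time_score = nn.Linear(hidden, 1)   # scores each time step
        self.var_score = nn.Linear(hidden, 1)    # scores each variable
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, n_vars, seq_len)
        b, v, t = x.shape
        h = self.conv(x).view(b, v, self.hidden, t).permute(0, 1, 3, 2)
        # h: (batch, n_vars, seq_len, hidden)

        # Attention over time, computed separately for each variable.
        a_time = torch.softmax(self.time_score(h).squeeze(-1), dim=-1)    # (b, v, t)
        h_var = (a_time.unsqueeze(-1) * h).sum(dim=2)                     # (b, v, hidden)

        # Attention over variables.
        a_var = torch.softmax(self.var_score(h_var).squeeze(-1), dim=-1)  # (b, v)
        z = (a_var.unsqueeze(-1) * h_var).sum(dim=1)                      # (b, hidden)

        return self.classifier(z), a_var, a_time

# Example: 12 variables, 100 time steps, 2 classes.
model = VariableTimeAttentionSketch(n_vars=12, n_classes=2)
logits, var_attn, time_attn = model(torch.randn(8, 12, 100))
```

For a given input, inspecting var_attn and time_attn would reveal which variables and which time steps the model attended to, which is the kind of explanation the abstract describes.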

