DOI: 10.1145/3132847.3132980
research-article

Fast and Accurate Time Series Classification with WEASEL

Published: 06 November 2017

ABSTRACT

Time series (TS) occur in many scientific and commercial applications, ranging from earth surveillance to industry automation to smart grids. An important type of TS analysis is classification, which can, for instance, improve energy load forecasting in smart grids by detecting the types of electronic devices based on their energy consumption profiles recorded by automatic sensors. Such sensor-driven applications are very often characterized by (a) very long TS and (b) very large TS datasets needing classification. However, current methods for time series classification (TSC) cannot cope with such data volumes at acceptable accuracy: they are either scalable but offer only inferior classification quality, or they achieve state-of-the-art classification quality but cannot scale to large data volumes. In this paper, we present WEASEL (Word ExtrAction for time SEries cLassification), a novel TSC method that is both fast and accurate. Like other state-of-the-art TSC methods, WEASEL uses a sliding-window approach to transform time series into feature vectors, which are then analyzed by a machine learning classifier. The novelty of WEASEL lies in its specific method for deriving features, which results in a much smaller yet much more discriminative feature set. On the popular UCR benchmark of 85 TS datasets, WEASEL is more accurate than the best current non-ensemble algorithms at orders-of-magnitude lower classification and training times, and it is almost as accurate as ensemble classifiers, whose computational complexity makes them inapplicable even for mid-size datasets. The robustness of WEASEL is also confirmed by experiments on two real smart grid datasets, where it achieves, out of the box, almost the same accuracy as highly tuned, domain-specific methods.
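The general pipeline named in the abstract (sliding windows, feature vectors, then a classifier) can be illustrated with a minimal sketch. This is not WEASEL's feature derivation, which the abstract does not detail. As stand-ins, the sketch assumes a simple SAX-like discretization of each window into a two-letter word and a 1-nearest-neighbor classifier over word counts. All function names, the window width, and the breakpoints are hypothetical choices made for illustration only.

```python
from collections import Counter
from statistics import mean, pstdev


def sliding_windows(series, width):
    """Yield every contiguous window of `width` values from the series."""
    for start in range(len(series) - width + 1):
        yield series[start:start + width]


def window_to_word(window, segments=2, breakpoints=(-0.67, 0.0, 0.67)):
    """Discretize one window into a short word (SAX-like stand-in, an assumption)."""
    mu, sigma = mean(window), pstdev(window)
    # z-normalize the window; a flat window maps to all zeros
    normed = [(v - mu) / sigma if sigma > 0 else 0.0 for v in window]
    seg_len = len(normed) // segments
    word = ""
    for s in range(segments):
        seg_mean = mean(normed[s * seg_len:(s + 1) * seg_len])
        # symbol index = number of breakpoints strictly below the segment mean
        word += "abcd"[sum(seg_mean > b for b in breakpoints)]
    return word


def bag_of_words(series, width=4):
    """Feature vector: how often each word occurs across all windows."""
    return Counter(window_to_word(w) for w in sliding_windows(series, width))


def distance(bag_a, bag_b):
    """Squared Euclidean distance between two word-count vectors."""
    return sum((bag_a[k] - bag_b[k]) ** 2 for k in set(bag_a) | set(bag_b))


def classify(query, train):
    """1-nearest-neighbor stand-in classifier; `train` is a list of (series, label)."""
    q = bag_of_words(query)
    return min(train, key=lambda item: distance(q, bag_of_words(item[0])))[1]


# Toy usage with made-up data: one rising and one falling training series.
train = [
    ([1, 2, 3, 4, 5, 6, 7, 8], "rising"),
    ([8, 7, 6, 5, 4, 3, 2, 1], "falling"),
]
print(classify([2, 4, 6, 8, 10, 12, 14, 16], train))  # -> rising
```

Because each window is z-normalized before discretization, the toy query is matched on shape rather than on absolute scale. A real method would differ in how words are derived and selected and in the classifier used.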

        • Published in

          CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
          November 2017
          2604 pages
          ISBN: 9781450349185
          DOI: 10.1145/3132847

          Copyright © 2017 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          CIKM '17 Paper Acceptance Rate: 171 of 855 submissions, 20%
          Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
