Fast and Accurate Time Series Classification with WEASEL

Authors:
Patrick Schäfer, Humboldt University of Berlin, Berlin, Germany
Ulf Leser, Humboldt University of Berlin, Berlin, Germany

CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, November 2017, Pages 637–646, https://doi.org/10.1145/3132847.3132980

Published: 06 November 2017

ABSTRACT

Time series (TS) occur in many scientific and commercial applications, ranging from earth surveillance to industry automation to smart grids. An important type of TS analysis is classification, which can, for instance, improve energy load forecasting in smart grids by detecting the types of electronic devices based on their energy consumption profiles recorded by automatic sensors. Such sensor-driven applications are very often characterized by (a) very long TS and (b) very large TS datasets needing classification. However, current methods for time series classification (TSC) cannot cope with such data volumes at acceptable accuracy; they are either scalable but offer only inferior classification quality, or they achieve state-of-the-art classification quality but cannot scale to large data volumes. In this paper, we present WEASEL (Word ExtrAction for time SEries cLassification), a novel TSC method which is both fast and accurate. Like other state-of-the-art TSC methods, WEASEL uses a sliding-window approach to transform time series into feature vectors, which are then analyzed by a machine learning classifier. The novelty of WEASEL lies in its specific method for deriving features, resulting in a much smaller yet much more discriminative feature set. On the popular UCR benchmark of 85 TS datasets, WEASEL is more accurate than the best current non-ensemble algorithms at orders-of-magnitude lower classification and training times, and it is almost as accurate as ensemble classifiers, whose computational complexity makes them inapplicable even for mid-size datasets. The outstanding robustness of WEASEL is also confirmed by experiments on two real smart grid datasets, where it achieves, out of the box, almost the same accuracy as highly tuned, domain-specific methods.
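
The general pipeline named in the abstract (sliding windows, feature vectors, a classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not WEASEL's feature derivation, which the abstract does not detail: the SAX-style discretization, the 1-nearest-neighbour comparison standing in for the machine learning classifier, and all function names and parameters are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def sliding_windows(series, width):
    """Yield every contiguous window of `width` values from `series`."""
    for start in range(len(series) - width + 1):
        yield series[start:start + width]


def window_to_word(window, segments=4, breakpoints=(-0.67, 0.0, 0.67)):
    """Turn one window into a short word (assumed SAX-style stand-in).

    The window is z-normalized, averaged over `segments` equal pieces
    (trailing values are dropped if the width is not divisible), and each
    piece is mapped to a letter according to the fixed breakpoints.
    """
    mu, sigma = mean(window), pstdev(window)
    normed = [(v - mu) / sigma if sigma > 0 else 0.0 for v in window]
    seg_len = len(normed) // segments
    letters = []
    for s in range(segments):
        seg_mean = mean(normed[s * seg_len:(s + 1) * seg_len])
        symbol = sum(seg_mean > b for b in breakpoints)  # 0..3
        letters.append("abcd"[symbol])
    return "".join(letters)


def bag_of_words(series, width=8):
    """Feature vector: a histogram of the words of all sliding windows."""
    return Counter(window_to_word(w) for w in sliding_windows(series, width))


def classify(query, train, width=8):
    """Label of the training series whose word histogram is closest.

    `train` is a list of (series, label) pairs; 1-NN with squared
    Euclidean distance is an assumption made to keep the sketch short.
    """
    q = bag_of_words(query, width)

    def dist(bag):
        return sum((q[k] - bag[k]) ** 2 for k in set(q) | set(bag))

    return min(train, key=lambda item: dist(bag_of_words(item[0], width)))[1]


# Toy data: a repeating ramp and a repeating square wave.
ramp = [0, 1, 2, 3, 4, 5, 6, 7] * 3
square = [0, 0, 0, 0, 5, 5, 5, 5] * 3
train = [(ramp, "ramp"), (square, "square")]
```

Calling `classify([v + 1 for v in ramp], train)` compares the query's word histogram with each training histogram. Because every window is z-normalized, a constant offset in the query does not change its words.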

Published in
CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
November 2017
2604 pages
ISBN: 9781450349185
DOI: 10.1145/3132847
General Chairs: Ee-Peng Lim (Singapore Management University, Singapore); Marianne Winslett (University of Illinois at Urbana-Champaign, USA, and Advanced Digital Sciences Center, Singapore)
Program Chairs: Mark Sanderson (RMIT, Australia); Ada Fu (Chinese University of Hong Kong, Hong Kong); Jimeng Sun (Georgia Tech, USA); Shane Culpepper (RMIT, Australia); Eric Lo (Chinese University of Hong Kong, Hong Kong); Joyce Ho (Emory University, USA); Debora Donato (Mix Tech, Inc., USA); Rakesh Agrawal (Data Insights Laboratories, USA); Yu Zheng (Microsoft Research Asia, China); Carlos Castillo (Qatar Computing Research Institute, Qatar); Aixin Sun (Nanyang Technological University, Singapore); Vincent S. Tseng (National Cheng Kung University, Taiwan); Chenliang Li (Wuhan University, China)
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
bag-of-patterns
classification
feature selection
time series
word co-occurrences
Qualifiers
- research-article
Acceptance Rates
CIKM '17 paper acceptance rate: 171 of 855 submissions (20%)
Overall acceptance rate: 1,861 of 8,427 submissions (22%)