Responsible editor: M. J. Zaki.
Multivariate time series (MTS) classification has gained importance with the increase in the number of temporal datasets in different domains (such as medicine, finance, and multimedia). Similarity-based approaches, such as nearest-neighbor classifiers, are often used for univariate time series, but MTS are characterized not only by their individual attributes but also by the relationships between them. Here we provide a classifier based on a new symbolic representation for MTS (denoted as SMTS) with several important elements. SMTS considers all attributes of an MTS simultaneously, rather than separately, to extract the information contained in their relationships. Symbols are learned by a supervised algorithm that requires neither pre-defined intervals nor features. An elementary representation is used that consists of the time index and the values (and first differences for numerical attributes) of the individual time series as columns. That is, there is essentially no feature extraction (aside from first differences), and the local series values are fused to time position through the time index. The initial representation of the raw data is quite simple, both conceptually and operationally. Still, a tree-based ensemble can detect interactions in the space of the time index and time values, and this is exploited to generate a high-dimensional codebook from the terminal nodes of the trees. Because the time index is included as an attribute, the trees learn to segment each MTS by time or by the value of one of its attributes. The codebook is processed with a second ensemble, where implicit feature selection is exploited to handle the high-dimensional input. Together, these properties produce a distinctly different algorithm. Moreover, MTS with nominal and missing values are handled efficiently by tree learners. Experiments demonstrate the effectiveness of the proposed approach in terms of accuracy and computation time on a large collection of multivariate (and univariate) datasets.
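The two data transformations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 1-based time index, the choice of 0 for the first difference at the first time step, and the use of raw counts are all assumptions made for illustration. The terminal-node assignments are passed in as plain lists, so no particular tree library is implied.

```python
from collections import Counter
from typing import List, Sequence


def elementary_representation(series: Sequence[Sequence[float]]) -> List[List[float]]:
    """Build one row per time step: [time index, values..., first differences...].

    series[t][m] is the value of numerical attribute m at time step t.
    Assumptions (not from the source): the time index starts at 1, and the
    first difference at the first time step is set to 0.0.
    """
    rows = []
    for t, values in enumerate(series):
        if t == 0:
            diffs = [0.0] * len(values)  # no predecessor at the first step
        else:
            diffs = [v - p for v, p in zip(values, series[t - 1])]
        rows.append([float(t + 1), *values, *diffs])
    return rows


def symbol_histogram(leaf_ids_per_tree: Sequence[Sequence[int]],
                     leaves_per_tree: int) -> List[int]:
    """Turn terminal-node assignments into one codebook vector for an MTS.

    leaf_ids_per_tree[j][t] is the terminal node (symbol) that tree j assigns
    to row t of the elementary representation. The result concatenates, tree
    by tree, how often each terminal node was reached.
    """
    hist: List[int] = []
    for leaf_ids in leaf_ids_per_tree:
        counts = Counter(leaf_ids)
        hist.extend(counts.get(leaf, 0) for leaf in range(leaves_per_tree))
    return hist


if __name__ == "__main__":
    # A toy MTS with two attributes observed at three time steps.
    mts = [[1.0, 10.0], [3.0, 9.0], [6.0, 9.5]]
    for row in elementary_representation(mts):
        print(row)
    # Hypothetical terminal-node ids from two trees with three leaves each.
    print(symbol_histogram([[0, 0, 2], [1, 1, 1]], leaves_per_tree=3))
```

In this sketch, the rows from all training series would be used to fit the first tree ensemble, and the concatenated histograms would be the high-dimensional input to the second ensemble.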
- Learning a symbolic representation for multivariate time series classification
Mustafa Gokce Baydogan
- Springer US