Published in: Soft Computing 15/2019

04-06-2018 | Methodologies and Application

An ensemble of shapelet-based classifiers on inter-class and intra-class imbalanced multivariate time series at the early stage

Authors: Guoliang He, Wen Zhao, Xuewen Xia, Rong Peng, Xiaoying Wu



Abstract

Early classification of time series reduces accuracy to some degree. If the time series data are also imbalanced, accurately identifying minority class examples becomes challenging as well. So far, these two problems have been studied intensively but separately on univariate time series data, and they have yet to be well studied when they occur together. Multivariate time series (MTS) are more complex than univariate time series: they contain multiple variables, and the interconnections between those variables are hidden. Handling both problems together on MTS is therefore even more challenging. In this paper, we propose an adaptive classification ensemble method, called early prediction on imbalanced MTS, that deals with early classification on inter-class and intra-class imbalanced MTS data simultaneously. First, an adaptive ensemble framework is designed to learn an early classification model on imbalanced MTS data. A multiple under-sampling approach and a dynamical subspace generation method give the base classifiers diversity while ensuring that all majority class examples are fully utilized. Second, to deal with the implicit issue of intra-class imbalance in the training data, a cluster-based shapelet selection method is introduced to obtain an optimal set of stable and robust shapelets. Finally, an associate-pattern mining approach is designed to learn base classifiers efficiently, which can enhance the interpretability of classification. Experimental results show that the proposed method can achieve effective early prediction on inter-class and intra-class imbalanced MTS data.
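The abstract does not spell out how the multiple under-sampling and subspace generation steps work, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes that the majority class is split into disjoint chunks of at most minority-class size, that each chunk is paired with all minority examples, and that each resulting subset gets a random subset of variable indices. The function name `balanced_subsets` and all of its parameters are hypothetical.

```python
import math
import random


def balanced_subsets(majority, minority, n_variables, subspace_size, seed=0):
    """Build one balanced training subset per base classifier.

    Assumption (not from the paper): the majority class is split into
    disjoint chunks, so every majority example is used exactly once
    across the ensemble, and each chunk is paired with all minority
    examples plus a random subset of variable indices.
    """
    if not majority or not minority:
        raise ValueError("both classes must be non-empty")
    if not 1 <= subspace_size <= n_variables:
        raise ValueError("subspace_size must be in [1, n_variables]")

    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)

    # Enough chunks that no chunk is larger than the minority class.
    n_chunks = math.ceil(len(shuffled) / len(minority))

    subsets = []
    for i in range(n_chunks):
        # Strided slicing yields disjoint chunks of near-equal size.
        chunk = shuffled[i::n_chunks]
        # Random variable subspace for this base classifier.
        variables = sorted(rng.sample(range(n_variables), subspace_size))
        subsets.append(
            {"majority": chunk, "minority": list(minority), "variables": variables}
        )
    return subsets


if __name__ == "__main__":
    # Ten majority example ids, three minority ids, five variables.
    for subset in balanced_subsets(range(10), [100, 101, 102], 5, 2):
        print(subset)
```

Each returned subset could then be used to train one base classifier. Partitioning the majority class, rather than sampling it repeatedly at random, is one simple way to guarantee that no majority example is discarded.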


Metadata
Title
An ensemble of shapelet-based classifiers on inter-class and intra-class imbalanced multivariate time series at the early stage
Authors
Guoliang He
Wen Zhao
Xuewen Xia
Rong Peng
Xiaoying Wu
Publication date
04-06-2018
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 15/2019
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-018-3261-3
