2018 | OriginalPaper | Chapter

A SMOTE Extension for Balancing Multivariate Epilepsy-Related Time Series Datasets

Authors: Enrique de la Cal, José R. Villar, Paula Vergara, Javier Sedano, Álvaro Herrero

Published in: International Joint Conference SOCO’17-CISIS’17-ICEUTE’17 León, Spain, September 6–8, 2017, Proceeding

Publisher: Springer International Publishing

Abstract

Large datasets often take the form of Time Series (TS) in which complex events of interest occur only rarely. In this scenario, learning algorithms must cope with the data balancing problem, which has barely been studied for TS datasets. This research addresses the issue by describing a very simple TS extension of the well-known SMOTE algorithm for balancing datasets. To validate the proposal, it is applied to a realistic, publicly available dataset containing epilepsy-related TS. The characteristics of the dataset are studied before and after applying the TS balancing algorithm, and the results point to requirements for further research on this topic, among them the energy efficiency of the algorithm and the TS generation process.
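
As a rough illustration of the kind of oversampling the abstract refers to, the sketch below applies classic SMOTE-style interpolation to fixed-length multivariate TS windows. It is not the authors' extension, whose details are not given on this page. The helper names (`ts_distance`, `interpolate`, `smote_ts`), the window layout, the use of Euclidean distance, and the toy values are all assumptions made for the example.

```python
import math
import random

# Assumed layout: a window is a list of time steps, each a list of channel
# values (e.g. three accelerometer axes). All windows share the same shape.

def ts_distance(a, b):
    """Euclidean distance between two equally shaped multivariate windows."""
    return math.sqrt(sum((x - y) ** 2
                         for step_a, step_b in zip(a, b)
                         for x, y in zip(step_a, step_b)))

def interpolate(a, b, gap):
    """Point on the segment from a to b, using one gap for all steps and channels."""
    return [[x + gap * (y - x) for x, y in zip(step_a, step_b)]
            for step_a, step_b in zip(a, b)]

def smote_ts(minority, n_synthetic, k=3, seed=0):
    """Hypothetical helper: create n_synthetic windows by SMOTE-style interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority windows")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base window, excluding itself
        others = [w for j, w in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda w: ts_distance(base, w))[:k]
        neighbour = rng.choice(neighbours)
        # A single random gap keeps the temporal shape of both parents aligned
        synthetic.append(interpolate(base, neighbour, rng.random()))
    return synthetic

# Toy minority class (made-up values): 4 windows, 5 time steps, 3 channels.
minority = [[[float(w + t + c) for c in range(3)] for t in range(5)]
            for w in range(4)]
new_windows = smote_ts(minority, n_synthetic=6)
```

Using one interpolation gap per synthetic window, rather than one per time step, is a design choice of this sketch: it keeps every generated window on the straight line between two real windows, so no artificial jitter is introduced along the time axis.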

Metadata
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-67180-2_43
