Published in: Progress in Artificial Intelligence 1/2018

08.06.2017 | Regular Paper

Impact of time series discretization on intensive care burn unit survival classification

Authors: Isidoro J. Casanova, Manuel Campos, Jose M. Juarez, Antonio Fernandez-Fernandez-Arroyo, Jose A. Lorente

Abstract

In the preprocessing step of a knowledge discovery process, the choice of discretization method can have a marked impact on the performance and accuracy of classification algorithms. In this article, we analyze and compare expert discretization and automatic discretization algorithms. In particular, we study their impact on predicting the survival of patients in intensive care burn units. We assess the quality of the different discretization algorithms by analyzing the number of intervals generated, the number of patterns produced, and the classification performance on a specific clinical problem. Our results show that many algorithms underperform expert discretization and that the correlation among continuous features must be taken into account to obtain the best accuracy.
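
To make the contrast between expert and automatic discretization concrete, the following is a minimal sketch and not the authors' method. It assumes a single continuous variable, uses equal-width binning as a stand-in for an automatic algorithm, and relies on hypothetical readings and hypothetical expert cut points; none of these values come from the article.

```python
from bisect import bisect_right


def discretize(values, cut_points):
    """Map each value to the index of the interval it falls in.

    cut_points must be sorted; k cut points define k + 1 intervals.
    A value equal to a cut point is assigned to the upper interval.
    """
    return [bisect_right(cut_points, v) for v in values]


def equal_width_cut_points(values, n_bins):
    """Unsupervised baseline: split the observed range into n_bins equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


# Hypothetical readings of one continuous variable over time (illustrative only).
series = [60, 72, 85, 90, 104, 118, 120]

# Hypothetical expert thresholds (low / normal / high); an assumption, not from the article.
expert_cuts = [70, 110]

# Automatic cut points derived from the data alone.
auto_cuts = equal_width_cut_points(series, n_bins=3)

expert_symbols = discretize(series, expert_cuts)
auto_symbols = discretize(series, auto_cuts)

# Both schemes yield three intervals here, but the symbol sequences differ,
# so any patterns mined from the discretized series would differ as well.
print("expert:", expert_symbols)
print("auto:  ", auto_symbols)
```

Even with the same number of intervals, the two schemes place the boundaries differently, which is one reason the number of intervals alone does not determine downstream classification performance.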

Metadata
Title
Impact of time series discretization on intensive care burn unit survival classification
Authors
Isidoro J. Casanova
Manuel Campos
Jose M. Juarez
Antonio Fernandez-Fernandez-Arroyo
Jose A. Lorente
Publication date
08.06.2017
Publisher
Springer Berlin Heidelberg
Published in
Progress in Artificial Intelligence / Issue 1/2018
Print ISSN: 2192-6352
Electronic ISSN: 2192-6360
DOI
https://doi.org/10.1007/s13748-017-0130-8
