Skip to main content
Top

Hint

Swipe to navigate through the articles of this issue

Published in: International Journal of Data Science and Analytics 1/2022

07-03-2022 | Regular Paper

A novel approach for discretizing continuous attributes based on tree ensemble and moment matching optimization

Authors: Haddouchi Maissae, Berrado Abdelaziz

Published in: International Journal of Data Science and Analytics | Issue 1/2022

Log in

Abstract

This paper introduces ForestDisc, an optimized, supervised, multivariate, and nonparametric discretization algorithm based on tree ensemble learning and moment matching optimization. At its core, ForestDisc uses, for each continuous attribute in the data space, moment matching to elect popular split points based on those generated while constructing a random forest model. An extensive empirical study involving 50 benchmark datasets and six classification algorithms reveals that ForestDisc is highly competitive compared with 20 major discretizers based on both intrinsic and extrinsic performance measures. The intrinsic metrics include the number of resulting bins per variable and the execution time necessary for discretizing an attribute. The extrinsic metrics concern the performance of the discretizers when applied as a preprocessing step to classification tasks, and include accuracy, F1, and Kappa measures. ForestDisc discretizer also enables an excellent trade-off between intrinsic and extrinsic performance measures.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Frank, E., Witten, I.H.: Making better use of global discretization, 115–123 (Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, Conference held at Bled, Slovenia, to 1999-06-30) Frank, E., Witten, I.H.: Making better use of global discretization, 115–123 (Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, Conference held at Bled, Slovenia, to 1999-06-30)
2.
go back to reference Liu, H., Hussain, F., Tan, C.L., Dash, M.: Discretization: an enabling technique. Data Min. Knowl. Discov. 6, 393–423 (2002) MathSciNetCrossRef Liu, H., Hussain, F., Tan, C.L., Dash, M.: Discretization: an enabling technique. Data Min. Knowl. Discov. 6, 393–423 (2002) MathSciNetCrossRef
3.
go back to reference Lustgarten, J.L., Gopalakrishnan, V., Grover, H., Visweswaran, S.: Improving classification performance with discretization on biomedical datasets. AMIA Annu. Symp. Proc. 2008, 445–449 (2008) Lustgarten, J.L., Gopalakrishnan, V., Grover, H., Visweswaran, S.: Improving classification performance with discretization on biomedical datasets. AMIA Annu. Symp. Proc. 2008, 445–449 (2008)
4.
go back to reference Yang, Y., Webb, G.I., Wu, X.: Discretization methods. In: Maimon, O., Rokach, L. (eds.) Data Mining and Knowledge Discovery Handbook, pp. 101–116. Springer, Boston (2010) Yang, Y., Webb, G.I., Wu, X.: Discretization methods. In: Maimon, O., Rokach, L. (eds.) Data Mining and Knowledge Discovery Handbook, pp. 101–116. Springer, Boston (2010)
5.
go back to reference Vorobeva, A.A.: Influence of features discretization on accuracy of random forest classifier for web user identification. IEEE, St-Petersburg, Russia, 498–504 (2017) Vorobeva, A.A.: Influence of features discretization on accuracy of random forest classifier for web user identification. IEEE, St-Petersburg, Russia, 498–504 (2017)
9.
go back to reference Dougherty, J., Kohavi, R., Sahami, M.: Supervised and Unsupervised Discretization of Continuous Features, pp. 194–202. Elsevier, Amsterdam (1995) Dougherty, J., Kohavi, R., Sahami, M.: Supervised and Unsupervised Discretization of Continuous Features, pp. 194–202. Elsevier, Amsterdam (1995)
10.
go back to reference Ramırez-Gallego, S., Garcıa, S., Martınez-Rego, D., Benıtez, J. M., Herrera, F.: Data Discretization: Taxonomy and Big Data Challenge 26 Ramırez-Gallego, S., Garcıa, S., Martınez-Rego, D., Benıtez, J. M., Herrera, F.: Data Discretization: Taxonomy and Big Data Challenge 26
11.
go back to reference Agre, G.: On supervised and unsupervised discretization. Cybern. Inf. Technol. (2002) Agre, G.: On supervised and unsupervised discretization. Cybern. Inf. Technol. (2002)
13.
go back to reference Wang, C., Wang, M., She, Z., Cao, L., Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J.: CD: a coupled discretization algorithm. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds.) Advances in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science, pp. 407–418. Springer, Berlin (2012) CrossRef Wang, C., Wang, M., She, Z., Cao, L., Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J.: CD: a coupled discretization algorithm. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds.) Advances in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science, pp. 407–418. Springer, Berlin (2012) CrossRef
17.
go back to reference Muhlenbach, F., Rakotomalala, R.: Discretization of Continuous Attributes Idea group reference edn hal-00383757v2, 397–402 (2005) Muhlenbach, F., Rakotomalala, R.: Discretization of Continuous Attributes Idea group reference edn hal-00383757v2, 397–402 (2005)
19.
go back to reference Berrado, A., Runger, G.C.: Supervised multivariate discretization in mixed data with Random Forests. IEEE, Rabat, Morocco, pp. 211–217 (2009) Berrado, A., Runger, G.C.: Supervised multivariate discretization in mixed data with Random Forests. IEEE, Rabat, Morocco, pp. 211–217 (2009)
22.
go back to reference Sriwanna, K., Puntumapon, K., Waiyamai, K., Zhou, S., Zhang, S., Karypis, G.: An enhanced class-attribute interdependence maximization discretization algorithm. In: Zhou, S., Zhang, S., Karypis, G. (eds.) Advanced Data Mining and Applications. Lecture Notes in Computer Science, pp. 465–476. Springer, Berlin (2012) CrossRef Sriwanna, K., Puntumapon, K., Waiyamai, K., Zhou, S., Zhang, S., Karypis, G.: An enhanced class-attribute interdependence maximization discretization algorithm. In: Zhou, S., Zhang, S., Karypis, G. (eds.) Advanced Data Mining and Applications. Lecture Notes in Computer Science, pp. 465–476. Springer, Berlin (2012) CrossRef
24.
go back to reference Baka, A., Wettayaprasit, W., Vanichayobon, S.: A novel discretization technique using Class Attribute Interval Average, pp. 95–100 (2014) Baka, A., Wettayaprasit, W., Vanichayobon, S.: A novel discretization technique using Class Attribute Interval Average, pp. 95–100 (2014)
28.
go back to reference CanoAlberto, T.N., VenturaSebastián, JC.: Ur-CAIM. Soft Computing - A Fusion of Foundations, Methodologies and Applications (2016) CanoAlberto, T.N., VenturaSebastián, JC.: Ur-CAIM. Soft Computing - A Fusion of Foundations, Methodologies and Applications (2016)
29.
go back to reference Ramírez-Gallego, S., García, S., Benítez, J. M., Herrera, F., Burduk, R., Jackowski, K., Kurzyński, M., Woźniak, M., Żołnierek, A.: A Wrapper evolutionary approach for supervised multivariate discretization: a case study on decision trees. In: Burduk, R., Jackowski, K., Kurzyński, M., Woźniak, M., Żołnierek, A. (eds) Proceedings of the 9th International Conference on Computer Recognition Systems CORES 2015, Advances in Intelligent Systems and Computing. Springer, Cham, pp. 47–58 (2016) Ramírez-Gallego, S., García, S., Benítez, J. M., Herrera, F., Burduk, R., Jackowski, K., Kurzyński, M., Woźniak, M., Żołnierek, A.: A Wrapper evolutionary approach for supervised multivariate discretization: a case study on decision trees. In: Burduk, R., Jackowski, K., Kurzyński, M., Woźniak, M., Żołnierek, A. (eds) Proceedings of the 9th International Conference on Computer Recognition Systems CORES 2015, Advances in Intelligent Systems and Computing. Springer, Cham, pp. 47–58 (2016)
30.
go back to reference Sriwanna, K., Boongoen, T., Iam-On, N. Lavangnananda, K., Phon-Amnuaisuk, S., Engchuan, W., Chan, J.H.: An enhanced univariate discretization based on cluster ensembles. In:Lavangnananda, K., Phon-Amnuaisuk, S., Engchuan, W., Chan, J. H. (eds) Proceedings in Adaptation, Learning and Optimization, Intelligent and Evolutionary Systems. Springer, Cham, pp. 85–98 (2016) Sriwanna, K., Boongoen, T., Iam-On, N. Lavangnananda, K., Phon-Amnuaisuk, S., Engchuan, W., Chan, J.H.: An enhanced univariate discretization based on cluster ensembles. In:Lavangnananda, K., Phon-Amnuaisuk, S., Engchuan, W., Chan, J. H. (eds) Proceedings in Adaptation, Learning and Optimization, Intelligent and Evolutionary Systems. Springer, Cham, pp. 85–98 (2016)
38.
go back to reference Ehrhardt, A., Vandewalle, V., Biernacki, C., Heinrich, P.: Supervised multivariate discretization and levels merging for logistic regression. Iasi, Romania (2018) Ehrhardt, A., Vandewalle, V., Biernacki, C., Heinrich, P.: Supervised multivariate discretization and levels merging for logistic regression. Iasi, Romania (2018)
39.
go back to reference Drias, H., Moulai, H., Rehkab, N.: LR-SDiscr: an efficient algorithm for supervised discretization. In: Nguyen, N.T., Hoang, D.H., Hong, T.-P., Pham, H., Trawiński, B. (eds.) Intelligent Information and Database Systems, vol. 10751, pp. 266–275. Springer, Cham (2018) CrossRef Drias, H., Moulai, H., Rehkab, N.: LR-SDiscr: an efficient algorithm for supervised discretization. In: Nguyen, N.T., Hoang, D.H., Hong, T.-P., Pham, H., Trawiński, B. (eds.) Intelligent Information and Database Systems, vol. 10751, pp. 266–275. Springer, Cham (2018) CrossRef
40.
go back to reference Abachi, H.M., Hosseini, S., Maskouni, M.A., Kangavari, M., Cheung, N.-M., Wang, J., Cong, G., Chen, J., Qi, J.: Statistical discretization of continuous attributes using Kolmogorov-Smirnov test. In: Wang, J., Cong, G., Chen, J., Qi, J. (eds.) Databases Theory and Applications. Lecture Notes in Computer Science, pp. 309–315. Springer, Cham (2018) Abachi, H.M., Hosseini, S., Maskouni, M.A., Kangavari, M., Cheung, N.-M., Wang, J., Cong, G., Chen, J., Qi, J.: Statistical discretization of continuous attributes using Kolmogorov-Smirnov test. In: Wang, J., Cong, G., Chen, J., Qi, J. (eds.) Databases Theory and Applications. Lecture Notes in Computer Science, pp. 309–315. Springer, Cham (2018)
45.
go back to reference Liu, H., Jiang, C., Wang, M., Wei, K., Yan, S.: An Improved Data Discretization Algorithm based on Rough Sets Theory, pp. 1432–1437 (2020) Liu, H., Jiang, C., Wang, M., Wei, K., Yan, S.: An Improved Data Discretization Algorithm based on Rough Sets Theory, pp. 1432–1437 (2020)
49.
go back to reference Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd edn. Springer, Berlin (2009) CrossRef Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd edn. Springer, Berlin (2009) CrossRef
52.
go back to reference Haddouchi, M., Berrado, A.: Discretizing continuous attributes for machine learning using nonlinear programming. Int. J. Comput. Sci. Appl. 18(1), 26–44 (2021) Haddouchi, M., Berrado, A.: Discretizing continuous attributes for machine learning using nonlinear programming. Int. J. Comput. Sci. Appl. 18(1), 26–44 (2021)
53.
go back to reference Rouaud, M: Probability, Statistics and Estimation. Propagation of Uncertainties in Experimental Measurement. Short edition edn. Creative Commons (2017) Rouaud, M: Probability, Statistics and Estimation. Propagation of Uncertainties in Experimental Measurement. Short edition edn. Creative Commons (2017)
54.
go back to reference Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms, 3rd ed edn. Wiley-Interscience, Hoboken, N.J, (2006). OCLC: ocm61478842 Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms, 3rd ed edn. Wiley-Interscience, Hoboken, N.J, (2006). OCLC: ocm61478842
56.
go back to reference Dubitzky, W., Granzow, M., Berrar, D.P.: Fundamentals of Data Mining in Genomics and Proteomics. Springer, Berlin (2007) CrossRef Dubitzky, W., Granzow, M., Berrar, D.P.: Fundamentals of Data Mining in Genomics and Proteomics. Springer, Berlin (2007) CrossRef
62.
go back to reference Madsen, K., Zertchaninov, S.: Global Optimization using Branch-and-Bound 17 (1998) Madsen, K., Zertchaninov, S.: Global Optimization using Branch-and-Bound 17 (1998)
63.
go back to reference Zertchaninov, S., Madsen, K., Zilinskas, A.: A C++ Programme for Global Optimization. IMM Publications 14 (1998) Zertchaninov, S., Madsen, K., Zilinskas, A.: A C++ Programme for Global Optimization. IMM Publications 14 (1998)
64.
go back to reference Powell, M.: A direct search optimization method that models the objective and constraint functions by linear interpolation. In: Gomez, S., Hennart, J.-P. (eds.) Advances in Optimization and Numerical Analysis, pp. 51–67. Springer, Dordrecht (1994) CrossRef Powell, M.: A direct search optimization method that models the objective and constraint functions by linear interpolation. In: Gomez, S., Hennart, J.-P. (eds.) Advances in Optimization and Numerical Analysis, pp. 51–67. Springer, Dordrecht (1994) CrossRef
66.
go back to reference Powell, M.: The BOBYQA algorithm for bound constrained optimization without derivatives. Tech. Rep., Department of Applied Mathematics and Theoretical Physics, Cambridge England, technical report NA2009/06 (2009) Powell, M.: The BOBYQA algorithm for bound constrained optimization without derivatives. Tech. Rep., Department of Applied Mathematics and Theoretical Physics, Cambridge England, technical report NA2009/06 (2009)
70.
go back to reference Rowan, T.H.: Functional Stability Analysis of Numerical Algorithms. Ph.D. thesis, Ph.D. thesis, Department of Computer Sciences, University of Texas at Austin (1990) Rowan, T.H.: Functional Stability Analysis of Numerical Algorithms. Ph.D. thesis, Ph.D. thesis, Department of Computer Sciences, University of Texas at Austin (1990)
71.
go back to reference Svanberg, K.: A class of globally convergent optimization methods based on conservative convex separable approximations. SIAM J. Optim. 12, 555–573 (2002) MathSciNetCrossRef Svanberg, K.: A class of globally convergent optimization methods based on conservative convex separable approximations. SIAM J. Optim. 12, 555–573 (2002) MathSciNetCrossRef
72.
go back to reference Kraft, D.: A Software Package for Sequential Quadratic Programming Deutsche Forschungs- Und Versuchsanstalt Für Luft- Und Raumfahrt Köln: Forschungsbericht. DFVLR, Wiss. Berichtswesen d (1988) Kraft, D.: A Software Package for Sequential Quadratic Programming Deutsche Forschungs- Und Versuchsanstalt Für Luft- Und Raumfahrt Köln: Forschungsbericht. DFVLR, Wiss. Berichtswesen d (1988)
73.
go back to reference Kraft, D., Munchen, I.: Algorithm 733: TOMP - Fortran modules for optimal control calculations. ACM Trans. Math. Soft 262–281 (1994) Kraft, D., Munchen, I.: Algorithm 733: TOMP - Fortran modules for optimal control calculations. ACM Trans. Math. Soft 262–281 (1994)
74.
go back to reference Nocedal, J.: Updating quasi-newton matrices with limited storage. Math. Comput. 35(773–782), 10 (1980) MathSciNetMATH Nocedal, J.: Updating quasi-newton matrices with limited storage. Math. Comput. 35(773–782), 10 (1980) MathSciNetMATH
75.
go back to reference Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Program. 45, 503–528 (1989) MathSciNetCrossRef Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Program. 45, 503–528 (1989) MathSciNetCrossRef
77.
go back to reference Vlcek, J., Luksan, L.: Shifted limited-memory variable metric methods for large-scale unconstrained optimization. J. Comput. Appl. Math. 186, 365–390 (2006) MathSciNetCrossRef Vlcek, J., Luksan, L.: Shifted limited-memory variable metric methods for large-scale unconstrained optimization. J. Comput. Appl. Math. 186, 365–390 (2006) MathSciNetCrossRef
78.
go back to reference Conn, A.R., Gould, N.I.M., Philippe, Toint, L.: A globally convergent augmented lagrangian algorithm for optimization with general constraints and simple bounds. SIAM J. Numer. Anal. 572 (1991) Conn, A.R., Gould, N.I.M., Philippe, Toint, L.: A globally convergent augmented lagrangian algorithm for optimization with general constraints and simple bounds. SIAM J. Numer. Anal. 572 (1991)
79.
go back to reference Birgin, E.G., Martínez, J.M.: Improving ultimate convergence of an Augmented Lagrangian method. Optim. Methods Softw. 23(2), 177–195 (2008) MathSciNetCrossRef Birgin, E.G., Martínez, J.M.: Improving ultimate convergence of an Augmented Lagrangian method. Optim. Methods Softw. 23(2), 177–195 (2008) MathSciNetCrossRef
82.
go back to reference Singer, S., Singer, S.: Complexity Analysis of Nelder-Mead Search Iterations, vol. 12. Dubrovnik, Croatia (1999) MATH Singer, S., Singer, S.: Complexity Analysis of Nelder-Mead Search Iterations, vol. 12. Dubrovnik, Croatia (1999) MATH
85.
go back to reference R Core Team: R: A Language and Environment for Statistical Computing (Vienna, Austria, 2019) R Core Team: R: A Language and Environment for Statistical Computing (Vienna, Austria, 2019)
86.
go back to reference Fayyad, U.M., Irani, K.B.: Multi-interval discretization of continuous-valued attributes for classification learning. Artif. Intell. 13, 1022–1027 (1993) Fayyad, U.M., Irani, K.B.: Multi-interval discretization of continuous-valued attributes for classification learning. Artif. Intell. 13, 1022–1027 (1993)
87.
go back to reference Liu, H., Setiono, R.: Chi2: Feature Selection and Discretization of Numeric Attributes, 388–391 (1995) Liu, H., Setiono, R.: Chi2: Feature Selection and Discretization of Numeric Attributes, 388–391 (1995)
89.
go back to reference von Jouanne-Diedrich, H.: Vonjd/OneR (2017) von Jouanne-Diedrich, H.: Vonjd/OneR (2017)
90.
go back to reference Kerber, R.: ChiMerge: Discretization of numeric attributes, AAAI’92, 123–128. AAAI Press, San Jose, California (1992) Kerber, R.: ChiMerge: Discretization of numeric attributes, AAAI’92, 123–128. AAAI Press, San Jose, California (1992)
97.
go back to reference Holte, R.C.: Very simple classification rules perform well on most commonly used datasets. Mach. Learn. 32 (1993) Holte, R.C.: Very simple classification rules perform well on most commonly used datasets. Mach. Learn. 32 (1993)
98.
go back to reference Venables, W.N., Ripley, B.D.: Modern Applied Statistics with S Fourth, edition Springer Publishing Company, Incorporated, Berlin (2002) CrossRef Venables, W.N., Ripley, B.D.: Modern Applied Statistics with S Fourth, edition Springer Publishing Company, Incorporated, Berlin (2002) CrossRef
100.
go back to reference Nguyen, H.S.: On efficient handling of continuous attributes in large data bases. Fundam. Inform. 48, 61–81 (2001) MathSciNetMATH Nguyen, H.S.: On efficient handling of continuous attributes in large data bases. Fundam. Inform. 48, 61–81 (2001) MathSciNetMATH
101.
go back to reference Bazan, J.G., Nguyen, H.S., Nguyen, S.H., Synak, P., Wróblewski, J.: Rough set algorithms in classification problem. In: Kacprzyk, J., Polkowski, L., Tsumoto, S., Lin, T.Y. (eds.) Rough Set Methods and Applications, vol. 56, pp. 49–88. Physica-Verlag, Heidelberg (2000) CrossRef Bazan, J.G., Nguyen, H.S., Nguyen, S.H., Synak, P., Wróblewski, J.: Rough set algorithms in classification problem. In: Kacprzyk, J., Polkowski, L., Tsumoto, S., Lin, T.Y. (eds.) Rough Set Methods and Applications, vol. 56, pp. 49–88. Physica-Verlag, Heidelberg (2000) CrossRef
102.
go back to reference Celeux, G., Chauveau, D., Diebolt, J.: On Stochastic Versions of the EM Algorithm. Research Report RR-2514, INRIA (1995) Celeux, G., Chauveau, D., Diebolt, J.: On Stochastic Versions of the EM Algorithm. Research Report RR-2514, INRIA (1995)
103.
go back to reference Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and regression trees the wadsworth statistics/probability series edn. Monterey, CA : Wadsworth & Brooks/Cole Advanced Books & Software, 1984. - 358 p. (1884) Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and regression trees the wadsworth statistics/probability series edn. Monterey, CA : Wadsworth & Brooks/Cole Advanced Books & Software, 1984. - 358 p. (1884)
105.
go back to reference Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2000) MathSciNetMATH Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2000) MathSciNetMATH
106.
go back to reference Chen, T., Guestrin, C.: XGBoost: A Scalable Tree Boosting System, pp. 785–794. ACM Press, San Francisco (2016) Chen, T., Guestrin, C.: XGBoost: A Scalable Tree Boosting System, pp. 785–794. ACM Press, San Francisco (2016)
110.
go back to reference Prati, R.C., Monard, M.C.: A survey on graphical methods for classification predictive performance evaluation. IEEE Trans. Knowl. Data Eng. 1601–1618 Prati, R.C., Monard, M.C.: A survey on graphical methods for classification predictive performance evaluation. IEEE Trans. Knowl. Data Eng. 1601–1618
115.
go back to reference Garcıa, S., Herrera, F.: An Extension on “Statistical Comparisons of Classifiers over Multiple Data Sets” for all Pairwise Comparisons 18 Garcıa, S., Herrera, F.: An Extension on “Statistical Comparisons of Classifiers over Multiple Data Sets” for all Pairwise Comparisons 18
116.
go back to reference Dua, D., Graff, C.: UCI machine learning repository (2017) Dua, D., Graff, C.: UCI machine learning repository (2017)
117.
go back to reference Alcalá-Fdez, J., et al.: KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J. Multiple-Valued Logic Soft Comput. 17(2–3), 255–287 (2011) Alcalá-Fdez, J., et al.: KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J. Multiple-Valued Logic Soft Comput. 17(2–3), 255–287 (2011)
119.
go back to reference Batuwita, R., Palade, V.: Class imbalance learning methods for support vector machines. In: He, H., Ma, Y. (eds.) Imbalanced Learning, pp. 83–99. Wiley, Hoboken (2013) CrossRef Batuwita, R., Palade, V.: Class imbalance learning methods for support vector machines. In: He, H., Ma, Y. (eds.) Imbalanced Learning, pp. 83–99. Wiley, Hoboken (2013) CrossRef
Metadata
Title
A novel approach for discretizing continuous attributes based on tree ensemble and moment matching optimization
Authors
Haddouchi Maissae
Berrado Abdelaziz
Publication date
07-03-2022
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics / Issue 1/2022
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-022-00316-1

Other articles of this Issue 1/2022

International Journal of Data Science and Analytics 1/2022 Go to the issue

Premium Partner