2024 | OriginalPaper | Chapter

Nearest Centroid Classifier Based on Information Value and Homogeneity

Authors: Mehmet Hamdi Özçelik, Serol Bulkan

Published in: Advances in Intelligent Manufacturing and Service System Informatics

Publisher: Springer Nature Singapore

Abstract

The chapter presents a novel Nearest Centroid classifier that leverages Information Value and homogeneity metrics to enhance distance calculations, resulting in improved performance across 50 datasets. The algorithm's superiority is demonstrated through extensive benchmarking against established classifiers like Nearest Centroid, Logistic Regression, K-Nearest Neighbors, and Decision Trees. The use of Information Value for predictive power and various homogeneity metrics for feature evaluation sets this algorithm apart, making it a promising tool for binary classification tasks. The authors highlight the algorithm's potential for further enhancement and its notable performance in credit risk domain datasets.
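
As a rough illustration of the idea summarized above, the following Python sketch weights each feature's contribution to the centroid distance by its Information Value. It is a minimal sketch under stated assumptions, not the authors' algorithm: the abstract does not give the exact distance formula or the homogeneity metrics, so the homogeneity term is left out entirely. The names `information_value` and `IVWeightedNearestCentroid`, the quantile binning, the smoothing constant `eps`, and the scaling by standard deviation are all hypothetical choices made for this example. Information Value itself is computed with the usual credit-scoring definition, the sum over bins of (p_i - q_i) * ln(p_i / q_i), where p_i and q_i are the shares of positives and negatives falling in bin i.

```python
import numpy as np


def information_value(x, y, bins=5, eps=0.5):
    """Information Value of one numeric feature for a binary target y in {0, 1}.

    Quantile binning and the additive smoothing `eps` are assumptions of
    this sketch, not details taken from the chapter.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    # Quantile bin edges; np.unique drops duplicate edges caused by ties.
    edges = np.unique(np.quantile(x, np.linspace(0.0, 1.0, bins + 1)))
    if len(edges) < 3:
        # Zero or one usable bin: the feature cannot separate the classes.
        return 0.0
    # Interior edges map every value to a bin index in 0 .. len(edges) - 2.
    idx = np.digitize(x, edges[1:-1], right=True)
    n_bins = len(edges) - 1
    pos = np.bincount(idx[y == 1], minlength=n_bins) + eps
    neg = np.bincount(idx[y == 0], minlength=n_bins) + eps
    p, q = pos / pos.sum(), neg / neg.sum()
    return float(np.sum((p - q) * np.log(p / q)))


class IVWeightedNearestCentroid:
    """Binary nearest-centroid classifier with IV-weighted squared distance.

    Hypothetical illustration only; labels are assumed to be 0 and 1.
    """

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        # One non-negative weight per feature: its Information Value.
        self.weights_ = np.array(
            [information_value(X[:, j], y) for j in range(X.shape[1])]
        )
        # Per-feature scale so that weights act on comparable units.
        self.scale_ = X.std(axis=0) + 1e-12
        # Row c of centroids_ is the mean vector of class c.
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Shape (n_samples, 2, n_features): scaled offsets to each centroid.
        diff = (X[:, None, :] - self.centroids_[None, :, :]) / self.scale_
        d2 = np.sum(self.weights_ * diff ** 2, axis=2)
        # Index of the closest centroid is the predicted label.
        return np.argmin(d2, axis=1)
```

With this weighting, a feature whose Information Value is close to zero contributes almost nothing to the distance, so uninformative features have little influence on which centroid is nearest.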

Metadata
Title
Nearest Centroid Classifier Based on Information Value and Homogeneity
Authors
Mehmet Hamdi Özçelik
Serol Bulkan
Copyright Year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-99-6062-0_5
