Published in: Pattern Analysis and Applications 2/2018

29-09-2016 | Theoretical Advances

Obtaining calibrated probability using ROC Binning

Authors: Meesun Sun, Sungzoon Cho


Abstract

Obtaining calibrated probabilities, that is, estimates that match the actual rate of occurrence, is crucial in many real-world problems because they support decision-making with a sound assessment of cost and effect. Calibration matters even more in class imbalance and class overlap problems, where applying classification algorithms directly may result in substantial errors. Consequently, several post-processing calibration techniques have been developed to improve the probability estimates or the error distribution of existing classification models. In this context, we propose Receiver Operating Characteristic (ROC) Binning, a method that provides accurate calibrated probabilities that remain robust to changes in the prevalence of the positive class by combining the True Positive Rate, the False Positive Rate, and the prevalence of the positive class. Experiments on real-world UCI data indicate that, when the positive class proportion of the training set differs noticeably from that of the test set, the proposed ROC Binning method outperforms conventional calibration methods.
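
The abstract does not state the formula, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that scores are split into bins, that each bin is described by the share of positives and the share of negatives falling into it (the increments of TPR and FPR across the bin), and that these are combined with a prevalence through Bayes' rule. The function names `bin_rates` and `calibrated_probability`, the bin edges, and the toy data are all hypothetical.

```python
from bisect import bisect_right


def bin_rates(scores, labels, edges):
    """Return, per score bin, the fraction of all positives and of all
    negatives that fall into it (TPR and FPR increments for that bin).
    `edges` are sorted inner bin boundaries (an assumption of this sketch)."""
    n_bins = len(edges) + 1
    pos = [0] * n_bins
    neg = [0] * n_bins
    for s, y in zip(scores, labels):
        b = bisect_right(edges, s)  # index of the bin containing score s
        if y == 1:
            pos[b] += 1
        else:
            neg[b] += 1
    n_pos, n_neg = sum(pos), sum(neg)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return [p / n_pos for p in pos], [n / n_neg for n in neg]


def calibrated_probability(d_tpr, d_fpr, prevalence):
    """Bayes' rule per bin: P(y=1 | bin) from the class-conditional bin
    rates and an assumed prevalence of the positive class."""
    probs = []
    for t, f in zip(d_tpr, d_fpr):
        num = prevalence * t
        den = num + (1.0 - prevalence) * f
        # An empty bin carries no evidence; fall back to the prevalence.
        probs.append(num / den if den > 0 else prevalence)
    return probs


# Toy data: balanced training set, two bins split at score 0.5.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
d_tpr, d_fpr = bin_rates(scores, labels, edges=[0.5])

p_train = calibrated_probability(d_tpr, d_fpr, prevalence=0.5)
p_shift = calibrated_probability(d_tpr, d_fpr, prevalence=0.1)
```

Because the per-bin rates are conditioned on the class, they do not depend on the class proportions of the training set. Under this reading, only the `prevalence` argument has to change when the test set has a different positive class proportion, which is the situation the abstract highlights.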

Metadata
Title
Obtaining calibrated probability using ROC Binning
Authors
Meesun Sun
Sungzoon Cho
Publication date
29-09-2016
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 2/2018
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-016-0578-3
