2019 | OriginalPaper | Book chapter

A Principled Two-Step Method for Example-Dependent Cost Binary Classification

Authors: Javier Mediavilla-Relaño, Aitor Gutiérrez-López, Marcelino Lázaro, Aníbal R. Figueiras-Vidal

Published in: From Bioinspired Systems and Biomedical Applications to Machine Learning

Publisher: Springer International Publishing

Abstract

This paper presents a principled two-step method for binary classification problems with example-dependent costs. The first step obtains consistent estimates of the posterior probabilities by training a Multi-Layer Perceptron with a Bregman surrogate cost. The second step applies these estimates in a Bayesian decision rule. For imbalanced datasets, neutral re-balancing yields better estimates of the posterior probabilities. Experiments with real datasets show that the proposed method performs well in comparison with other procedures.
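
The second step can be illustrated with a minimal sketch of a textbook Bayes minimum-risk rule under example-dependent costs. It is not taken from the chapter: the function name `bayes_decision`, the cost arrays `c_fp`, `c_fn`, `c_tp`, `c_tn`, and the posterior values are hypothetical, and the posteriors are assumed to have already been estimated in the first step (for example by an MLP). The exact rule used by the authors may differ.

```python
import numpy as np

def bayes_decision(p1, c_fp, c_fn, c_tp=0.0, c_tn=0.0):
    """Pick, per example, the class with the lower expected cost.

    p1   : estimated posterior P(y=1 | x) for each example (assumed given)
    c_fp : cost of predicting 1 when the true class is 0
    c_fn : cost of predicting 0 when the true class is 1
    c_tp, c_tn : costs of correct decisions (zero by default, an assumption)
    """
    p1 = np.asarray(p1, dtype=float)
    # Expected cost of each possible decision, using each example's own costs.
    cost_pred1 = p1 * c_tp + (1.0 - p1) * c_fp
    cost_pred0 = p1 * c_fn + (1.0 - p1) * c_tn
    return (cost_pred1 < cost_pred0).astype(int)

# Hypothetical posteriors and per-example costs, for illustration only.
p1 = np.array([0.10, 0.10, 0.60])
c_fp = np.array([1.0, 1.0, 5.0])
c_fn = np.array([2.0, 20.0, 1.0])
decisions = bayes_decision(p1, c_fp, c_fn)
```

In this toy data the first two examples share the same posterior but receive different decisions because their false-negative costs differ, which is the reason accurate posterior estimates matter more than a single fixed threshold.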

Footnotes
1. \(\widetilde{Q}\) can be interpreted as \(\widetilde{Q}_P\).

Metadata
Title
A Principled Two-Step Method for Example-Dependent Cost Binary Classification
Authors
Javier Mediavilla-Relaño
Aitor Gutiérrez-López
Marcelino Lázaro
Aníbal R. Figueiras-Vidal
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-19651-6_2