Published in: Advances in Data Analysis and Classification 4/2023

07-01-2023 | Regular Article

Robust instance-dependent cost-sensitive classification

Authors: Simon De Vos, Toon Vanderschueren, Tim Verdonck, Wouter Verbeke

Abstract

Instance-dependent cost-sensitive (IDCS) learning methods have proven useful for binary classification tasks where individual instances are associated with variable misclassification costs. However, we demonstrate through a series of experiments that IDCS methods are sensitive to noise and outliers in the instance-dependent misclassification costs and that their performance strongly depends on the cost distribution of the data sample. Therefore, we propose a generic three-step framework to make IDCS methods more robust: (i) detect outliers automatically, (ii) correct outlying cost information in a data-driven way, and (iii) construct an IDCS learning method using the adjusted cost information. We apply this framework to cslogit, a logistic regression-based IDCS method, to obtain its robust version, which we name r-cslogit. The robustness of this approach is introduced in steps (i) and (ii), where we use robust estimators to detect and impute outlying costs of individual instances. The newly proposed r-cslogit method is tested on synthetic and semi-synthetic data and shown to outperform its non-robust counterpart in terms of savings for variable levels of noise and outliers. All our code is available online at https://github.com/SimonDeVos/Robust-IDCS.
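The three-step framework can be illustrated with a minimal sketch. The abstract does not say which robust estimators the authors use, so every concrete choice below is an assumption made for illustration and is not the r-cslogit implementation: outliers are flagged with a median/MAD robust z-score, flagged costs are replaced by the median of the remaining costs, and step (iii) is a plain cost-weighted logistic regression standing in for cslogit. The function names and the threshold of 3.5 are hypothetical.

```python
import numpy as np

def flag_cost_outliers(costs, threshold=3.5):
    """Step (i): flag costs with a large robust z-score (assumed median/MAD rule)."""
    costs = np.asarray(costs, dtype=float)
    median = np.median(costs)
    # 1.4826 makes the MAD consistent with the standard deviation under normality.
    mad = 1.4826 * np.median(np.abs(costs - median))
    if mad == 0:
        return np.zeros(costs.shape, dtype=bool)
    return np.abs(costs - median) / mad > threshold

def impute_costs(costs, is_outlier):
    """Step (ii): replace flagged costs by the median of the unflagged costs (assumption)."""
    cleaned = np.asarray(costs, dtype=float).copy()
    cleaned[is_outlier] = np.median(cleaned[~is_outlier])
    return cleaned

def fit_cost_weighted_logit(X, y, costs, lr=0.1, n_iter=500):
    """Step (iii): logistic regression with cost-weighted log-loss (stand-in for cslogit)."""
    X = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    weights = costs / costs.mean()             # normalise so the step size stays comparable
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        # Gradient of the weighted log-loss with respect to beta.
        beta -= lr * X.T @ (weights * (p - y)) / len(y)
    return beta

# Toy costs with one gross outlier in the last position.
costs = np.array([10.0, 12.0, 11.0, 9.0, 10.0, 1000.0])
flags = flag_cost_outliers(costs)
cleaned = impute_costs(costs, flags)
```

In this sketch the classifier only ever sees the adjusted costs, so a single extreme cost cannot dominate the weighted loss. A data-driven imputation, as the abstract describes, would presumably predict each outlying cost from the instance's features instead of using one global median.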

Metadata
Title: Robust instance-dependent cost-sensitive classification
Authors: Simon De Vos, Toon Vanderschueren, Tim Verdonck, Wouter Verbeke
Publication date: 07-01-2023
Publisher: Springer Berlin Heidelberg
Published in: Advances in Data Analysis and Classification, Issue 4/2023
Print ISSN: 1862-5347
Electronic ISSN: 1862-5355
DOI: https://doi.org/10.1007/s11634-022-00533-3
