
2020 | OriginalPaper | Book chapter

Cost Sensitive Evaluation of Instance Hardness in Machine Learning

Written by: Ricardo B. C. Prudêncio

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

Measuring the hardness of individual instances in machine learning contributes to a deeper analysis of learning performance. This work proposes instance hardness measures for binary classification in cost-sensitive scenarios. A cost curve is generated for each instance, defined as the loss observed for that instance over a pool of learning models along the range of cost proportions. Instance hardness is defined as the area under this cost curve and can be interpreted as an expected loss, and hence a measure of difficulty, across cost proportions. Different cost curves are proposed by considering decision threshold choice methods that are common in the literature, thus providing alternative views of instance hardness.
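
The definition in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation, and it rests on assumptions the abstract does not state: labels are 0 or 1, each model outputs a score in [0, 1] for class 1, a false positive costs 2c and a false negative costs 2(1 - c) at cost proportion c, and the decision threshold is set equal to c (a score-driven choice). All function names are hypothetical.

```python
def instance_loss(label, score, c):
    """Cost-weighted 0/1 loss of one model on one instance at cost proportion c."""
    # Assumed score-driven threshold choice: predict class 1 when score > c.
    predicted = 1 if score > c else 0
    if predicted == label:
        return 0.0
    # Assumed cost convention: false positive costs 2c, false negative 2(1 - c).
    return 2.0 * c if label == 0 else 2.0 * (1.0 - c)


def instance_cost_curve(label, scores, grid):
    """Mean loss over the pool of model scores at each cost proportion in grid."""
    return [
        sum(instance_loss(label, s, c) for s in scores) / len(scores)
        for c in grid
    ]


def instance_hardness(label, scores, n_steps=1000):
    """Area under the instance cost curve, by the midpoint rule on [0, 1]."""
    grid = [(i + 0.5) / n_steps for i in range(n_steps)]
    curve = instance_cost_curve(label, scores, grid)
    return sum(curve) / n_steps


if __name__ == "__main__":
    # One positive instance scored by a pool of three hypothetical models.
    print(instance_hardness(1, [0.9, 0.6, 0.2]))
```

Under these particular assumptions, the area for a single model approximates the squared difference between the label and the score, so an instance that most models score confidently and correctly gets a hardness near 0. Other threshold choice methods would give different curves and different areas.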

Footnotes
2
J48, IBk, Logistic Regression, Naive Bayes and Random Forest were adopted. IBk was run with k = 5; the other algorithms were applied with default parameter values.
Metadata
Title
Cost Sensitive Evaluation of Instance Hardness in Machine Learning
Written by
Ricardo B. C. Prudêncio
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-46147-8_6