Skip to main content

2017 | OriginalPaper | Buchkapitel

Relevance-Based Evaluation Metrics for Multi-class Imbalanced Domains

verfasst von : Paula Branco, Luís Torgo, Rita P. Ribeiro

Erschienen in: Advances in Knowledge Discovery and Data Mining

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The class imbalance problem is a key issue that has received much attention. This attention has been mostly focused on two-classes problems. Fewer solutions exist for the multi-classes imbalance problem. From an evaluation point of view, the class imbalance problem is challenging because a non-uniform importance is assigned to the classes. In this paper, we propose a relevance-based evaluation framework that incorporates user preferences by allowing the assignment of differentiated importance values to each class. The presented solution is able to overcome difficulties detected in existing measures and increases discrimination capability. The proposed framework requires the assignment of a relevance score to the problem classes. To deal with cases where the user is not able to specify each class relevance, we describe three mechanisms to incorporate the existing domain knowledge into the relevance framework. These mechanisms differ in the amount of information available and assumptions made regarding the domain. They also allow the use of our framework in common settings of multi-class imbalanced problems with different levels of information available.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
The experimental framework, code and results of this evaluation is available in https://​github.​com/​paobranco/​Relevance-basedMulticlassI​mbalanceMetrics.
 
Literatur
1.
Zurück zum Zitat Brüggemann, R., Sørensen, P.B., Lerche, D., Carlsen, L.: Estimation of averaged ranks by a local partial order model#. J. Chem. Inf. Comput. Sci. 44(2), 618–625 (2004)CrossRef Brüggemann, R., Sørensen, P.B., Lerche, D., Carlsen, L.: Estimation of averaged ranks by a local partial order model#. J. Chem. Inf. Comput. Sci. 44(2), 618–625 (2004)CrossRef
2.
Zurück zum Zitat Cohen, G., Hilario, M., Sax, H., Hugonnet, S., Geissbuhler, A.: Learning from imbalanced data in surveillance of nosocomial infection. Artif. Intell. Med. 37(1), 7–18 (2006)CrossRef Cohen, G., Hilario, M., Sax, H., Hugonnet, S., Geissbuhler, A.: Learning from imbalanced data in surveillance of nosocomial infection. Artif. Intell. Med. 37(1), 7–18 (2006)CrossRef
4.
Zurück zum Zitat Ferri, C., Hernández-Orallo, J., Modroiu, R.: An experimental comparison of performance measures for classification. Pattern Recognit. Lett. 30(1), 27–38 (2009)CrossRef Ferri, C., Hernández-Orallo, J., Modroiu, R.: An experimental comparison of performance measures for classification. Pattern Recognit. Lett. 30(1), 27–38 (2009)CrossRef
5.
Zurück zum Zitat Forman, G., Scholz, M.: Apples-to-apples in cross-validation studies: pitfalls in classifier performance measurement. SIGKDD Explor. Newsl. 12(1), 49–57 (2010)CrossRef Forman, G., Scholz, M.: Apples-to-apples in cross-validation studies: pitfalls in classifier performance measurement. SIGKDD Explor. Newsl. 12(1), 49–57 (2010)CrossRef
6.
Zurück zum Zitat Gorodkin, J.: Comparing two K-category assignments by a K-category correlation coefficient. Comput. Biol. Chem. 28(5), 367–374 (2004)CrossRefMATH Gorodkin, J.: Comparing two K-category assignments by a K-category correlation coefficient. Comput. Biol. Chem. 28(5), 367–374 (2004)CrossRefMATH
7.
Zurück zum Zitat Gu, Q., Zhu, L., Cai, Z.: Evaluation measures of the classification performance of imbalanced data sets. In: Cai, Z., Li, Z., Kang, Z., Liu, Y. (eds.) ISICA 2009. CCIS, vol. 51, pp. 461–471. Springer, Heidelberg (2009). doi:10.1007/978-3-642-04962-0_53 CrossRef Gu, Q., Zhu, L., Cai, Z.: Evaluation measures of the classification performance of imbalanced data sets. In: Cai, Z., Li, Z., Kang, Z., Liu, Y. (eds.) ISICA 2009. CCIS, vol. 51, pp. 461–471. Springer, Heidelberg (2009). doi:10.​1007/​978-3-642-04962-0_​53 CrossRef
8.
Zurück zum Zitat Hand, D.J.: Measuring classifier performance: a coherent alternative to the area under the ROC curve. Mach. Learn. 77(1), 103–123 (2009)CrossRef Hand, D.J.: Measuring classifier performance: a coherent alternative to the area under the ROC curve. Mach. Learn. 77(1), 103–123 (2009)CrossRef
9.
Zurück zum Zitat Hand, D.J., Till, R.J.: A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach 45(2), 171–186 (2001)CrossRefMATH Hand, D.J., Till, R.J.: A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach 45(2), 171–186 (2001)CrossRefMATH
10.
Zurück zum Zitat Hempstalk, K., Frank, E.: Discriminating against new classes: one-class versus multi-class classification. In: Wobcke, W., Zhang, M. (eds.) AI 2008. LNCS (LNAI), vol. 5360, pp. 325–336. Springer, Heidelberg (2008). doi:10.1007/978-3-540-89378-3_32 CrossRef Hempstalk, K., Frank, E.: Discriminating against new classes: one-class versus multi-class classification. In: Wobcke, W., Zhang, M. (eds.) AI 2008. LNCS (LNAI), vol. 5360, pp. 325–336. Springer, Heidelberg (2008). doi:10.​1007/​978-3-540-89378-3_​32 CrossRef
11.
Zurück zum Zitat Matthews, B.W.: Comparison of the predicted and observed secondary structure of T4 phage lysozyme. BBA-Protein Struct. 405(2), 442–451 (1975)CrossRef Matthews, B.W.: Comparison of the predicted and observed secondary structure of T4 phage lysozyme. BBA-Protein Struct. 405(2), 442–451 (1975)CrossRef
12.
Zurück zum Zitat Mosley, L.: A balanced approach to the multi-class imbalance problem. Graduate Theses and Dissertations, Paper 13537 (2013) Mosley, L.: A balanced approach to the multi-class imbalance problem. Graduate Theses and Dissertations, Paper 13537 (2013)
13.
Zurück zum Zitat Sindhwani, V., Bhattacharya, P., Rakshit, S.: Information theoretic feature crediting in multiclass support vector machines. In: SDM, pp. 1–18. SIAM (2001) Sindhwani, V., Bhattacharya, P., Rakshit, S.: Information theoretic feature crediting in multiclass support vector machines. In: SDM, pp. 1–18. SIAM (2001)
14.
Zurück zum Zitat Sokolova, M., Lapalme, G.: A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 45(4), 427–437 (2009)CrossRef Sokolova, M., Lapalme, G.: A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 45(4), 427–437 (2009)CrossRef
15.
Zurück zum Zitat Sun, Y., Kamel, M.S., Wang, Y.: Boosting for learning multiple classes with imbalanced class distribution. In: ICDM, pp. 592–602. IEEE (2006) Sun, Y., Kamel, M.S., Wang, Y.: Boosting for learning multiple classes with imbalanced class distribution. In: ICDM, pp. 592–602. IEEE (2006)
16.
Zurück zum Zitat Wei, J.M., Yuan, X.J., Hu, Q.H., Wang, S.Q.: A novel measure for evaluating classifiers. Expert Syst. Appl. 37(5), 3799–3809 (2010)CrossRef Wei, J.M., Yuan, X.J., Hu, Q.H., Wang, S.Q.: A novel measure for evaluating classifiers. Expert Syst. Appl. 37(5), 3799–3809 (2010)CrossRef
Metadaten
Titel
Relevance-Based Evaluation Metrics for Multi-class Imbalanced Domains
verfasst von
Paula Branco
Luís Torgo
Rita P. Ribeiro
Copyright-Jahr
2017
DOI
https://doi.org/10.1007/978-3-319-57454-7_54

Premium Partner