Published in: Progress in Artificial Intelligence 2/2015

01.03.2015 | Regular Paper

Optimizing different loss functions in multilabel classifications

Authors: Jorge Díez, Oscar Luaces, Juan José del Coz, Antonio Bahamonde


Abstract

Multilabel classification (ML) aims to assign a set of labels to an instance. This generalization of multiclass classification requires loss functions to be redefined and makes the learning tasks harder. The objective of this paper is to gain insight into the relations between optimization aims and some of the most popular performance measures: the subset (or 0/1) loss, the Hamming loss, and the example-based F-measure. To make a fair comparison, we implemented three ML learners in a common framework, each explicitly optimizing one of these measures. This can be done by treating a subset of labels as a structured output and then using structured output support vector machines tailored to optimize a given loss function. The paper includes an exhaustive experimental comparison. The conclusion is that, in most cases, optimizing the Hamming loss produces the best or competitive scores. This is a practical result: the Hamming loss can be minimized using a set of binary classifiers, one for each label separately, which makes it a scalable and fast method for learning ML tasks. Additionally, we observe that in noise-free learning tasks optimizing the subset loss is the best option, although the differences are very small. We have also noticed that the greatest room for improvement is found when the goal is to optimize an F-measure in noisy learning tasks.
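To make the three measures named in the abstract concrete, the following is a minimal sketch of how they are commonly computed for label sets encoded as 0/1 indicator vectors. The function names, the encoding, and the convention that F1 equals 1 when both label sets are empty are assumptions made for illustration; they are not taken from the paper. The last two helpers illustrate why the Hamming loss decomposes over labels: its average over examples equals the average of the per-label error rates, which is what allows one binary classifier per label.

```python
from typing import List, Sequence

Labels = Sequence[int]  # one 0/1 relevance indicator per label (assumed encoding)


def subset_loss(y_true: Labels, y_pred: Labels) -> float:
    """Subset (0/1) loss: 1 unless the predicted label set matches exactly."""
    return 0.0 if list(y_true) == list(y_pred) else 1.0


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of labels whose relevance is predicted wrongly."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def f1_loss(y_true: Labels, y_pred: Labels) -> float:
    """1 - F1 computed on a single example (example-based F-measure)."""
    hits = sum(t * p for t, p in zip(y_true, y_pred))
    total = sum(y_true) + sum(y_pred)
    # Assumption: F1 is taken as 1 when both label sets are empty.
    f1 = 1.0 if total == 0 else 2.0 * hits / total
    return 1.0 - f1


def mean_hamming(Y_true: Sequence[Labels], Y_pred: Sequence[Labels]) -> float:
    """Hamming loss averaged over all examples."""
    pairs = list(zip(Y_true, Y_pred))
    return sum(hamming_loss(t, p) for t, p in pairs) / len(pairs)


def per_label_error(Y_true: Sequence[Labels], Y_pred: Sequence[Labels]) -> List[float]:
    """Error rate of each label treated as a separate binary task."""
    n, m = len(Y_true), len(Y_true[0])
    return [sum(Y_true[i][j] != Y_pred[i][j] for i in range(n)) / n
            for j in range(m)]


# One example with five labels: the prediction misses one relevant label.
y_true = [1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0]
print(subset_loss(y_true, y_pred))   # 1.0 (not an exact match)
print(hamming_loss(y_true, y_pred))  # 0.2 (1 of 5 labels wrong)
print(f1_loss(y_true, y_pred))       # about 0.333 (F1 = 2/3)
```

The single wrong label costs the full subset loss but only a fifth of the Hamming loss, which shows how differently the measures penalize the same prediction.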


Footnotes
1. http://www.aic.uniovi.es/ml_generator/
Table 1
Cardinality and density statistics of the 48 noise-free datasets

Labels  Statistic  Cardinality  Density (%)
50      Max        4.3          9
50      Min        2.5          5
50      Mean       3.3          7
50      SD         0.5          1
25      Max        4.3          17
25      Min        2.4          10
25      Mean       3.1          13
25      SD         0.6          2
10      Max        4.0          40
10      Min        1.8          18
10      Mean       2.9          29
10      SD         0.7          7

Note: Datasets with Bernoulli and swap noise present similar figures.
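The two statistics in Table 1 can be sketched as follows, assuming the usual definitions: cardinality is the average number of relevant labels per instance, and density is cardinality divided by the number of labels. The page does not spell these definitions out, so treat them as an assumption, although they agree with the table (for the 10-label datasets, a mean cardinality of 2.9 gives 2.9 / 10 = 29%). The function names and the toy label matrix are hypothetical.

```python
from typing import Sequence


def label_cardinality(Y: Sequence[Sequence[int]]) -> float:
    """Average number of relevant labels per instance."""
    return sum(sum(row) for row in Y) / len(Y)


def label_density(Y: Sequence[Sequence[int]]) -> float:
    """Cardinality divided by the total number of labels."""
    return label_cardinality(Y) / len(Y[0])


# Toy label matrix: 3 instances, 4 labels (illustrative data only).
Y = [
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
]
print(label_cardinality(Y))    # 2.0
print(100 * label_density(Y))  # 50.0 (percent)
```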
 
Metadata
Title
Optimizing different loss functions in multilabel classifications
Authors
Jorge Díez
Oscar Luaces
Juan José del Coz
Antonio Bahamonde
Publication date
01.03.2015
Publisher
Springer Berlin Heidelberg
Published in
Progress in Artificial Intelligence / Issue 2/2015
Print ISSN: 2192-6352
Electronic ISSN: 2192-6360
DOI
https://doi.org/10.1007/s13748-014-0060-7
