Published in: Progress in Artificial Intelligence 4/2012

01.12.2012 | Regular Paper

Binary relevance efficacy for multilabel classification

Authors: Oscar Luaces, Jorge Díez, José Barranquero, Juan José del Coz, Antonio Bahamonde

Abstract

The goal of multilabel (ML) classification is to induce models able to tag objects with the labels that best describe them. The main baseline for ML classification is binary relevance (BR), which is commonly criticized in the literature because of its label independence assumption. Despite this criticism, this paper discusses some interesting properties of BR, mainly that it produces optimal models for several ML loss functions. Additionally, we present an analytical study of ML benchmark datasets and point out some of their shortcomings. As a result, this paper proposes the use of synthetic datasets to better analyze the behavior of ML methods in domains with different characteristics. To support this claim, we perform experiments on synthetic data that show the competitive performance of BR with respect to a more complex method in difficult problems with many labels, a conclusion that was not stated by previous studies.
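
To make the idea of binary relevance concrete, the following is a minimal sketch, not the authors' implementation. It trains one independent binary classifier per label and evaluates with Hamming loss, used here as one example of a loss that decomposes over labels; the abstract itself does not name the loss functions. The nearest-centroid base learner, the toy data, and all names (`CentroidClassifier`, `BinaryRelevance`, `hamming_loss`) are assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence

Vector = Sequence[float]


def _mean(rows: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def _sqdist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


class CentroidClassifier:
    """Toy binary base learner (an assumption, not from the paper):
    predicts 1 when x is strictly closer to the positive-class centroid."""

    def __init__(self) -> None:
        self.pos: Optional[List[float]] = None
        self.neg: Optional[List[float]] = None

    def fit(self, X: List[Vector], y: List[int]) -> "CentroidClassifier":
        pos_rows = [x for x, t in zip(X, y) if t == 1]
        neg_rows = [x for x, t in zip(X, y) if t == 0]
        self.pos = _mean(pos_rows) if pos_rows else None
        self.neg = _mean(neg_rows) if neg_rows else None
        return self

    def predict(self, x: Vector) -> int:
        if self.pos is None:  # label never seen as positive
            return 0
        if self.neg is None:  # label never seen as negative
            return 1
        return int(_sqdist(x, self.pos) < _sqdist(x, self.neg))


class BinaryRelevance:
    """One independent binary model per label; labels never see each other."""

    def __init__(self) -> None:
        self.models: List[CentroidClassifier] = []

    def fit(self, X: List[Vector], Y: List[List[int]]) -> "BinaryRelevance":
        n_labels = len(Y[0])
        self.models = [
            CentroidClassifier().fit(X, [row[j] for row in Y])
            for j in range(n_labels)
        ]
        return self

    def predict(self, x: Vector) -> List[int]:
        return [m.predict(x) for m in self.models]


def hamming_loss(Y_true: List[List[int]], Y_pred: List[List[int]]) -> float:
    """Fraction of individual label decisions that are wrong."""
    total = sum(len(row) for row in Y_true)
    wrong = sum(t != p for yt, yp in zip(Y_true, Y_pred) for t, p in zip(yt, yp))
    return wrong / total


# Toy data: label j is simply feature j, so the two labels are independent.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
model = BinaryRelevance().fit(X, Y)
print(model.predict([0.9, 0.1]))
```

Because Hamming loss is an average of per-label errors, each per-label model can be tuned on its own, which is why an independent decomposition of this kind is a natural fit for it. Losses that score the whole label set at once do not decompose in this way.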

Metadata
Title
Binary relevance efficacy for multilabel classification
Authors
Oscar Luaces
Jorge Díez
José Barranquero
Juan José del Coz
Antonio Bahamonde
Publication date
01.12.2012
Publisher
Springer-Verlag
Published in
Progress in Artificial Intelligence / Issue 4/2012
Print ISSN: 2192-6352
Electronic ISSN: 2192-6360
DOI
https://doi.org/10.1007/s13748-012-0030-x
