2019 | OriginalPaper | Chapter

PParabel: Parallel Partitioned Label Trees for Extreme Classification

Authors: Jiaqi Lu, Jun Zheng, Wenxin Hu

Published in: Network and Parallel Computing

Publisher: Springer International Publishing

Abstract

Extreme classification covers extreme multi-class and multi-label prediction, where the objective is to learn classifiers that label each data point with its most relevant labels. Approaches such as the 1-vs-all method have recently been proposed for this task. However, their training time is linear in the number of classes, which makes them impractical for real-world applications such as text and image tagging. In this work, we present a two-stage thread-level parallelization of Partitioned Label Trees for Extreme Classification (Parabel). Our method trains the tree nodes in different parallel ways according to their number of labels. We compare our algorithm with a recent state-of-the-art approach on publicly available real-world datasets that have up to 670,000 labels. The experimental results demonstrate that our algorithm achieves the shortest training time.
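
The abstract does not say how the two stages are divided, so the following is only a minimal sketch of one possible reading: nodes with many labels are trained one at a time with the threads sharing the work inside the node, and nodes with few labels are trained concurrently, one node per thread. The threshold `LARGE_NODE_LABELS`, the helper names, and the dummy `train_label` function are all hypothetical and do not come from the paper.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical threshold: nodes with more labels than this count as "large".
LARGE_NODE_LABELS = 4


def train_label(label):
    """Stand-in for fitting one classifier; returns a dummy model name."""
    return f"model[{label}]"


def train_node_serial(node_labels):
    """Train every classifier of one node on the calling thread."""
    return [train_label(label) for label in node_labels]


def train_tree(nodes, num_threads=4):
    """Assumed two-stage schedule over a list of nodes (each a list of labels)."""
    large = [i for i, n in enumerate(nodes) if len(n) > LARGE_NODE_LABELS]
    small = [i for i, n in enumerate(nodes) if len(n) <= LARGE_NODE_LABELS]
    models = [None] * len(nodes)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Stage 1: one large node at a time; threads split its labels.
        for i in large:
            models[i] = list(pool.map(train_label, nodes[i]))
        # Stage 2: many small nodes at once; each node runs on one thread.
        small_results = pool.map(train_node_serial, [nodes[i] for i in small])
        for i, result in zip(small, small_results):
            models[i] = result
    return models


if __name__ == "__main__":
    example_nodes = [list(range(6)), [10, 11], [20]]
    print(train_tree(example_nodes))
```

The thread pool here only illustrates the scheduling idea. CPU-bound training in Python threads is limited by the global interpreter lock, so a real implementation would more likely use native threads, as the original Parabel code is written in C++.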

Metadata
Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-30709-7_7
