Published in: International Journal of Machine Learning and Cybernetics 10/2023

11 May 2023 | Original Article

Multi-layered semantic representation network for multi-label image classification

Authors: Xiwen Qu, Hao Che, Jun Huang, Linchuan Xu, Xiao Zheng


Abstract

Multi-label image classification is a fundamental and practical task that aims to assign multiple possible labels to an image. In recent years, many deep convolutional neural network (CNN) based approaches have been proposed that model label correlations to discover the semantics of labels and learn semantic representations of images. This paper advances this research direction by improving both the modeling of label correlations and the learning of semantic representations. On the one hand, besides the local semantics of each label, we propose to further explore global semantics shared by multiple labels. On the other hand, existing approaches mainly learn the semantic representations at the last convolutional layer of a CNN, although it has been noted that the image representations at different layers of a CNN capture different levels or scales of features and have different discriminative abilities. We thus propose to learn semantic representations at multiple convolutional layers. To this end, this paper designs a Multi-layered Semantic Representation Network (MSRN), which discovers both local and global semantics of labels by modeling label correlations and uses the label semantics to guide semantic representation learning at multiple layers through an attention mechanism. Extensive experiments on five benchmark datasets, namely VOC2007, VOC2012, MS-COCO, NUS-WIDE, and Apparel, show that the proposed MSRN performs competitively against state-of-the-art models.
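
The idea of letting label semantics guide attention over the feature maps of several layers can be illustrated with a toy sketch. This is not the authors' MSRN implementation: the function names, the dot-product attention, the averaging of per-layer scores, and the assumption that every layer has already been projected to the same dimension as the label vectors are all hypothetical choices made for illustration. The modeling of label correlations and the global semantics are omitted entirely.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_guided_attention(features, label_vecs):
    """Hypothetical helper: one attention-pooled vector per label.

    features:   N location vectors of dimension D (a flattened feature map)
    label_vecs: C label vectors of dimension D (assumed given)
    """
    reps = []
    for label in label_vecs:
        # Attention weight of each location = softmax of its similarity to the label.
        weights = softmax([dot(f, label) for f in features])
        rep = [sum(w * f[d] for w, f in zip(weights, features))
               for d in range(len(label))]
        reps.append(rep)
    return reps


def multi_layer_scores(layers, label_vecs):
    """Average per-label logits over all layers, then apply a sigmoid.

    Averaging is an assumption of this sketch, not a claim about MSRN.
    """
    logits = [0.0] * len(label_vecs)
    for features in layers:
        reps = label_guided_attention(features, label_vecs)
        for c, label in enumerate(label_vecs):
            logits[c] += dot(reps[c], label) / len(layers)
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


# Toy input: two "layers" with different numbers of spatial locations.
layers = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],  # shallower layer, 3 locations
    [[2.0, 0.0], [0.0, 0.5]],              # deeper layer, 2 locations
]
label_vecs = [[1.0, 0.0], [0.0, 1.0]]
probs = multi_layer_scores(layers, label_vecs)
```

Here `probs` holds one independent probability per label, which matches the multi-label setting in which several labels can be active for the same image.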

Footnotes
1
Source code and pre-trained models of our method are publicly available at https://jiunhwang.github.io/
Metadata
Title
Multi-layered semantic representation network for multi-label image classification
Authors
Xiwen Qu
Hao Che
Jun Huang
Linchuan Xu
Xiao Zheng
Publication date
11 May 2023
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 10/2023
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-023-01841-6
