International Journal of Machine Learning and Cybernetics

11 May 2023 | Original Article

Multi-layered semantic representation network for multi-label image classification

Authors: Xiwen Qu, Hao Che, Jun Huang, Linchuan Xu, Xiao Zheng

Published in: International Journal of Machine Learning and Cybernetics | Issue 10/2023

Abstract

Multi-label image classification is a fundamental and practical task that aims to assign multiple labels to an image. In recent years, many approaches based on deep convolutional neural networks (CNNs) have been proposed that model label correlations to discover the semantics of labels and learn semantic representations of images. This paper advances this research direction by improving both the modeling of label correlations and the learning of semantic representations. On the one hand, besides the local semantics of each label, we propose to further explore global semantics shared by multiple labels. On the other hand, existing approaches mainly learn semantic representations at the last convolutional layer of a CNN. However, the image representations at different layers of a CNN are known to capture different levels or scales of features and to have different discriminative abilities. We therefore propose to learn semantic representations at multiple convolutional layers. To this end, this paper designs a Multi-layered Semantic Representation Network (MSRN), which discovers both local and global semantics of labels by modeling label correlations and uses the label semantics to guide the learning of semantic representations at multiple layers through an attention mechanism. Extensive experiments on five benchmark datasets (VOC2007, VOC2012, MS-COCO, NUS-WIDE, and Apparel) show that the proposed MSRN performs competitively against state-of-the-art models.
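
The abstract does not give MSRN's equations, so the following is only a minimal sketch of the general idea it describes: a label embedding attends over the region features of several convolutional layers, and the per-layer scores are fused into one probability per label. The function names, the dot-product scoring, the averaging of per-layer logits, and the assumption that every layer has already been projected to the same feature dimension as the label embeddings are illustrative choices, not details taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def label_guided_pooling(regions, label_vec):
    """Attention-pool region features using one label embedding as the query.

    regions:   list of region feature vectors from one layer (each of dim d)
    label_vec: label embedding of dim d (assumed to be given)
    """
    weights = softmax([dot(r, label_vec) for r in regions])
    dim = len(label_vec)
    return [sum(w * r[i] for w, r in zip(weights, regions)) for i in range(dim)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(layers, label_vecs):
    """Return one independent probability per label.

    layers:     list of feature maps, one per convolutional layer; each map is
                a list of region vectors (hypothetical, pre-projected to dim d)
    label_vecs: list of label embeddings, one per label
    """
    probs = []
    for label_vec in label_vecs:
        logits = []
        for regions in layers:
            pooled = label_guided_pooling(regions, label_vec)
            logits.append(dot(pooled, label_vec))
        # Assumed fusion rule: plain average of the per-layer logits.
        probs.append(sigmoid(sum(logits) / len(logits)))
    return probs


# Toy input: two layers, two regions each, two labels (all values made up).
toy_layers = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 0.0]],
]
toy_labels = [[1.0, 0.0], [0.0, 1.0]]
toy_probs = predict(toy_layers, toy_labels)
```

Because each label gets its own sigmoid output, several labels can be predicted for the same image, which is what distinguishes the multi-label setting from single-label classification. In this toy input the first label's direction is present in both layers, so it receives the higher probability.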

Source code and pre-trained models of our method are publicly available at https://jiunhwang.github.io/

Apparel dataset: http://www.kaggle.com/kaiska/apparel-dataset

Title: Multi-layered semantic representation network for multi-label image classification
Authors: Xiwen Qu
Hao Che
Jun Huang
Linchuan Xu
Xiao Zheng
Publication date: 11 May 2023
Publisher: Springer Berlin Heidelberg
Published in: International Journal of Machine Learning and Cybernetics / Issue 10/2023
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI: https://doi.org/10.1007/s13042-023-01841-6
