Skip to main content
Top

2020 | OriginalPaper | Chapter

Multi-scale Attention Consistency for Multi-label Image Classification

Authors : Haotian Xu, Xiaobo Jin, Qiufeng Wang, Kaizhu Huang

Published in: Neural Information Processing

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Human has well demonstrated its cognitive consistency over image transformations such as flipping and scaling. In order to learn from human’s visual perception consistency, researchers find out that convolutional neural network’s capacity of discernment can be further elevated via forcing the network to concentrate on certain area in the picture in accordance with the human natural visual perception. Attention heatmap, as a supplementary tool to reveal the essential region that the network chooses to focus on, has been developed and widely adopted by CNNs. Based on this regime of visual consistency, we propose a novel end-to-end trainable CNN architecture with multi-scale attention consistency. Specifically, our model takes an original picture and its flipped counterpart as inputs, and then send them into a single standard Resnet with additional attention-enhanced modules to generate a semantically strong attention heatmap. We also compute the distance between multi-scale attention heatmaps of these two pictures and take it as an additional loss to help the network achieve better performance. Our network shows superiority on the multi-label classification task and attains compelling results on the WIDER Attribute Dataset.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Zhang, M.L., Zhou, Z.H.: A review on multi-label learning algorithms. IEEE Trans. Knowl. Data Eng. 26(8), 1819–1837 (2013)CrossRef Zhang, M.L., Zhou, Z.H.: A review on multi-label learning algorithms. IEEE Trans. Knowl. Data Eng. 26(8), 1819–1837 (2013)CrossRef
3.
go back to reference Gao, Z., Liu, D., Huang, K., Huang, Y.: Context-aware human activity and smartphone position-mining with motion sensors. Remote Sens. 11(21), 2531 (2019)CrossRef Gao, Z., Liu, D., Huang, K., Huang, Y.: Context-aware human activity and smartphone position-mining with motion sensors. Remote Sens. 11(21), 2531 (2019)CrossRef
4.
go back to reference Lavie, N.: Distracted and confused? Selective attention under load. Trends Cogn. Sci. 9(2), 75–82 (2005)CrossRef Lavie, N.: Distracted and confused? Selective attention under load. Trends Cogn. Sci. 9(2), 75–82 (2005)CrossRef
5.
go back to reference Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In Proceedings of the CVPR, pp. 2921–2929 (2016) Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In Proceedings of the CVPR, pp. 2921–2929 (2016)
6.
go back to reference Desimone, R., Duncan, J.: Neural mechanisms of selective visual attention. Ann. Rev. Neurosci. 18(1), 193–222 (1995)CrossRef Desimone, R., Duncan, J.: Neural mechanisms of selective visual attention. Ann. Rev. Neurosci. 18(1), 193–222 (1995)CrossRef
7.
go back to reference Guo, H., Zheng, K., Fan, X., Yu, H., Wang, S.: Visual attention consistency under image transforms for multi-label image classification. In: Proceedings of the CVPR, pp. 729–739 (2019) Guo, H., Zheng, K., Fan, X., Yu, H., Wang, S.: Visual attention consistency under image transforms for multi-label image classification. In: Proceedings of the CVPR, pp. 729–739 (2019)
8.
go back to reference Stollenga, M.F., Masci, J., Gomez, F., Schmidhuber, J.: Deep networks with internal selective attention through feedback connections. In: Advances in Neural Information Processing Systems, pp. 3545–3553 (2014) Stollenga, M.F., Masci, J., Gomez, F., Schmidhuber, J.: Deep networks with internal selective attention through feedback connections. In: Advances in Neural Information Processing Systems, pp. 3545–3553 (2014)
9.
go back to reference Li, X., Zhao, F., Guo, Y.: Multi-label image classification with a probabilistic label enhancement model. In: UAI, vol. 1, p. 3 (2014) Li, X., Zhao, F., Guo, Y.: Multi-label image classification with a probabilistic label enhancement model. In: UAI, vol. 1, p. 3 (2014)
10.
go back to reference Cabral, R., De la Torre, F., Costeira, J.P., Bernardino, A.: Matrix completion for weakly-supervised multi-label image classification. IEEE Trans. Pattern Anal. Mach. Intell. 37(1), 121–135 (2014)CrossRef Cabral, R., De la Torre, F., Costeira, J.P., Bernardino, A.: Matrix completion for weakly-supervised multi-label image classification. IEEE Trans. Pattern Anal. Mach. Intell. 37(1), 121–135 (2014)CrossRef
11.
go back to reference Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the CVPR, pp. 7132–7141 (2018) Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the CVPR, pp. 7132–7141 (2018)
13.
go back to reference He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the CVPR, pp. 770–778 (2016) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the CVPR, pp. 770–778 (2016)
14.
go back to reference Dembczynski, K., Kotlowski, W., Hüllermeier, E.: Consistent multilabel ranking through univariate losses. arXiv preprint arXiv:1206.6401 (2012) Dembczynski, K., Kotlowski, W., Hüllermeier, E.: Consistent multilabel ranking through univariate losses. arXiv preprint arXiv:​1206.​6401 (2012)
15.
go back to reference Li, D., Chen, X., Huang, K.: Multi-attribute learning for pedestrian attribute recognition in surveillance scenarios. In: 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR), pp. 111–115 (2015) Li, D., Chen, X., Huang, K.: Multi-attribute learning for pedestrian attribute recognition in surveillance scenarios. In: 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR), pp. 111–115 (2015)
17.
go back to reference Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: Proceedings of the CVPR, pp. 248–255(2009) Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: Proceedings of the CVPR, pp. 248–255(2009)
19.
go back to reference Guo, H., Fan, X., Wang, S.: Human attribute recognition by refining attention heat map. Pattern Recogn. Lett. 94, 38–45 (2017)CrossRef Guo, H., Fan, X., Wang, S.: Human attribute recognition by refining attention heat map. Pattern Recogn. Lett. 94, 38–45 (2017)CrossRef
20.
go back to reference Zhu, F., Li, H., Ouyang, W., Yu, N., Wang, X.: Learning spatial regularization with image-level supervisions for multi-label image classification. In: Proceedings of the CVPR, pp. 5513–5522 (2017) Zhu, F., Li, H., Ouyang, W., Yu, N., Wang, X.: Learning spatial regularization with image-level supervisions for multi-label image classification. In: Proceedings of the CVPR, pp. 5513–5522 (2017)
21.
go back to reference Sarafianos, N., Xu, X., Kakadiaris, I.A.: Deep imbalanced attribute classification using visual attention aggregation. In: Proceedings of the ECCV, pp. 680–697 (2018) Sarafianos, N., Xu, X., Kakadiaris, I.A.: Deep imbalanced attribute classification using visual attention aggregation. In: Proceedings of the ECCV, pp. 680–697 (2018)
22.
go back to reference Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015) Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
Metadata
Title
Multi-scale Attention Consistency for Multi-label Image Classification
Authors
Haotian Xu
Xiaobo Jin
Qiufeng Wang
Kaizhu Huang
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-63820-7_93

Premium Partner