Published in: Neural Computing and Applications 12/2019

18.06.2019 | Original Article

Context-aware attention network for image recognition

Authors: Jiaxu Leng, Ying Liu, Shang Chen


Abstract

Existing recognition methods based on deep learning have achieved impressive performance. However, most of these algorithms do not fully exploit contexts and discriminative parts, which limits recognition performance. In this paper, we propose a context-aware attention network that imitates the human visual attention mechanism. The proposed network consists mainly of a context learning module and an attention transfer module. First, the context learning module transmits contextual information in four directions (left, right, up and down) to capture valuable contexts. Second, the attention transfer module generates attention maps covering different attention regions, which benefits the extraction of discriminative features. Specifically, the attention maps are generated over multiple glimpses: in each glimpse, we generate the corresponding attention map and apply it to the next glimpse. Attention therefore shifts continually, and each shift is not random but closely tied to the previous attention. Finally, we consider all located attention regions to achieve accurate image recognition. Experimental results show that our method achieves state-of-the-art performance with 97.68%, 82.42%, 80.32% and 86.12% accuracy on CIFAR-10, CIFAR-100, Caltech-256 and CUB-200, respectively.
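
The full architecture is specified in the paper itself; as a rough illustration of the two ideas the abstract names, the following minimal PyTorch sketch pairs a four-direction context sweep with a glimpse-by-glimpse attention loop. Everything here is an assumption for illustration, not the authors' implementation: the module names (FourDirectionContext, GlimpseAttention), the choice of bidirectional GRUs over rows and columns to realize the left/right/up/down transmission, the hidden size, and the way each attention map conditions the next glimpse are all hypothetical.

# Illustrative sketch only -- not the authors' implementation.
import torch
import torch.nn as nn

class FourDirectionContext(nn.Module):
    """Bidirectional RNN sweeps over rows and columns, so each location
    receives context from its left, right, up and down neighbours."""
    def __init__(self, channels, hidden):
        super().__init__()
        self.row_rnn = nn.GRU(channels, hidden, batch_first=True, bidirectional=True)
        self.col_rnn = nn.GRU(channels, hidden, batch_first=True, bidirectional=True)
        self.fuse = nn.Conv2d(4 * hidden, channels, kernel_size=1)

    def forward(self, x):                                   # x: (B, C, H, W)
        b, c, h, w = x.shape
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)   # each row -> sequence
        row_ctx, _ = self.row_rnn(rows)                     # left + right passes
        row_ctx = row_ctx.reshape(b, h, w, -1)
        cols = x.permute(0, 3, 2, 1).reshape(b * w, h, c)   # each column -> sequence
        col_ctx, _ = self.col_rnn(cols)                     # up + down passes
        col_ctx = col_ctx.reshape(b, w, h, -1).permute(0, 2, 1, 3)
        ctx = torch.cat([row_ctx, col_ctx], dim=-1)         # (B, H, W, 4*hidden)
        return self.fuse(ctx.permute(0, 3, 1, 2))           # back to (B, C, H, W)

class GlimpseAttention(nn.Module):
    """Generates one attention map per glimpse; each map modulates the
    features seen by the next glimpse, so attention shifts rather than
    restarting from scratch."""
    def __init__(self, channels, n_glimpses=3):
        super().__init__()
        self.n_glimpses = n_glimpses
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):                               # feats: (B, C, H, W)
        b, c, h, w = feats.shape
        attn = torch.full((b, 1, h, w), 1.0 / (h * w), device=feats.device)
        pooled = []
        for _ in range(self.n_glimpses):
            logits = self.score(feats * (1.0 + attn))       # condition on last map
            attn = torch.softmax(logits.flatten(2), -1).view(b, 1, h, w)
            pooled.append((feats * attn).sum(dim=(2, 3)))   # attended descriptor
        return torch.cat(pooled, dim=1)                     # keep all glimpses

feats = torch.randn(2, 64, 14, 14)                          # e.g. backbone features
desc = GlimpseAttention(64)(FourDirectionContext(64, 32)(feats))  # (2, 192)

Concatenating the pooled features from every glimpse mirrors the abstract's final step of considering all located attention regions before classification.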


Metadata
Title
Context-aware attention network for image recognition
Authors
Jiaxu Leng
Ying Liu
Shang Chen
Publication date
18.06.2019
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 12/2019
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-019-04281-y
