2020 | OriginalPaper | Chapter

Simple Fine-Tuning Attention Modules for Human Pose Estimation

Authors : Tien-Dat Tran, Xuan-Thuy Vo, Moahamammad-Ashraf Russo, Kang-Hyun Jo

Published in: Advances in Computational Collective Intelligence

Publisher: Springer International Publishing

Abstract

Convolutional neural networks (CNNs) have achieved the best performance not only in human pose estimation but also in other computer vision tasks (e.g., object detection, semantic segmentation, image classification). This paper focuses on an attention module (AM) for feed-forward CNNs. First, the feature map produced by a block in the backbone network is fed into the attention module, which processes it along two separate dimensions, channel and spatial. The AM then combines the two resulting feature maps by multiplication and passes the result to the next block in the backbone. The network can thus capture both long-range dependencies (channel) and spatial information, which improves accuracy. Our experimental results illustrate the difference between using the attention module and existing methods. The predicted joint heatmaps maintain accuracy and are spatially better than those of the simple baseline. In addition, the proposed architecture gains 1.0 points in AP over the baseline. The proposed network is trained on the COCO 2017 benchmark, a readily accessible dataset.
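The data flow described in the abstract (a channel branch and a spatial branch whose outputs are combined by multiplication before the next backbone block) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify the internals of either branch, so the parameter-free gates used here (a sigmoid over the global average per channel, and a sigmoid over the mean across channels per location) are hypothetical stand-ins for whatever learned layers the chapter actually uses. The names `channel_gate`, `spatial_gate`, and `attention_module` are likewise invented for this sketch.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_gate(x):
    # Assumed channel branch: one weight per channel,
    # the sigmoid of that channel's global average.
    return [sigmoid(sum(map(sum, ch)) / (len(ch) * len(ch[0]))) for ch in x]


def spatial_gate(x):
    # Assumed spatial branch: one weight per (row, col) location,
    # the sigmoid of the mean across channels at that location.
    c, h, w = len(x), len(x[0]), len(x[0][0])
    return [[sigmoid(sum(x[k][i][j] for k in range(c)) / c) for j in range(w)]
            for i in range(h)]


def attention_module(x):
    # x is a feature map stored as nested lists with shape [C][H][W].
    # Both gates are applied by element-wise multiplication; the result
    # has the same shape and would be passed to the next backbone block.
    cg = channel_gate(x)
    sg = spatial_gate(x)
    c, h, w = len(x), len(x[0]), len(x[0][0])
    return [[[x[k][i][j] * cg[k] * sg[i][j] for j in range(w)]
             for i in range(h)]
            for k in range(c)]


# Toy 2-channel, 2x2 feature map: channel 0 is all zeros, channel 1 all ones.
features = [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]
refined = attention_module(features)
```

Because the output keeps the input shape, a module of this kind can be placed between existing backbone blocks without changing the layers around it.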

Metadata
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-63119-2_15
