2021 | OriginalPaper | Chapter

MSCANet: Adaptive Multi-scale Context Aggregation Network for Congested Crowd Counting

Authors: Yani Zhang, Huailin Zhao, Fangbo Zhou, Qing Zhang, Yanjiao Shi, Lanjun Liang

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

Crowd counting has achieved significant progress with deep convolutional neural networks. However, most existing methods do not fully utilize spatial context information, which makes it difficult for them to count congested crowds accurately. To this end, we propose a novel Adaptive Multi-scale Context Aggregation Network (MSCANet), in which a Multi-scale Context Aggregation module (MSCA) is designed to adaptively extract and aggregate contextual information from different scales of the crowd. More specifically, for each input, we first extract multi-scale context features via atrous convolution layers. Then, the multi-scale context features are progressively aggregated via channel attention to enrich the crowd representations at different scales. Finally, a \(1\times 1\) convolution layer is applied to regress the crowd density. We perform extensive experiments on three public datasets: ShanghaiTech Part_A, UCF_CC_50 and UCF-QNRF, and the experimental results demonstrate the superiority of our method over current state-of-the-art methods.
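The three stages named in the abstract (atrous convolution at several rates, channel-attention aggregation, and a \(1\times 1\) regression layer) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the dilation rates (1, 2, 3), the fixed averaging kernel, the weight-free sigmoid gate, and all function names (`atrous_conv2d`, `channel_attention`, `msca_sketch`, `regress_density`) are assumptions chosen only to make the data flow concrete. A real network would use learned weights in a deep-learning framework.

```python
import math

def atrous_conv2d(x, kernel, dilation):
    """3x3 dilated (atrous) convolution on one H x W map, zero-padded to keep size."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for ki in range(3):
                for kj in range(3):
                    # Kernel taps are spaced `dilation` pixels apart.
                    ii = i + (ki - 1) * dilation
                    jj = j + (kj - 1) * dilation
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[ki][kj] * x[ii][jj]
            out[i][j] = acc
    return out

def channel_attention(feats):
    """One gate per channel: sigmoid of the channel's global average.
    Assumption: no learned weights, unlike a trained attention block."""
    gates = []
    for ch in feats:
        mean = sum(map(sum, ch)) / (len(ch) * len(ch[0]))
        gates.append(1.0 / (1.0 + math.exp(-mean)))
    return gates

def msca_sketch(feats, kernel, dilations=(1, 2, 3)):
    """Extract one branch per dilation rate, then fold the branches together
    one at a time, scaling each new branch by its channel gates."""
    branches = [[atrous_conv2d(ch, kernel, d) for ch in feats] for d in dilations]
    agg = branches[0]
    for branch in branches[1:]:
        gates = channel_attention(branch)
        agg = [
            [[a + g * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
            for ch_a, ch_b, g in zip(agg, branch, gates)
        ]
    return agg

def regress_density(feats, weights):
    """1x1 convolution: per-pixel weighted sum over channels -> density map."""
    h, w = len(feats[0]), len(feats[0][0])
    return [
        [sum(wt * ch[i][j] for wt, ch in zip(weights, feats)) for j in range(w)]
        for i in range(h)
    ]

# Toy input: 2 channels of constant 8x8 features (hypothetical data).
mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
feats = [[[1.0] * 8 for _ in range(8)] for _ in range(2)]
agg = msca_sketch(feats, mean_kernel)
density = regress_density(agg, [0.5, 0.5])
count = sum(map(sum, density))  # predicted count = integral of the density map
```

The spatial size is preserved at every step, so the density map has the same resolution as the input features, and the estimated count is simply the sum of its entries.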

Metadata
Title
MSCANet: Adaptive Multi-scale Context Aggregation Network for Congested Crowd Counting
Authors
Yani Zhang
Huailin Zhao
Fangbo Zhou
Qing Zhang
Yanjiao Shi
Lanjun Liang
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-67835-7_1
