2020 | Original Paper | Book chapter

GTFNet: Ground Truth Fitting Network for Crowd Counting

Written by: Jinghan Tan, Jun Sang, Zhili Xiang, Ying Shi, Xiaofeng Xia

Published in: Artificial Neural Networks and Machine Learning – ICANN 2020

Publisher: Springer International Publishing

Abstract

Crowd counting aims to estimate the number of pedestrians in a single image. Current crowd counting methods usually obtain the count by integrating an estimated density map. However, the label density map generated with a Gaussian kernel cannot accurately represent the ground truth in the corresponding crowd image, which degrades the final count. This paper proposes a ground truth fitting network, GTFNet, which aims to generate estimated density maps that fit the ground truth better. First, a VGG network combined with dilated convolutional layers serves as the backbone of GTFNet and extracts hierarchical features. The multi-level features are concatenated to compensate for the information loss caused by pooling operations, which may help the network capture texture and spatial information. Second, a regional consistency loss function is designed to compare the estimated density map with the label density map at different region levels. During training, region-level dynamic weights assign a suitable region fitting range to the network, which can effectively reduce the impact of label errors on the estimated density maps. Finally, GTFNet is evaluated on three crowd counting datasets (ShanghaiTech, UCF_CC_50 and UCF-QNRF). The experimental results demonstrate that GTFNet achieves excellent overall performance on all of these datasets.
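
The two ideas the abstract relies on, counting by integrating a Gaussian-kernel density map and comparing maps at region level rather than pixel level, can be illustrated with a minimal NumPy sketch. This is not the GTFNet loss from the paper. The function names, the kernel size and sigma, the grid levels (1, 2, 4) and the fixed weights are all assumptions made for illustration, and the paper's dynamic weighting scheme is not reproduced here.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a size x size Gaussian kernel normalised to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def label_density_map(shape, heads, size=15, sigma=4.0):
    """Place one unit-mass Gaussian at each annotated head position (row, col).

    size must be odd. Mass that falls outside the image is cropped away.
    """
    h, w = shape
    r = size // 2
    padded = np.zeros((h + 2 * r, w + 2 * r))
    kernel = gaussian_kernel(size, sigma)
    for row, col in heads:
        # In padded coordinates the kernel is centred on (row + r, col + r).
        padded[row:row + size, col:col + size] += kernel
    return padded[r:r + h, r:r + w]


def region_counts(density, grid):
    """Sum a density map over a grid x grid partition (sides must divide evenly)."""
    h, w = density.shape
    return density.reshape(grid, h // grid, grid, w // grid).sum(axis=(1, 3))


def regional_loss(estimated, label, grids=(1, 2, 4), weights=(1.0, 1.0, 1.0)):
    """Hypothetical region-level loss: weighted mean absolute difference of
    region counts at each grid level. Weights are fixed here, not dynamic."""
    total = 0.0
    for grid, weight in zip(grids, weights):
        diff = region_counts(estimated, grid) - region_counts(label, grid)
        total += weight * np.abs(diff).mean()
    return total


# Three hypothetical head annotations, all far enough from the border
# that no kernel mass is cropped.
heads = [(10, 10), (30, 40), (50, 20)]
label = label_density_map((64, 64), heads)
count = label.sum()  # integrating the density map recovers the head count

# A one-pixel horizontal shift stands in for a small annotation error.
shifted = np.roll(label, 1, axis=1)
pixel_error = np.abs(shifted - label).sum()
whole_image_error = regional_loss(shifted, label, grids=(1,), weights=(1.0,))
```

In this sketch the shifted map differs from the label map pixel by pixel, yet its whole-image count is unchanged, which is the intuition behind fitting at coarser region levels when annotations are imprecise. Finer grids penalise displacement more strongly, so the choice of levels and weights controls how much label error is tolerated.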

Metadata
Title
GTFNet: Ground Truth Fitting Network for Crowd Counting
Written by
Jinghan Tan
Jun Sang
Zhili Xiang
Ying Shi
Xiaofeng Xia
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-61609-0_19