From Open Set to Closed Set: Supervised Spatial Divide-and-Conquer for Object Counting

Authors: Haipeng Xiong, Hao Lu, Chengxin Liu, Liang Liu, Chunhua Shen, Zhiguo Cao

Published in: International Journal of Computer Vision | Issue 7/2023

Publication date: 1 April 2023

Abstract

Visual counting, a task that aims to estimate the number of objects in an image or video, is an open-set problem by nature, as the population size can in theory vary in \([0,+\infty )\). In reality, however, collected data are limited, which means that only a closed set is observed. Existing methods typically model this task through regression, but they are prone to fail on unseen scenes whose counts fall outside the scope of the closed set. In fact, counting has an interesting and distinctive property: it is spatially decomposable. A dense region can always be divided until sub-region counts are within the previously observed closed set. We therefore introduce the idea of spatial divide-and-conquer (S-DC), which transforms open-set counting into a closed-set problem. This idea is implemented by a novel Supervised Spatial Divide-and-Conquer Network (SS-DCNet). It can learn from a closed set but generalize to open-set scenarios via S-DC. We provide mathematical analyses and a controlled experiment on synthetic data, demonstrating why closed-set modeling works well. Experiments show that SS-DCNet achieves state-of-the-art performance in crowd counting, vehicle counting and plant counting. SS-DCNet also demonstrates superior transferability under the cross-dataset setting. Code and models are available at: https://git.io/SS-DCNet.
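The decomposability property described in the abstract can be illustrated with a small toy sketch. This is not SS-DCNet itself: the function names `make_closed_set_counter` and `sdc_count`, the saturating toy counter, the split-in-halves rule, and the "divide when the estimate reaches `c_max`" criterion are all assumptions made for illustration. The sketch only shows why a counter that is reliable on a closed set \([0, C_{max}]\) can still recover a larger total once a region is divided far enough.

```python
import numpy as np


def make_closed_set_counter(c_max):
    """Hypothetical toy counter: exact for counts below c_max, saturated above.

    It stands in for a model that has only seen counts in [0, c_max].
    """
    def counter(region):
        return min(int(region.sum()), c_max)
    return counter


def sdc_count(region, c_max, counter):
    """Count objects in `region` by spatial divide-and-conquer (toy version).

    region : 2-D integer array with one entry per location (0 or 1 object here).
    c_max  : largest count of the assumed closed set.
    """
    estimate = counter(region)
    h, w = region.shape
    # Conquer: an estimate below c_max is not saturated, so it is trusted.
    # A single cell cannot be divided any further.
    if estimate < c_max or h * w == 1:
        return estimate
    # Divide: split the longer side in half and count each part separately.
    if h >= w:
        mid = h // 2
        parts = (region[:mid, :], region[mid:, :])
    else:
        mid = w // 2
        parts = (region[:, :mid], region[:, mid:])
    return sum(sdc_count(part, c_max, counter) for part in parts)


# Usage: 16 objects on an 8x8 grid, but the counter only handles up to 5.
grid = np.zeros((8, 8), dtype=int)
grid[::2, ::2] = 1
counter = make_closed_set_counter(5)
print(counter(grid))               # direct estimate saturates at 5
print(sdc_count(grid, 5, counter)) # divide-and-conquer recovers 16
```

In this toy setting every leaf region either holds fewer than `c_max` objects or is a single cell, so the summed estimate equals the true count. A learned model would replace both the counter and the decision of when to divide.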

Title: From Open Set to Closed Set: Supervised Spatial Divide-and-Conquer for Object Counting
Authors: Haipeng Xiong, Hao Lu, Chengxin Liu, Liang Liu, Chunhua Shen, Zhiguo Cao
Publication date: 1 April 2023
Publisher: Springer US
Published in: International Journal of Computer Vision / Issue 7/2023
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI: https://doi.org/10.1007/s11263-023-01782-1
