
2016 | Original Paper | Book Chapter

Learning to Count with CNN Boosting

Written by: Elad Walach, Lior Wolf

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing

Abstract

In this paper, we address the task of object counting in images. We follow modern learning approaches in which a density map is estimated directly from the input image. We employ CNNs and incorporate two significant improvements to state-of-the-art methods: layered boosting and selective sampling. As a result, we manage both to increase the counting accuracy and to reduce processing time. Moreover, we show that the proposed method is effective even in the presence of labeling errors. Extensive experiments on five different datasets demonstrate the efficacy and robustness of our approach. The mean absolute error was reduced by 20% to 35%. At the same time, the training time of each CNN was reduced by 50%.
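
The abstract names its ingredients (density-map counting, layered boosting, selective sampling, mean absolute error) without detailing them. The following is a minimal sketch of how these pieces could fit together, not the authors' implementation. It assumes that boosting means each stage fits the residual left by the earlier stages, and that selective sampling means dropping training samples whose current error is very low or very high. A least-squares regressor stands in for each CNN, and all function names and the quantile thresholds are hypothetical.

```python
import numpy as np


def count_from_density(density_map):
    """The object count is the sum (discrete integral) of the density map."""
    return float(np.sum(density_map))


def mean_absolute_error(true_counts, predicted_counts):
    """Average absolute difference between true and predicted counts."""
    true_counts = np.asarray(true_counts, dtype=float)
    predicted_counts = np.asarray(predicted_counts, dtype=float)
    return float(np.mean(np.abs(true_counts - predicted_counts)))


def _with_bias(features):
    """Append a constant column so the linear stand-in can learn an offset."""
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_stage(features, targets):
    """Assumption: a least-squares fit replaces training one CNN stage."""
    weights, *_ = np.linalg.lstsq(_with_bias(features), targets, rcond=None)
    return weights


def predict_stage(weights, features):
    return _with_bias(features) @ weights


def select_samples(errors, low_q=0.1, high_q=0.9):
    """Keep samples whose error lies between two quantiles.

    Hypothetical thresholds: very low errors are treated as already learned,
    very high errors as possible labeling mistakes.
    """
    low, high = np.quantile(errors, [low_q, high_q])
    return (errors >= low) & (errors <= high)


def fit_boosted(features, targets, n_stages=3):
    """Stagewise fitting: each stage is trained on the current residual."""
    stages = []
    prediction = np.zeros_like(targets, dtype=float)
    keep = np.ones(len(targets), dtype=bool)
    for _ in range(n_stages):
        residual = targets - prediction
        weights = fit_stage(features[keep], residual[keep])
        stages.append(weights)
        prediction = prediction + predict_stage(weights, features)
        # Re-select the training subset for the next stage.
        keep = select_samples(np.abs(targets - prediction))
    return stages


def predict_boosted(stages, features):
    """The ensemble output is the sum of all stage outputs."""
    return sum(predict_stage(weights, features) for weights in stages)
```

With linear stand-ins the later stages add little, because a linear model has nothing left to explain in its own residual; the sketch is meant to show the control flow only. With nonlinear stages such as CNNs, each stage can correct errors that the earlier ones left behind.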

Metadata
Title
Learning to Count with CNN Boosting
Written by
Elad Walach
Lior Wolf
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-46475-6_41
