2016 | Original Paper | Book Chapter

Grid Loss: Detecting Occluded Faces

Authors: Michael Opitz, Georg Waltner, Georg Poier, Horst Possegger, Horst Bischof

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing

Abstract

Detection of partially occluded objects is a challenging computer vision problem. Standard Convolutional Neural Network (CNN) detectors fail if parts of the detection window are occluded, since not every sub-part of the window is discriminative on its own. To address this issue, we propose a novel loss layer for CNNs, named grid loss, which minimizes the error rate on sub-blocks of a convolution layer independently rather than over the whole feature map. This results in parts being more discriminative on their own, enabling the detector to recover if the detection window is partially occluded. Because our loss layer can be mapped back to a regular fully connected layer, it incurs no additional computational cost at runtime compared to standard CNNs. We demonstrate our method on several public face detection benchmarks and show that it outperforms regular CNNs, is suitable for real-time applications, and achieves state-of-the-art performance.
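The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the hinge loss, the non-overlapping 2 x 2 grid, the tensor sizes, and the helper names `split_blocks`, `grid_hinge_loss`, and `merge_to_fc` are all assumptions made for illustration. The sketch scores each sub-block of a feature map with its own linear classifier, sums the per-block losses, and then checks that the per-block weights can be scattered into a single fully connected weight vector whose score equals the sum of the block scores.

```python
import numpy as np


def split_blocks(fmap, n):
    """Split a (C, H, W) feature map into an n x n grid of flattened blocks."""
    c, h, w = fmap.shape
    assert h % n == 0 and w % n == 0, "grid must divide the map evenly"
    bh, bw = h // n, w // n
    return [fmap[:, i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].ravel()
            for i in range(n) for j in range(n)]


def grid_hinge_loss(fmap, label, weights, biases, n):
    """Sum of per-block hinge losses (assumed loss); label is -1 or +1."""
    blocks = split_blocks(fmap, n)
    scores = [w @ b + bias for w, b, bias in zip(weights, blocks, biases)]
    loss = sum(max(0.0, 1.0 - label * s) for s in scores)
    return loss, scores


def merge_to_fc(weights, biases, shape, n):
    """Scatter per-block weights into one weight vector over the full map."""
    c, h, w = shape
    bh, bw = h // n, w // n
    full = np.zeros(shape)
    k = 0
    for i in range(n):
        for j in range(n):
            full[:, i * bh:(i + 1) * bh, j * bw:(j + 1) * bw] = \
                weights[k].reshape(c, bh, bw)
            k += 1
    return full.ravel(), float(sum(biases))


# Hypothetical sizes: 4 channels, 6 x 6 map, 2 x 2 grid.
rng = np.random.default_rng(0)
C, H, W, N = 4, 6, 6, 2
fmap = rng.standard_normal((C, H, W))
weights = [rng.standard_normal(C * (H // N) * (W // N)) for _ in range(N * N)]
biases = [0.1 * k for k in range(N * N)]

loss, scores = grid_hinge_loss(fmap, +1, weights, biases, N)
w_fc, b_fc = merge_to_fc(weights, biases, fmap.shape, N)
fc_score = w_fc @ fmap.ravel() + b_fc

# One fully connected layer reproduces the summed block scores.
assert np.isclose(fc_score, sum(scores))
```

Under these assumptions, the per-block classifiers matter only during training. At test time the merged vector `w_fc` does the work of a single fully connected layer, which is consistent with the abstract's claim of no additional runtime cost.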

Metadata
Title
Grid Loss: Detecting Occluded Faces
Authors
Michael Opitz
Georg Waltner
Georg Poier
Horst Possegger
Horst Bischof
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-46487-9_24