2017 | OriginalPaper | Chapter

3. CMS-RCNN: Contextual Multi-Scale Region-Based CNN for Unconstrained Face Detection

Authors : Chenchen Zhu, Yutong Zheng, Khoa Luu, Marios Savvides

Published in: Deep Learning for Biometrics

Publisher: Springer International Publishing

Abstract

Robust face detection in the wild is a fundamental component supporting many face-related tasks, e.g., unconstrained face recognition, facial periocular recognition, facial landmarking and pose estimation, facial expression recognition, and 3D facial model construction. Although face detection has been studied intensively for decades and is used in various commercial applications, it still struggles in some real-world scenarios because of numerous challenges, e.g., heavy facial occlusion, extremely low resolution, strong illumination, exceptional pose variation, and image or video compression artifacts. In this paper, we present a face detection approach named Contextual Multi-Scale Region-based Convolution Neural Network (CMS-RCNN) that robustly addresses these problems. Like other region-based CNNs, the proposed network consists of a region proposal component and a region-of-interest (RoI) detection component. Unlike those networks, however, it makes two main contributions that play a significant role in achieving state-of-the-art face detection performance. First, multi-scale information is grouped in both region proposal and RoI detection to deal with tiny face regions. Second, the network allows explicit reasoning about body context, inspired by the intuition of the human visual system. The approach is benchmarked on two recent, challenging face detection databases: the WIDER FACE Dataset, which contains a high degree of variability, and the Face Detection Dataset and Benchmark (FDDB). The experimental results show that the proposed approach, trained on the WIDER FACE Dataset, outperforms strong baselines on WIDER FACE by a large margin and consistently achieves competitive results on FDDB against recent state-of-the-art face detection methods.
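The two ideas named in the abstract, grouping multi-scale information and reasoning about body context, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The abstract does not say how features are grouped or how a body region is obtained, so everything below is an assumption made for illustration: the L2 normalization before concatenation, the channel counts, the 7x7 grid, and the `widen` and `extend_down` ratios. The function names `l2_normalize`, `fuse_multiscale`, and `body_context_box` are hypothetical.

```python
import numpy as np


def l2_normalize(features, eps=1e-12):
    """Scale the channel vector at each spatial location to unit L2 norm.

    `features` has shape (channels, height, width).
    """
    norms = np.sqrt((features ** 2).sum(axis=0, keepdims=True))
    return features / (norms + eps)


def fuse_multiscale(pooled_maps, scale=1.0):
    """Concatenate same-sized feature maps along the channel axis.

    Normalizing first is an assumption of this sketch: it keeps a layer
    with large activations from dominating the fused descriptor.
    """
    normalized = [scale * l2_normalize(m) for m in pooled_maps]
    return np.concatenate(normalized, axis=0)


def body_context_box(face_box, widen=1.0, extend_down=3.0):
    """Derive a hypothetical body region from a face box (x1, y1, x2, y2).

    The ratios are illustrative placeholders, not values from the chapter.
    """
    x1, y1, x2, y2 = face_box
    w, h = x2 - x1, y2 - y1
    return (x1 - widen * w, y1, x2 + widen * w, y2 + extend_down * h)


rng = np.random.default_rng(0)
# Stand-ins for RoI-pooled features taken from three layers of different
# depth, already pooled to a common 7x7 grid. Magnitudes differ on purpose.
pooled = [
    rng.normal(size=(channels, 7, 7)) * magnitude
    for channels, magnitude in [(256, 10.0), (512, 1.0), (512, 0.1)]
]
fused = fuse_multiscale(pooled)
print(fused.shape)  # (1280, 7, 7)

# Context region for one face box; its features could be pooled the same way.
print(body_context_box((40, 30, 60, 50)))
```

In a sketch like this, the fused descriptor carries fine detail from shallow layers alongside semantics from deep layers, which is one plausible way to help with tiny faces, and the body region supplies evidence when the face itself is occluded or very small.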

Title: CMS-RCNN: Contextual Multi-Scale Region-Based CNN for Unconstrained Face Detection
Authors: Chenchen Zhu, Yutong Zheng, Khoa Luu, Marios Savvides
Publisher: Springer International Publishing
Book: Deep Learning for Biometrics
Print ISBN: 978-3-319-61656-8

Electronic ISBN: 978-3-319-61657-5

Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-61657-5_3
