2019 | OriginalPaper | Book chapter

Real-Time Face Features Localization with Recurrent Refined Dense CNN Architectures

Written by: Nicolas Livet

Published in: Advances in Visual Computing

Publisher: Springer International Publishing

Abstract

Based on an innovative and efficient recurrent deep learning architecture, we present a highly stable and robust technique to localize face features on still images and on captured and live video sequences. This dense (fully convolutional) CNN architecture, referred to as the Refined Dense Mobilenet (RDM), is composed of (1) a main encoder-decoder block, which approximates face feature locations, and (2) a sequence of refiners, which robustly converge to the vicinity of the features. On video sequences, the architecture is adapted into a Recurrent RDM (R-RDM), in which a shape prior component is re-injected in the form of temporal heatmaps obtained from inference on the previous frame.
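To make the idea of temporal heatmaps more concrete, here is a minimal sketch of how landmarks from a previous frame could be rendered as Gaussian heatmaps and stacked with the current image as extra input channels. The abstract does not specify the heatmap shape, the resolution, or how the prior is injected, so the Gaussian rendering, the `sigma` value, the channel stacking, and all function names below are assumptions made purely for illustration, not the author's implementation.

```python
import math

def render_heatmap(cx, cy, width, height, sigma=1.5):
    """Render one 2D Gaussian heatmap (rows of floats) centred on (cx, cy).

    The Gaussian form and the sigma value are illustrative assumptions.
    """
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]

def decode_heatmap(heatmap):
    """Return the (x, y) position of the strongest response in a heatmap."""
    best_value, best_xy = float("-inf"), (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy

def build_recurrent_input(image_channels, previous_landmarks, width, height):
    """Stack image channels with one prior heatmap per previous-frame landmark.

    Hypothetical helper: the real network may inject the prior differently.
    """
    prior = [render_heatmap(x, y, width, height) for (x, y) in previous_landmarks]
    return image_channels + prior

# Usage: a 16x12 grayscale frame plus two landmarks found on the previous frame.
gray = [[0.0] * 16 for _ in range(12)]
channels = build_recurrent_input([gray], [(3, 4), (10, 7)], 16, 12)
```

In this sketch the network input for the current frame would have three channels: the image itself and one prior heatmap per landmark. `decode_heatmap` shows the reverse step, reading a landmark position back from a heatmap by taking its maximum.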

The accuracy and stability of the RDM/R-RDM architectures are compared with state-of-the-art Random Forest and CNN-based approaches. Combining a holistic feature localizer, which takes advantage of large receptive fields to minimize large errors, with refiners, which work at higher resolution to converge to the vicinity of each feature, yields high accuracy in localizing face features. We demonstrate that the RDM/R-RDM architectures improve localization scores on the 300W and AFLW datasets. Moreover, by relying on modern, efficient convolutional blocks and on our recurrent architecture, we deliver the first stable and accurate real-time implementation of face feature localization on low-end mobile devices.
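The abstract does not say which localization score is reported. A metric commonly used on landmark benchmarks of this kind is the normalized mean error (NME): the mean point-to-point distance between predicted and annotated landmarks, divided by a reference length such as the inter-ocular distance or the face bounding-box size. The sketch below assumes that metric; the function name and the choice of normalizing length are illustrative assumptions and may differ from the chapter's evaluation protocol.

```python
import math

def normalized_mean_error(predicted, ground_truth, norm_distance):
    """Mean Euclidean landmark error divided by a reference length.

    predicted, ground_truth: equal-length lists of (x, y) tuples.
    norm_distance: positive reference length (for example the inter-ocular
    distance); which length to use is an assumption left to the caller.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("landmark lists must be non-empty and of equal length")
    if norm_distance <= 0:
        raise ValueError("norm_distance must be positive")
    total = sum(
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    )
    return total / (len(predicted) * norm_distance)

# Usage: two landmarks, one exact and one 5 pixels off, normalized by 10 pixels.
score = normalized_mean_error([(0, 0), (3, 4)], [(0, 0), (0, 0)], 10.0)
```

Lower values are better; a score is typically averaged over all images of a test set before being compared across methods.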

Accompanying video available at https://vimeo.com/348063383.

Title: Real-Time Face Features Localization with Recurrent Refined Dense CNN Architectures
Written by: Nicolas Livet
Publisher: Springer International Publishing
Book: Advances in Visual Computing
Print ISBN: 978-3-030-33719-3
Electronic ISBN: 978-3-030-33720-9
Copyright year: 2019
DOI: https://doi.org/10.1007/978-3-030-33720-9_35
