Published in: Cognitive Neurodynamics 1/2021

13 July 2020 | Research Article

End-to-end face parsing via interlinked convolutional neural networks

Authors: Zi Yin, Valentin Yiu, Xiaolin Hu, Liang Tang



Abstract

Face parsing is an important computer vision task that requires accurate pixel-level segmentation of facial parts (such as the eyes, nose, and mouth), providing a basis for further face analysis, modification, and other applications. Interlinked Convolutional Neural Networks (iCNN) have been shown to be an effective two-stage model for face parsing. However, the two stages of the original iCNN were trained separately, which limits its performance. To solve this problem, we introduce a simple, end-to-end face parsing framework, STN-aided iCNN (STN-iCNN), which extends the iCNN by adding a Spatial Transformer Network (STN) between the two isolated stages. The STN provides a trainable connection between the stages of the original iCNN pipeline, making end-to-end joint training possible. As a by-product, the STN also yields more precise cropped parts than the original cropper. Owing to these two advantages, our approach significantly improves the accuracy of the original model. Our model achieves competitive performance on the Helen dataset, the standard face parsing dataset, and superior performance on the CelebAMask-HQ dataset, demonstrating good generalization. Our code has been released at https://github.com/aod321/STN-iCNN.
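
To illustrate why an STN can act as a trainable connection between the two stages, the following is a minimal sketch of the cropping operation a spatial transformer performs: a 2x3 affine matrix maps a regular output grid onto the input image, and the patch is read off by bilinear interpolation. Because each output value is a smooth (piecewise linear) function of the matrix entries, gradients can in principle flow from the second stage back to whatever network predicts the matrix. This is not the authors' implementation; the function name `affine_crop`, the zero padding, and the corner-aligned coordinate convention are assumptions made for illustration, and the parameterization used in STN-iCNN is not specified in this abstract.

```python
import math
from typing import List, Sequence

Image = List[List[float]]


def affine_crop(image: Image, theta: Sequence[Sequence[float]],
                out_h: int, out_w: int) -> Image:
    """Sample an out_h x out_w patch from `image` with a 2x3 affine matrix.

    `theta` maps normalized output coordinates (x, y in [-1, 1]) to
    normalized input coordinates (hypothetical helper, illustration only).
    """
    in_h, in_w = len(image), len(image[0])

    def pixel(r: int, c: int) -> float:
        # Assumption: positions outside the image contribute zero.
        if 0 <= r < in_h and 0 <= c < in_w:
            return image[r][c]
        return 0.0

    def norm(k: int, n: int) -> float:
        # Index k of n samples mapped to [-1, 1]; a single sample sits at 0.
        return -1.0 + 2.0 * k / (n - 1) if n > 1 else 0.0

    out: Image = []
    for i in range(out_h):
        y_t = norm(i, out_h)
        row = []
        for j in range(out_w):
            x_t = norm(j, out_w)
            # Affine map from output grid to input (source) coordinates.
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            # Normalized coordinates to pixel coordinates.
            px = (x_s + 1.0) * (in_w - 1) / 2.0
            py = (y_s + 1.0) * (in_h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            wx, wy = px - x0, py - y0
            # Bilinear interpolation over the four neighbouring pixels.
            row.append((1 - wx) * (1 - wy) * pixel(y0, x0)
                       + wx * (1 - wy) * pixel(y0, x0 + 1)
                       + (1 - wx) * wy * pixel(y0 + 1, x0)
                       + wx * wy * pixel(y0 + 1, x0 + 1))
        out.append(row)
    return out


# Toy 5x5 "image" whose pixel value encodes its position.
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]

# Scale 0.5 and shift 0.5 on both axes selects the lower-right quadrant.
theta_crop = [[0.5, 0.0, 0.5],
              [0.0, 0.5, 0.5]]
print(affine_crop(image, theta_crop, 3, 3))
# [[12.0, 13.0, 14.0], [17.0, 18.0, 19.0], [22.0, 23.0, 24.0]]
```

In a deep learning framework the same operation would be expressed with differentiable tensor operations, so that the crop parameters can be learned jointly with the segmentation networks rather than fixed by a separate cropper.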


Metadata
Title
End-to-end face parsing via interlinked convolutional neural networks
Authors
Zi Yin
Valentin Yiu
Xiaolin Hu
Liang Tang
Publication date
13 July 2020
Publisher
Springer Netherlands
Published in
Cognitive Neurodynamics / Issue 1/2021
Print ISSN: 1871-4080
Electronic ISSN: 1871-4099
DOI
https://doi.org/10.1007/s11571-020-09615-4
