2022 | OriginalPaper | Book chapter

A Complementary Fusion Strategy for RGB-D Face Recognition

Authors: Haoyuan Zheng, Weihang Wang, Fei Wen, Peilin Liu

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

RGB-D face recognition (FR) with low-quality depth maps has recently come to play an important role in biometric identification. The intrinsic geometric properties and shape cues carried by depth information significantly improve the robustness of FR to illumination and pose variations. However, most existing multi-modal fusion methods lack the ability to learn complementary features and to establish correlations between different facial features. In this paper, we propose a Complementary Multi-Modal Fusion Transformer (CMMF-Trans) network that fuses the modalities in a complementary way while preserving modality-specific properties. In addition, the proposed tokenization and self-attention modules encourage the Transformer to capture long-range dependencies that supplement the local representations of face regions. We test our model on two public datasets, Lock3DFace and IIIT-D, which contain challenging variations in pose, occlusion, expression, and illumination. Our strategy achieves state-of-the-art performance on both. A further contribution of this work is a challenging new RGB-D FR dataset that covers additional difficult scenarios, such as mask occlusion and backlight shadow.
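The abstract does not specify how CMMF-Trans is built, so the following is only a generic illustration of the idea of complementary fusion that preserves modality-specific features. It is not the authors' architecture. It is a minimal sketch, assuming that each modality is already tokenized into equal-width feature vectors, that the two streams exchange information through plain scaled dot-product cross-attention without learned projections, and that a residual sum keeps each stream's own tokens. The function names `cross_attention` and `complementary_fuse` and the toy token values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Each query token attends over the tokens of the other modality.

    Assumption: no learned Q/K/V projections; keys double as values.
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(dim)]
        )
    return attended


def complementary_fuse(rgb_tokens, depth_tokens):
    """Hypothetical fusion step: each stream keeps its own tokens (residual)
    and adds what it gathered from the other modality."""
    rgb_from_depth = cross_attention(rgb_tokens, depth_tokens)
    depth_from_rgb = cross_attention(depth_tokens, rgb_tokens)
    rgb_out = [
        [own + other for own, other in zip(tok, ctx)]
        for tok, ctx in zip(rgb_tokens, rgb_from_depth)
    ]
    depth_out = [
        [own + other for own, other in zip(tok, ctx)]
        for tok, ctx in zip(depth_tokens, depth_from_rgb)
    ]
    return rgb_out, depth_out


# Toy tokens: 3 RGB patches and 2 depth patches, each a 2-dimensional feature.
rgb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
depth = [[0.5, 0.5], [1.0, 0.0]]
rgb_out, depth_out = complementary_fuse(rgb, depth)
```

Both output streams keep their original token counts, so a later stage could still process RGB and depth separately. A real model would add learned projections, multiple heads, and normalization.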

https://bat.sjtu.edu.cn/zh/smart-tof-face/.

Title: A Complementary Fusion Strategy for RGB-D Face Recognition
Authors: Haoyuan Zheng, Weihang Wang, Fei Wen, Peilin Liu
Publisher: Springer International Publishing
Book: MultiMedia Modeling
Print ISBN: 978-3-030-98357-4
Electronic ISBN: 978-3-030-98358-1
Copyright year: 2022
DOI: https://doi.org/10.1007/978-3-030-98358-1_27
