Published in: International Journal of Computer Vision 6-7/2019

22-03-2019

Synthesis of High-Quality Visible Faces from Polarimetric Thermal Faces using Generative Adversarial Networks

Authors: He Zhang, Benjamin S. Riggan, Shuowen Hu, Nathaniel J. Short, Vishal M. Patel


Abstract

The large domain discrepancy between faces captured in the polarimetric (or conventional) thermal domain and the visible domain makes cross-domain face verification a highly challenging problem for human examiners as well as computer vision algorithms. Previous approaches utilize either a two-step procedure (visible feature estimation followed by visible image reconstruction) or an input-level fusion technique, in which the different Stokes images are concatenated and used as a multi-channel input to synthesize the visible image from the corresponding polarimetric signatures. Although these methods have yielded improvements, we argue that input-level fusion alone may not be sufficient to realize the full potential of the available Stokes images. We propose a generative adversarial network-based multi-stream feature-level fusion technique to synthesize high-quality visible images from polarimetric thermal images. The proposed network consists of a generator sub-network, constructed as an encoder–decoder network built from dense residual blocks, and a multi-scale discriminator sub-network. The generator is trained by optimizing an adversarial loss together with a perceptual loss and an identity-preserving loss, enabling photo-realistic generation of visible images while preserving discriminative characteristics. An extended dataset consisting of polarimetric thermal facial signatures of 111 subjects is also introduced. Experiments conducted under multiple protocols demonstrate that the proposed method achieves state-of-the-art performance. Code will be made available at https://github.com/hezhangsprinter.
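The contrast the abstract draws between input-level fusion (stacking Stokes images as channels) and multi-stream feature-level fusion (encoding each Stokes image separately and fusing the resulting feature maps) can be sketched with array shapes alone. The following is a minimal NumPy illustration under stated assumptions: the 64×64 image size is arbitrary, and the `encode` stand-in is a simple 2×2 average pooling, not the paper's dense-residual encoder streams.

```python
import numpy as np

# Hypothetical 64x64 Stokes images (S0, S1, S2) from a polarimetric sensor.
rng = np.random.default_rng(0)
s0, s1, s2 = (rng.random((64, 64)) for _ in range(3))

# Input-level fusion: stack the Stokes images as channels of one input
# and feed the multi-channel tensor to a single network.
input_fused = np.stack([s0, s1, s2], axis=0)          # shape (3, 64, 64)

def encode(x):
    """Stand-in per-stream encoder: 2x2 average pooling (illustrative only)."""
    return x.reshape(32, 2, 32, 2).mean(axis=(1, 3))  # shape (32, 32)

# Feature-level fusion: encode each stream separately (weights would not be
# shared across streams), then concatenate the per-stream feature maps.
features_fused = np.stack([encode(s) for s in (s0, s1, s2)], axis=0)  # (3, 32, 32)
```

The key difference is where the streams meet: at the raw pixels (input-level) versus after each stream has produced its own mid-level features (feature-level), which the paper argues better exploits the complementary information in the Stokes images.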


Footnotes
1
Input-level fusion can be regarded as an extreme case of low-level feature fusion, where low-level features (from shallow layers) typically preserve edge information rather than semantic mid- or high-level class-specific information (Zeiler and Fergus 2014).
 
2
Weights are not shared across streams.
 
3
The feature map size (width and height) at each level is the same.
 
4
This network consists of a single encoder stream followed by the same decoder, without multi-level pooling.
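The abstract states that the generator is trained with an adversarial loss plus a perceptual loss and an identity-preserving loss. A minimal NumPy sketch of such a weighted objective follows; the stand-in loss functions and the lambda weights are illustrative assumptions, not the paper's formulation (the paper computes the perceptual loss on deep features, e.g. VGG-style, and the identity loss on a face-recognition embedding, whereas here plain pixel statistics are used).

```python
import numpy as np

rng = np.random.default_rng(1)
fake = rng.random((3, 64, 64))   # synthesized visible image (stand-in)
real = rng.random((3, 64, 64))   # ground-truth visible image (stand-in)

def adversarial_loss(d_fake):
    """Generator-side GAN loss given the discriminator score D(G(x)) in (0, 1]."""
    return float(-np.log(d_fake + 1e-8))

def perceptual_loss(a, b):
    """L2 distance in a feature space (here: raw pixels, for illustration)."""
    return float(np.mean((a - b) ** 2))

def identity_loss(a, b):
    """L1 distance between identity embeddings (here: raw pixels)."""
    return float(np.mean(np.abs(a - b)))

# Weighted generator objective; lambda values are illustrative only.
lam_p, lam_i = 1.0, 0.5
total = (adversarial_loss(0.7)
         + lam_p * perceptual_loss(fake, real)
         + lam_i * identity_loss(fake, real))
```

Note how the adversarial term pushes outputs toward photo-realism while the perceptual and identity terms tie the synthesis to the target appearance and to discriminative facial characteristics.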
 
Literature
Berthelot, D., Schumm, T., & Metz, L. (2017). BEGAN: Boundary equilibrium generative adversarial networks. arXiv preprint arXiv:1703.10717.
Bodla, N., Zheng, J., Xu, H., Chen, J. C., Castillo, C., & Chellappa, R. (2017). Deep heterogeneous feature fusion for template-based face recognition. In 2017 IEEE winter conference on applications of computer vision (WACV) (pp. 586–595). IEEE.
Chen, J. C., Patel, V. M., & Chellappa, R. (2016). Unconstrained face verification using deep CNN features. In 2016 IEEE winter conference on applications of computer vision (WACV) (pp. 1–9). IEEE.
Chen, X., Flynn, P. J., & Bowyer, K. W. (2005). IR and visible light face recognition. Computer Vision and Image Understanding, 99(3), 332–358.
Di, X., Zhang, H., & Patel, V. M. (2019). Polarimetric thermal to visible face verification via attribute preserved synthesis. arXiv preprint arXiv:1901.00889.
Ding, H., Zhou, S. K., & Chellappa, R. (2017). FaceNet2ExpNet: Regularizing a deep face recognition net for expression recognition. In 2017 12th IEEE international conference on automatic face and gesture recognition (FG 2017) (pp. 118–126). IEEE.
Espinosa-Duró, V., Faundez-Zanuy, M., & Mekyska, J. (2013). A new face database simultaneously acquired in visible, near-infrared and thermal spectrums. Cognitive Computation, 5(1), 119–135.
Gonzalez-Sosa, E., Vera-Rodriguez, R., Fierrez, J., & Patel, V. M. (2017a). Exploring body shape from mmW images for person recognition. IEEE Transactions on Information Forensics and Security, 12(9), 2078–2089.
Gonzalez-Sosa, E., Vera-Rodriguez, R., Fierrez, J., & Patel, V. M. (2017b). Millimetre wave person recognition: Hand-crafted vs. learned features. In ISBA (pp. 1–7).
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., et al. (2014). Generative adversarial nets. In NIPS (pp. 2672–2680).
Gurton, K. P., Yuffa, A. J., & Videen, G. W. (2014). Enhanced facial recognition for thermal imagery using polarimetric imaging. Optics Letters, 39(13), 3857–3859.
He, R., Cao, J., Song, L., Sun, Z., & Tan, T. (2019). Cross-spectral face completion for NIR-VIS heterogeneous face recognition. arXiv preprint arXiv:1902.03565.
He, R., Wu, X., Sun, Z., & Tan, T. (2017). Wasserstein CNN: Learning invariant features for NIR-VIS face recognition. arXiv preprint arXiv:1708.02412.
Hu, S., Choi, J., Chan, A. L., & Schwartz, W. R. (2015). Thermal-to-visible face recognition using partial least squares. JOSA A, 32(3), 431–442.
Hu, S., Short, N. J., Riggan, B. S., Gordon, C., Gurton, K. P., Thielke, M., Gurram, P., & Chan, A. L. (2016). A polarimetric thermal database for face recognition research. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops (pp. 119–126).
Huang, G., Liu, Z., Weinberger, K. Q., & van der Maaten, L. (2016). Densely connected convolutional networks. arXiv preprint arXiv:1608.06993.
Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd international conference on machine learning (pp. 448–456).
Iranmanesh, S. M., Dabouei, A., Kazemi, H., & Nasrabadi, N. M. (2018). Deep cross polarimetric thermal-to-visible face recognition. ArXiv e-prints.
Isola, P., Zhu, J. Y., Zhou, T., & Efros, A. A. (2016). Image-to-image translation with conditional adversarial networks. In 2017 IEEE conference on computer vision and pattern recognition (CVPR).
Jetchev, N., Bergmann, U., & Vollgraf, R. (2016). Texture synthesis with spatial generative adversarial networks. arXiv preprint arXiv:1611.08207.
Johnson, J., Alahi, A., & Fei-Fei, L. (2016). Perceptual losses for real-time style transfer and super-resolution. In European conference on computer vision (pp. 694–711). Springer.
Karacan, L., Akata, Z., Erdem, A., & Erdem, E. (2016). Learning to generate images of outdoor scenes from attributes and semantic layouts. arXiv preprint arXiv:1612.00215.
Klare, B., & Jain, A. K. (2010). Heterogeneous face recognition: Matching NIR to visible light images. In ICPR (pp. 1513–1516).
Klare, B. F., & Jain, A. K. (2013). Heterogeneous face recognition using kernel prototype similarities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(6), 1410–1422.
Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., et al. (2017). Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1–8).
Lezama, J., Qiu, Q., & Sapiro, G. (2017). Not afraid of the dark: NIR-VIS face recognition via cross-spectral hallucination and low-rank embedding. In 2017 IEEE conference on computer vision and pattern recognition (CVPR) (pp. 6807–6816). IEEE.
Li, S., Yi, D., Lei, Z., & Liao, S. (2013). The CASIA NIR-VIS 2.0 face database. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops (pp. 348–353).
Maas, A. L., Hannun, A. Y., & Ng, A. Y. (2013). Rectifier nonlinearities improve neural network acoustic models. In ICML workshop on deep learning for audio, speech and language processing.
Mahendran, A., & Vedaldi, A. (2015). Understanding deep image representations by inverting them. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5188–5196).
Meyers, E., & Wolf, L. (2008). Using biologically inspired features for face processing. International Journal of Computer Vision, 76(1), 93–104.
Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th international conference on machine learning (ICML-10) (pp. 807–814).
Nicolo, F., & Schmid, N. A. (2012). Long range cross-spectral face recognition: Matching SWIR against visible light images. IEEE Transactions on Information Forensics and Security, 7(6), 1717–1726.
Parkhi, O. M., Vedaldi, A., & Zisserman, A. (2015). Deep face recognition. In Proceedings of the British machine vision conference (BMVC).
Peng, C., Gao, X., Wang, N., Tao, D., Li, X., & Li, J. (2016). Multiple representations-based face sketch-photo synthesis. IEEE Transactions on Neural Networks and Learning Systems, 27(11), 2201–2215.
Peng, X., Feris, R. S., Wang, X., & Metaxas, D. N. (2016). A recurrent encoder–decoder network for sequential face alignment. In European conference on computer vision (pp. 38–56). Springer International Publishing.
Peng, X., Tang, Z., Yang, F., Feris, R., & Metaxas, D. (2018). Jointly optimize data augmentation and network training: Adversarial data augmentation in human pose estimation. arXiv preprint arXiv:1805.09707.
Peng, X., Yu, X., Sohn, K., Metaxas, D. N., & Chandraker, M. (2017). Reconstruction-based disentanglement for pose-invariant face recognition. In Proceedings of the IEEE international conference on computer vision.
Perera, P., Abavisani, M., & Patel, V. M. (2017). In2I: Unsupervised multi-image-to-image translation using generative adversarial networks. arXiv preprint arXiv:1711.09334.
Radford, A., Metz, L., & Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
Ranjan, R., Sankaranarayanan, S., Castillo, C. D., & Chellappa, R. (2017). An all-in-one convolutional neural network for face analysis. In 2017 12th IEEE international conference on automatic face and gesture recognition (FG 2017) (pp. 17–24). IEEE.
Riggan, B. S., Short, N. J., & Hu, S. (2016a). Optimal feature learning and discriminative framework for polarimetric thermal to visible face recognition. In 2016 IEEE winter conference on applications of computer vision (WACV) (pp. 1–7). IEEE.
Riggan, B. S., Short, N. J., Hu, S., & Kwon, H. (2016b). Estimation of visible spectrum faces from polarimetric thermal faces. In 2016 IEEE 8th international conference on biometrics theory, applications and systems (BTAS) (pp. 1–7). IEEE.
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., & Chen, X. (2016). Improved techniques for training GANs. In NIPS (pp. 2226–2234).
Sarfraz, M. S., & Stiefelhagen, R. (2015). Deep perceptual mapping for thermal to visible face recognition. arXiv preprint arXiv:1507.02879.
Sarfraz, M. S., & Stiefelhagen, R. (2017). Deep perceptual mapping for cross-modal face recognition. International Journal of Computer Vision, 122(3), 426–438.
Schroff, F., Kalenichenko, D., & Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 815–823).
Sindagi, V. A., & Patel, V. M. (2017). Generating high-quality crowd density maps using contextual pyramid CNNs. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1861–1870).
Song, L., Zhang, M., Wu, X., & He, R. (2018). Adversarial discriminative heterogeneous face recognition. In AAAI.
Sun, Y., Chen, Y., Wang, X., & Tang, X. (2014). Deep learning face representation by joint identification-verification. In Advances in neural information processing systems (pp. 1988–1996).
Tran, L., Yin, X., & Liu, X. (2017). Disentangled representation learning GAN for pose-invariant face recognition. In Proceedings of IEEE computer vision and pattern recognition (CVPR).
Tyo, J. S., Goldstein, D. L., Chenault, D. B., & Shaw, J. A. (2006). Review of passive imaging polarimetry for remote sensing applications. Applied Optics, 45(22), 5453–5469.
Wang, L., Sindagi, V. A., & Patel, V. M. (2018). High-quality facial photo-sketch synthesis using multi-adversarial networks. In IEEE international conference on automatic face and gesture recognition.
Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612.
Wu, X., He, R., Sun, Z., & Tan, T. (2018). A light CNN for deep face representation with noisy labels. IEEE Transactions on Information Forensics and Security, 13(11), 2884–2896.
Wu, X., Huang, H., Patel, V. M., He, R., & Sun, Z. (2018). Disentangled variational representation for heterogeneous face recognition. arXiv preprint arXiv:1809.01936.
Xie, S., & Tu, Z. (2015). Holistically-nested edge detection. In Proceedings of the IEEE international conference on computer vision (pp. 1395–1403).
Xu, H., Zheng, J., Alavi, A., & Chellappa, R. (2016). Learning a structured dictionary for video-based face recognition. In 2016 IEEE winter conference on applications of computer vision (WACV) (pp. 1–9). IEEE.
Xu, T., Zhang, P., Huang, Q., Zhang, H., Gan, Z., Huang, X., & He, X. (2017). AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks. arXiv preprint arXiv:1711.10485.
Xu, Z., Yang, X., Li, X., Sun, X., & Harbin, P. R. (2018). Strong baseline for single image dehazing with deep features and instance normalization. In BMVC (Vol. 2, p. 5).
Yang, J., Ren, P., Zhang, D., Chen, D., Wen, F., Li, H., & Hua, G. (2017). Neural aggregation network for video face recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4362–4371).
Yang, X., Xu, Z., & Luo, J. (2018). Towards perceptual image dehazing by physics-based disentanglement and adversarial training. In Thirty-second AAAI conference on artificial intelligence.
Yi, D., Lei, Z., & Li, S. Z. (2015). Shared representation learning for heterogeneous face recognition. In 2015 11th IEEE international conference and workshops on automatic face and gesture recognition (FG) (Vol. 1, pp. 1–7). IEEE.
Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., & Huang, T. S. (2018). Generative image inpainting with contextual attention. arXiv preprint arXiv:1801.07892.
Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In European conference on computer vision (pp. 818–833). Springer.
Zhang, H., Dana, K., Shi, J., Zhang, Z., Wang, X., Tyagi, A., & Agrawal, A. (2018). Context encoding for semantic segmentation. In The IEEE conference on computer vision and pattern recognition (CVPR).
Zhang, H., & Patel, V. M. (2018). Densely connected pyramid dehazing network. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3194–3203).
Zhang, H., Patel, V. M., Riggan, B. S., & Hu, S. (2017a). Generative adversarial network-based synthesis of visible faces from polarimetric thermal faces. In International joint conference on biometrics 2017.
Zhang, H., Sindagi, V., & Patel, V. M. (2017b). Image de-raining using a conditional generative adversarial network. arXiv preprint arXiv:1701.05957.
Zhang, Z., Yang, L., & Zheng, Y. (2018). Translating and segmenting multimodal medical volumes with cycle- and shape-consistency generative adversarial network. arXiv preprint arXiv:1802.09655.
Zhao, H., Shi, J., Qi, X., Wang, X., & Jia, J. (2017). Pyramid scene parsing network. In Proceedings of the IEEE international conference on computer vision (pp. 1–8).
Zhu, Y., Elhoseiny, M., Liu, B., & Elgammal, A. (2017). Imagine it for me: Generative adversarial approach for zero-shot learning from noisy texts. arXiv preprint arXiv:1712.01381.
Metadata
Title
Synthesis of High-Quality Visible Faces from Polarimetric Thermal Faces using Generative Adversarial Networks
Authors
He Zhang
Benjamin S. Riggan
Shuowen Hu
Nathaniel J. Short
Vishal M. Patel
Publication date
22-03-2019
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 6-7/2019
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-019-01175-3
