FusionGAN: A generative adversarial network for infrared and visible image fusion
Introduction
Image fusion is an enhancement technique that aims to combine images obtained by different kinds of sensors to generate a robust or informative image that can facilitate subsequent processing or help in decision making [1], [2]. In particular, multi-sensor data such as thermal infrared and visible images have been used to enhance performance in terms of human visual perception, object detection, and target recognition [3]. Infrared images capture thermal radiation, whereas visible images capture reflected light. These two types of images thus provide scene information from different aspects with complementary properties, which are inherent in nearly all objects [4].
The image fusion problem has been addressed with different schemes, including multi-scale transform- [5], [6], [7], sparse representation- [8], [9], neural network- [10], [11], subspace- [12], [13], and saliency-based [14], [15] methods, hybrid models [16], [17], and other methods [18], [19]. Nevertheless, the major fusion framework involves three key components: image transform, activity level measurement, and fusion rule design [20]. Existing methods typically use the same transform or representation for different source images during the fusion process. However, this may not be appropriate for infrared and visible images, as the thermal radiation in infrared images and the appearance in visible images are manifestations of two different phenomena. In addition, the activity level measurements and fusion rules in most existing methods are designed manually and have become increasingly complex, which leads to implementation difficulty and high computational cost [21].
To overcome the above-mentioned issues, in this paper we propose an infrared and visible image fusion method from a novel perspective based on a generative adversarial network, termed FusionGAN, which formulates fusion as an adversarial game between keeping the infrared thermal radiation information and preserving the visible appearance texture information. More specifically, fusion is cast as a minimax problem between a generator and a discriminator. The generator attempts to produce a fused image that combines the major infrared intensities with additional visible gradients, while the discriminator forces the fused image to contain more texture details. This enables our fused image to maintain the thermal radiation of the infrared image and the texture details of the visible image at the same time. In addition, the end-to-end property of generative adversarial networks (GANs) avoids manually designing complicated activity level measurements and fusion rules.
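The content part of such a generator objective (keep infrared intensities, preserve visible gradients) can be sketched as follows. This is a minimal illustration of the idea, not the paper's exact formulation: the gradient operator, the weight `xi`, and the function names are assumptions made for this sketch.

```python
import numpy as np

def image_gradients(img):
    # Forward differences as a simple gradient (texture) operator;
    # the paper's exact choice of operator is an assumption here.
    gy = np.diff(img, axis=0, append=img[-1:, :])
    gx = np.diff(img, axis=1, append=img[:, -1:])
    return gx, gy

def generator_content_loss(fused, ir, vis, xi=5.0):
    """Content term: pull the fused image toward the infrared
    intensities while matching the visible image's gradients.

    `xi` balances intensity fidelity against texture fidelity
    (an illustrative value, not the paper's).
    """
    h, w = fused.shape
    intensity = np.sum((fused - ir) ** 2)          # keep thermal radiation
    fx, fy = image_gradients(fused)
    vx, vy = image_gradients(vis)
    texture = np.sum((fx - vx) ** 2 + (fy - vy) ** 2)  # keep visible detail
    return (intensity + xi * texture) / (h * w)
```

In the full adversarial setup, this content term would be combined with the adversarial loss from the discriminator, which pushes the generator to add further visible-like texture.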
To show the major superiority of our method, we give a representative example in Fig. 1. The left two images are the infrared and visible images to be fused, where the visible image contains a detailed background and the infrared image highlights the target, i.e., the water. The third image is the fusion result of a recent method [22]. Clearly, this traditional method merely keeps more texture details from the source images; the high contrast between target and background in the infrared image is not preserved in the fused image. In fact, the key information of the infrared image (i.e., the thermal radiation distribution) is totally lost in the fused image. The rightmost image in Fig. 1 is the fusion result of our FusionGAN. In contrast, our result preserves the thermal radiation distribution of the infrared image, so the target can be easily detected. Meanwhile, the background details (i.e., the trees, road, and water plants) of the visible image are also well retained.
The main contributions of this work are fourfold. First, we propose a generative adversarial architecture and design a loss function specialized for infrared and visible image fusion; the feasibility and superiority of GANs for image fusion are also discussed. To the best of our knowledge, this is the first time GANs have been adopted to address the image fusion task. Second, the proposed FusionGAN is an end-to-end model, in which the fused image is generated automatically from the input source images without manually designing an activity level measurement or fusion rule. Third, we conduct experiments on public infrared and visible image fusion datasets with qualitative and quantitative comparisons to state-of-the-art methods. Compared with previous methods, FusionGAN obtains results that look like sharpened infrared images with clearly highlighted targets and abundant textures. Last but not least, we generalize FusionGAN to fuse source images of different resolutions, such as low-resolution infrared images and high-resolution visible images; it generates high-resolution fused images that do not suffer from the noise caused by upsampling the infrared information.
The rest of this paper is arranged as follows. Section 2 describes background material and related work on GANs. In Section 3, we present our FusionGAN algorithm for infrared and visible image fusion. Section 4 illustrates the fusion performance of our method on various types of infrared and visible image/video pairs with comparisons to other approaches. We discuss the explainability of FusionGAN in Section 5, followed by concluding remarks in Section 6.
Related work
In this section, we briefly introduce the background material and relevant works, including traditional infrared and visible image fusion methods, deep learning based fusion techniques, as well as GANs and their variants.
Method
In this section, we describe the proposed FusionGAN for infrared and visible image fusion. We start by laying out the problem formulation with GANs, and then discuss the network architectures of the generator and the discriminator. Finally, we provide some details for the network training.
Experiments
In this section, we first briefly introduce the fusion metrics used in this paper and then demonstrate the efficacy of the proposed FusionGAN on public datasets, and compare it with eight state-of-the-art fusion methods including adaptive sparse representation (ASR) [37], curvelet transform (CVT) [38], dual-tree complex wavelet transform (DTCWT) [39], fourth order partial differential equation (FPDE) [12], guided filtering based fusion (GFF) [22], ratio of low-pass pyramid (LPP) [3], two-scale
Discussion
Deep learning based techniques share a common problem: they are regarded as black-box models. Even if we understand the underlying mathematical principles of such models, they lack an explicit declarative knowledge representation and hence have difficulty generating the underlying explanatory structures [48]. In this section, we briefly discuss the explainability of our FusionGAN.
The essence of traditional GAN is to train a generator to capture the data distribution, so that
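For reference, the standard GAN objective referred to above is the minimax game between a generator $G$ and a discriminator $D$:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)]
  + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
```

At the optimum of this game, the generator's output distribution matches the data distribution, which is the property FusionGAN exploits to make the fused image statistically resemble the visible image's texture.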
Conclusion
In this paper, we propose a novel infrared and visible image fusion method based on a generative adversarial network. It simultaneously keeps the thermal radiation information of infrared images and the texture detail information of visible images. The proposed FusionGAN is an end-to-end model, which avoids manually designing complicated activity level measurements and fusion rules as in traditional fusion strategies. Experiments on public datasets demonstrate that our fusion results look
Acknowledgment
This work was supported by the National Natural Science Foundation of China under Grant nos. 61773295 and 61503288, and the Beijing Advanced Innovation Center for Intelligent Robots and Systems under Grant no. 2016IRS15.
References (49)
- et al., Infrared and visible image fusion using total variation model, Neurocomputing (2016)
- Image fusion by a ratio of low-pass pyramid, Pattern Recognit. Lett. (1989)
- et al., A survey of infrared and visual image fusion methods, Infrared Phys. Technol. (2017)
- et al., Performance comparison of different multi-resolution transforms for image fusion, Inf. Fusion (2011)
- et al., A wavelet-based image fusion tutorial, Pattern Recognit. (2004)
- et al., Fusion method for infrared and visible images by using non-negative sparse representation, Infrared Phys. Technol. (2014)
- et al., A fusion algorithm for infrared and visible images based on adaptive dual-channel unit-linking PCNN in NSCT domain, Infrared Phys. Technol. (2015)
- et al., Novel fusion method for visible light and infrared images based on NSST–SF–PCNN, Infrared Phys. Technol. (2014)
- et al., Adaptive fusion method of visible light and infrared images based on non-subsampled shearlet transform and fast non-negative matrix factorization, Infrared Phys. Technol. (2014)
- et al., Infrared image enhancement through saliency feature analysis based on multi-scale decomposition, Infrared Phys. Technol. (2014)
- A general framework for image fusion based on multi-scale transform and sparse representation, Inf. Fusion
- Infrared and visible image fusion based on visual saliency map and weighted least square optimization, Infrared Phys. Technol.
- Infrared and visible image fusion via gradient transfer and total variation minimization, Inf. Fusion
- Fusion of visible and infrared images using global entropy and gradient constrained regularization, Infrared Phys. Technol.
- Pixel-level image fusion: a survey of the state of the art, Inf. Fusion
- Deep learning for pixel-level image fusion: recent advances and future prospects, Inf. Fusion
- A general framework for multiresolution image fusion: from pixels to regions, Inf. Fusion
- Sparse representation based multi-sensor image fusion for multi-focus and multi-modality images: a review, Inf. Fusion
- Multi-focus image fusion with a deep convolutional neural network, Inf. Fusion
- Visual attention guided image fusion with sparse representation, Optik - Int. J. Light Electron Opt.
- Remote sensing image fusion using the curvelet transform, Inf. Fusion
- Pixel- and region-based image fusion with complex wavelets, Inf. Fusion
- Two-scale image fusion of visible and infrared images using saliency detection, Infrared Phys. Technol.
- Infrared and visible image fusion methods and applications: a survey, Inf. Fusion