
A survey of classical methods and new trends in pansharpening of multispectral images

Abstract

There exist a number of satellites on different earth observation platforms, which provide multispectral images together with a panchromatic image, that is, an image containing reflectance data representative of a wide range of bands and wavelengths. Pansharpening is a pixel-level fusion technique used to increase the spatial resolution of the multispectral image while simultaneously preserving its spectral information. In this paper, we provide a review of the pansharpening methods proposed in the literature, giving a clear classification of them and a description of their main characteristics. Finally, we analyze how the quality of the pansharpened images can be assessed both visually and quantitatively and examine the different quality measures proposed for that purpose.

1 Introduction

Nowadays, huge quantities of satellite images are available from many earth observation platforms, such as SPOT [1], Landsat 7 [2], IKONOS [3], QuickBird [4] and OrbView [5]. Moreover, due to the growing number of satellite sensors, the acquisition frequency of the same scene is continuously increasing. Remote sensing images are recorded in digital form and then processed by computers to produce image products useful for a wide range of applications.

The spatial resolution of a remote sensing imaging system is expressed as the area of the ground captured by one pixel and affects the reproduction of details within the scene. As the pixel size is reduced, more scene details are preserved in the digital representation [6]. The instantaneous field of view (IFOV) is the ground area sensed at a given instant of time. The spatial resolution depends on the IFOV. For a given number of pixels, the finer the IFOV is, the higher the spatial resolution. Spatial resolution is also viewed as the clarity of the high-frequency detail information available in an image. Spatial resolution in remote sensing is usually expressed in meters or feet, which represents the length of the side of the area covered by a pixel. Figure 1 shows three images of the same ground area but with different spatial resolutions. The image at 5 m depicted in Figure 1a was captured by the SPOT 5 satellite, while the other two images, at 10 m and 20 m, are simulated from the first image. As can be observed in these images, the detail information becomes clearer as the spatial resolution increases from 20 m to 5 m.

Figure 1. Images of the same area with different spatial resolutions: (a) 5 m, (b) 10 m, (c) 20 m.

Spectral resolution is the electromagnetic bandwidth of the signals captured by the sensor producing a given image. The narrower the spectral bandwidth is, the higher the spectral resolution. If the platform captures images with a few spectral bands, typically 4-7, they are referred to as multispectral (MS) data, while if the number of spectral bands is measured in hundreds or thousands, they are referred to as hyperspectral (HS) data [7]. Together with the MS or HS image, satellites usually provide a panchromatic (PAN) image. This is an image that contains reflectance data representative of a wide range of wavelengths from the visible to the thermal infrared; that is, it integrates the chromatic information, hence the name "panchromatic". A PAN image of the visible bands captures a combination of red, green and blue data into a single measure of reflectance.

Remote sensing systems are designed within often competing constraints, among the most important ones being the trade-off between IFOV and signal-to-noise ratio (SNR). Since MS, and to a greater extent HS, sensors have reduced spectral bandwidths compared to PAN sensors, they typically have for a given IFOV a reduced spatial resolution in order to collect more photons and preserve the image SNR. Many sensors such as SPOT, ETM+, IKONOS, OrbView and QuickBird have a set of MS bands and a co-registered higher spatial resolution PAN band. With appropriate algorithms, it is possible to combine these data and produce MS imagery with higher spatial resolution. This concept is known as multispectral or multisensor merging, fusion or pansharpening (of the lower-resolution image) [8].

Pansharpening can consequently be defined as a pixel-level fusion technique used to increase the spatial resolution of the MS image [9]. Pansharpening is shorthand for panchromatic sharpening, meaning the use of a PAN (single band) image to sharpen an MS image. In this sense, to sharpen means to increase the spatial resolution of an MS image. Thus, pansharpening techniques increase the spatial resolution while simultaneously preserving the spectral information in the MS image, giving the best of the two worlds: high spectral resolution and high spatial resolution [7]. Some of the applications of pansharpening include improving geometric correction, enhancing certain features not visible in either of the single data sets alone, change detection using temporal data sets and enhancing classification [10].

During the past years, an enormous number of pansharpening techniques have been developed, and in order to choose the one that best serves the user's needs, there are some points, mentioned by Pohl [9], that have to be considered. In the first place, the objective or application of the pansharpened image can help in defining the necessary spectral and spatial resolution. For instance, some users may require frequent, repetitive coverage, with relatively low spatial resolution (i.e., meteorology applications), others may desire the highest possible spatial resolution (i.e., mapping), while other users may need both high spatial resolution and frequent coverage, plus rapid image delivery (i.e., military surveillance).

Then, the data that are most useful to meet the needs of the pansharpening application, taking into account the sensor, the satellite coverage and atmospheric constraints such as cloud cover and sun angle, have to be selected. We are mostly interested in sensors that can capture simultaneously a PAN channel with high spatial resolution and some MS channels with high spectral resolution, like those of the SPOT 5, Landsat 7 and QuickBird satellites. In some cases, PAN and MS images captured by different satellite sensors at different dates for the same scene can be used for some applications [10], as when fusing different MS SPOT 5 images captured at different times with one PAN IKONOS image [11], which can be considered a multisensor, multitemporal and multiresolution pansharpening case.

We also have to take into account the need for data pre-processing, like registration, upsampling and histogram matching, as well as the selection of a pansharpening technique that makes the combination of the data most successful. Finally, evaluation criteria are needed to specify which is the most successful pansharpening approach.

In this paper, we examine the classical and state-of-the-art pansharpening methods described in the literature, giving a clear classification of the methods and a description of their main characteristics. To the best of our knowledge, there is no recent paper providing a complete overview of the different pansharpening methods. However, some papers partially address the classification of pansharpening methods, see [12] for instance, or relate already proposed techniques to more global paradigms [13–15].

This paper is organized as follows. In Section 2 data pre-processing techniques are described. In Section 3 a classification of the pansharpening methods is presented, with a description of the methods related to each category and some examples. In this section, we also point out open research problems in each category. In Section 4 we analyze how the quality of the pansharpened images can be assessed both visually and quantitatively and examine the different quality measures proposed for that purpose, and finally, Section 5 concludes the paper.

2 Pre-processing

Remote sensors acquire raw data that need to be processed in order to convert them into images. The grid of pixels that constitutes a digital image is determined by a combination of scanning in the cross-track direction (orthogonal to the motion of the sensor platform) and by the platform motion along the in-track direction. A pixel is created whenever the sensor system electronically samples the continuous data stream provided by the scanning [8]. The image data recorded by sensors and aircraft can contain errors in geometry and in the measured brightness values of the pixels (the latter referred to as radiometric errors) [16]. The relative motion of the platform, the non-idealities in the sensors themselves and the curvature of the Earth can lead to geometric errors of varying degrees of severity. The radiometric errors can result from the instrumentation used to record the data, the wavelength dependence of solar radiation and the effect of the atmosphere. For many applications using these images, it is necessary to make corrections in geometry and brightness before the data are used. By using correction techniques [8, 16], an image can be registered to a map coordinate system and therefore has its pixels addressable in terms of map coordinates rather than pixel and line numbers, a process often referred to as geocoding.

The Earth Observing System Data and Information System (EOSDIS) receives "raw" data from all spacecraft and processes it to remove telemetry errors, eliminate communication artifacts and create Level 0 Standard Data Products that represent raw science data as measured by the instruments. Other levels of remote sensing data processing were defined in [17] by the NASA Earth Science program. In Level 1A, the reconstructed, unprocessed instrument data at full resolution, time-referenced and annotated with ancillary information (including radiometric and geometric calibration coefficients and georeferencing parameters), are computed and appended, but not applied to Level 0 data (i.e., Level 0 can be fully recovered from Level 1A). Some instruments have Level 1B data products, where the data resulting from Level 1A are processed to sensor units. At Level 2, geophysical variables are derived (e.g., ocean wave height, soil moisture, ice concentration) at the same resolution and location as the Level 1 data. Level 3 maps the variables on uniform space-time grids, usually with some completeness and consistency, and finally, Level 4 gives the results from the analysis of the data from the previous levels. For many applications, Level 1 data are the most fundamental data records with significant scientific utility, and they are the foundation upon which all subsequent data sets are produced. For pansharpening, where the accuracy of the input data is crucial, at least radiometric and geometric corrections need to be performed on the satellite data. Radiometric correction rectifies defective columns and missing lines and reduces the non-uniformity of the sensor response among detectors. The geometric correction deals with systematic effects such as the panoramic effect and the Earth's curvature and rotation. Note, however, that even with geometrically registered PAN and MS images, differences might appear between images, as described in [10]. These differences include object disappearance or appearance and contrast inversion due to different spectral bands or different times of acquisition. Besides, the two sensors do not aim in exactly the same direction, and acquisition times are not identical, which has an impact on the imaging of fast-moving objects.

Once the image data have already been processed in one of the standard levels previously described, and in order to apply pansharpening techniques, the images are pre-processed to accommodate the pansharpening algorithm requirements. This pre-processing may include registration, resampling and histogram matching of the MS and PAN images. Let us now study these processes in detail.

2.1 Image registration

Many applications of remote sensing image data require two or more scenes of the same geographical region, acquired at different dates or from different sensors, to be processed together. In this case, the role of image registration is to make the pixels in the two images precisely coincide with the same points on the ground [8]. Two images can be registered to each other by registering each to a map coordinate base separately, or one image can be chosen as a master to which the other is to be registered [16]. However, due to the different physical characteristics of the different sensors, the problem of registration is more complex than the registration of images from the same type of sensor [18] and also has to face problems like features present in one image that might appear only partially in the other image or not appear at all. Contrast reversal in some image regions, multiple intensity values in one image that need to be mapped to a single intensity value in the other, or considerably dissimilar images of the same scene produced by the image sensor when configured with different imaging parameters are also problems to be solved by the registration techniques.

Many image registration methods have been proposed in the literature. They can be classified into two categories: area-based methods and feature-based methods. Examples of area-based methods, which deal with the images without attempting to detect common objects, include Fourier methods, cross-correlation and mutual information methods [19]. Since gray-level values of the images to be matched may be quite different, and taking into account that for any two different image modalities, neither the correlation nor the mutual information is maximal when the images are spatially aligned, area-based techniques are not well adapted to the multisensor image registration problem [18]. Feature-based methods, which extract and match the common structures (features) from two images, have been shown to be more suitable for this task. Example methods in this category include methods using spatial relations, those based on invariant descriptors, relaxation, and pyramidal and wavelet image decompositions, among others [19].

2.2 Image upsampling and interpolation

When the registered remote sensing image is too coarse and does not meet the required resolution, upsampling may be needed to obtain a higher-resolution version of the image. The upsampling process may involve interpolation, usually performed via convolution of the image with an interpolation kernel [20]. In order to reduce the computational cost, separable interpolants are usually preferred [19]. Many interpolants for various applications have been proposed in the literature. A brief discussion of interpolation methods used for image resampling is provided in [19]. Interpolation methods specific to remote sensing, such as the one described in [21], have also been proposed. In [22], the authors study the application of different interpolation methods to remote sensing imagery. These methods include nearest neighbor interpolation, which only considers the closest pixel to the interpolated point, thus requiring the least processing time of all the interpolation algorithms; bilinear interpolation, which creates the new pixel in the target image from a weighted average of its four nearest neighboring pixels in the source image; and interpolation with a smoothing filter, which produces a weighted average of the pixels contained in the area spanned by the filter mask, yielding images with smooth transitions in gray level. Interpolation with a sharpening filter, in contrast, enhances details that have been blurred and highlights fine details, but sharpening filters produce aliasing in the output image, an undesirable effect that can be avoided by applying interpolation with unsharp masking, which subtracts a blurred version of an image from the image itself. The authors of [22] conclude that only bilinear interpolation, interpolation with a smoothing filter and interpolation with unsharp masking have the potential to be used to interpolate remote sensing images. Note that interpolation does not increase the high-frequency detail information in the image, but it is needed to match the number of pixels of images with different spatial resolutions.
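As a minimal illustration of this resampling step, the following Python sketch upsamples an MS band to the PAN grid using two of the approaches discussed above: bilinear interpolation, and pixel replication followed by a smoothing filter. The 4:1 resolution ratio and the array shapes are hypothetical; numpy and scipy are assumed to be available.

```python
import numpy as np
from scipy import ndimage

def upsample_bilinear(band, factor=4):
    # Bilinear interpolation (first-order spline), one of the methods
    # recommended in [22] for remote sensing imagery.
    return ndimage.zoom(band, factor, order=1)

def upsample_with_smoothing(band, factor=4, size=3):
    # Pixel replication (nearest neighbor) followed by a box smoothing
    # filter, i.e., interpolation with a smoothing filter.
    expanded = np.kron(band, np.ones((factor, factor)))
    return ndimage.uniform_filter(expanded, size=size)

# Hypothetical 64x64 MS band brought to a 256x256 PAN grid.
ms_band = np.random.rand(64, 64)
assert upsample_bilinear(ms_band).shape == (256, 256)
```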

2.3 Histogram matching

Some pansharpening algorithms assume that the spectral characteristics of the PAN image match those of each band of the MS image or match those of a transformed image based on the MS image. Unfortunately, this is not usually the case [16], and those pansharpening methods are prone to spectral distortions. Matching the histograms of the PAN image and MS bands will minimize brightness mismatching during the fusion process, which may help to reduce the spectral distortion in the pansharpened image. Although there are general purpose histogram matching techniques, such as the ones described, for instance, in [16] and [20], that could be used in remote sensing, specific techniques like the one presented in [23] are expected to provide more appropriate images for the application of pansharpening techniques. The technique in [23] minimizes the modification of the spectral information of the fused high-resolution multispectral (HRMS) image with respect to the original low-resolution multispectral (LRMS) image. This method modifies the value of the PAN image at each pixel (i, j) as

$$\mathrm{StretchedPAN}(i,j) = \bigl(\mathrm{PAN}(i,j) - \mu_{\mathrm{PAN}}\bigr)\,\frac{\sigma_b}{\sigma_{\mathrm{PAN}}} + \mu_b,$$
(1)

where μ_PAN and μ_b are the mean of the PAN image and of MS band b, respectively, and σ_PAN and σ_b are the standard deviation of the PAN image and of MS band b, respectively. This technique ensures that the mean and standard deviation of the PAN image and MS bands are within the same range, thus reducing the chromatic difference between both images.
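A direct transcription of Equation 1 in Python (numpy assumed; the PAN and MS band arrays are hypothetical floating-point images) may look as follows:

```python
import numpy as np

def match_histogram(pan, ms_band):
    # Equation 1: stretch PAN so that its mean and standard deviation
    # match those of the given MS band.
    mu_pan, sigma_pan = pan.mean(), pan.std()
    mu_b, sigma_b = ms_band.mean(), ms_band.std()
    return (pan - mu_pan) * (sigma_b / sigma_pan) + mu_b
```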

3 Pansharpening categories

Once the remote sensing images are pre-processed in order to satisfy the pansharpening method requirements, the pansharpening process is performed. The literature shows a large collection of these pansharpening methods developed over the last two decades as well as a large number of terms used to refer to image fusion. In 1980, Wong et al.[24] proposed a technique for the integration of Landsat Multispectral Scanner (MSS) and Seasat synthetic aperture radar (SAR) images based on the modulation of the intensity of each pixel of the MSS channels with the value of the corresponding pixel of the SAR image, hence named intensity modulation (IM) integration method. Other scientists evaluated multisensor image data in the context of co-registered [25], resolution enhancement [26] or coincident [27] data analysis.

After the launch of the French SPOT satellite system in February of 1986, the civilian remote sensing sector was provided with the capability of applying high-resolution MS imagery to a range of land use and land cover analyses. Cliche et al.[28], who worked with SPOT simulation data prior to the satellite's launch, showed that simulated 10-m resolution color images can be produced by modulating each SPOT MS (XS) band with PAN data individually, using three different intensity modulation (IM) methods. Welch et al.[29] used the term "merge" instead of "integration" and proposed merging of SPOT PAN and XS data using the Intensity-Hue-Saturation (IHS) transformation, a method previously proposed by Haydn et al.[30] to merge Landsat MSS with Return Beam Vidicon (RBV) data and Landsat MSS with Heat Capacity Mapping Mission data. In 1988, Chavez et al.[31] used SPOT panchromatic data to "sharpen" Landsat Thematic Mapper (TM) images by high-pass filtering (HPF) the SPOT PAN data before merging it with the TM data. A review of the so-called classical methods, which include IHS, HPF, the Brovey transform (BT) [32] and principal component substitution (PCS) [33, 34], among others, can be found in [9].

In 1987, Price [35] developed a fusion technique based on the statistical properties of remote sensing images, for the combination of the two different spatial resolutions of the High Resolution Visible (HRV) SPOT sensor. Besides the Price method, the literature shows other pansharpening methods based on the statistical properties of the images, such as spatially adaptive methods [36] and Bayesian-based methods [37, 38].

More recently, multiresolution analysis employing the generalized Laplacian pyramid (GLP) [39, 40], the discrete wavelet transform [41, 42] and the contourlet transform [43–45] has been used in pansharpening, with the basic idea of extracting from the PAN image the spatial detail information not present in the low-resolution MS image and injecting it into the latter.

Image fusion methods have been classified in several ways. Schowengerdt [8] classified them into spectral domain, spatial domain and scale-space techniques. Ranchin and Wald [46] classified them into three groups: projection and substitution methods, relative spectral contribution methods and those relevant to the ARSIS concept (from its French acronym "Amélioration de la Résolution Spatiale par Injection de Structures", which means "Enhancement of the spatial resolution by structure injections"). It was found that many of the existing image fusion methods, such as the HPF and additive wavelet transform (AWT) methods, can be accommodated within the ARSIS concept [13], but Tu et al.[47] found that the PCS, BT and AWT methods could also be considered as IHS-like image fusion methods. Meanwhile, Bretschneider et al.[12] classified IHS and PCA methods as transformation-based methods, in a classification that also included more categories such as addition and multiplication fusion, filter fusion (which includes the HPF method), fusion based on inter-band relations, wavelet decomposition fusion and further fusion methods (based on statistical properties). Fusion methods that involve linear forward and backward transforms were classified by Shettigara [48] as component substitution methods. Recently, two comprehensive frameworks that generalize previously proposed fusion methods such as IHS, BT, PCA, HPF or AWT and study the relationships between different methods have been proposed in [14, 15].

Although it is not possible to find a universal classification, in this work we classify the pansharpening methods into the following categories according to the main technique they use:

  1. Component Substitution (CS) family, which includes IHS, PCS and Gram-Schmidt (GS), because all these methods usually utilize a linear transformation and substitute some components in the transformed domain.

  2. Relative Spectral Contribution family, which includes BT, IM and P+XS, where a linear combination of the spectral bands, instead of substitution, is applied.

  3. High-Frequency Injection family, which includes HPF and HPM, two methods that inject high-frequency details extracted by subtracting a low-pass filtered PAN image from the original one.

  4. Methods based on the statistics of the image, which include Price and spatially adaptive methods, Bayesian-based and super-resolution methods.

  5. Multiresolution family, which includes generalized Laplacian pyramid, wavelet and contourlet methods and any combination of multiresolution analysis with methods from other categories.

Note that although the proposed classification defines five categories, as we have already mentioned, some methods can be classified in several categories and, so, the limits of each category are not sharp and there are many relations among them. The relations will be explained when the categories are described.

3.1 Component substitution family

The component substitution (CS) methods start by upsampling the low-resolution MS image to the size of the PAN image. Then, the MS image is transformed into a set of components, usually using a linear transform of the MS bands. The CS methods work by substituting a component of the (transformed) MS image, C_l, with a component, C_h, from the PAN image. These methods are physically meaningful only when these two components, C_l and C_h, contain almost the same spectral information. In other words, the C_l component should contain all the redundant information of the MS and PAN images, while C_h should contain more spatial information. An improper construction of the C_l component tends to introduce high spectral distortion. The general algorithm for the CS sharpening techniques is summarized in Algorithm 1, followed by a code sketch of its additive form. This algorithm has been generalized by Tu et al.[47], where the authors also prove that the forward and backward transforms are not needed and that steps 2-5 of Algorithm 1 can be summarized as finding a new component C_l and adding the difference between the PAN image and this new component to each upsampled MS image band. This framework has been further extended by Wang et al.[14] and Aiazzi et al.[15] in the so-called general image fusion (GIF) and extended GIF (EGIF) protocols, respectively.

Algorithm 1 Component substitution pansharpening

  1. Upsample the MS image to the size of the PAN image.

  2. Forward transform the MS image to the desired components.

  3. Match the histogram of the PAN image with the C_l component to be substituted.

  4. Replace the C_l component with the histogram-matched PAN image.

  5. Backward transform the components to obtain the pansharpened image.
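Following the simplification by Tu et al.[47] mentioned above, steps 2-5 of Algorithm 1 reduce to adding the difference between the (histogram-matched) PAN image and a synthesized component C_l to every upsampled band. A minimal sketch of this additive form, assuming the MS image is already upsampled and stored as a (bands, height, width) numpy array, and taking C_l as a plain band average (the weights are an assumption, not a prescribed choice):

```python
import numpy as np

def cs_pansharpen(ms_up, pan, weights=None):
    # Generalized CS fusion in additive form: HRMS_b = MS_b + (PAN - C_l),
    # where C_l is a weighted average of the upsampled MS bands.
    n_bands = ms_up.shape[0]
    if weights is None:
        weights = np.full(n_bands, 1.0 / n_bands)  # plain band average
    c_l = np.tensordot(weights, ms_up, axes=1)     # synthesized component
    detail = pan - c_l                             # spatial detail to inject
    return ms_up + detail[np.newaxis, :, :]        # same detail in every band
```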

The CS family includes many popular pansharpening methods, such as the IHS, PCS and Gram-Schmidt (GS) methods [48, 49], each of them involving a different transformation of the MS image. CS techniques are attractive because they are fast and easy to implement, and they fulfill users' expectations most of the time, since they provide pansharpened images with good visual/geometrical quality [50]. However, the results obtained by these methods highly depend on the correlation between the bands, and since the same transform is applied to the whole image, local dissimilarities between the PAN and MS images are not taken into account [10, 51].

A single type of transform does not always obtain the optimal component required for substitution, and it would be difficult to choose the appropriate spectral transformation method for diverse data sets. In order to alleviate this problem, recent methods incorporate statistical tests or weighted measures to adaptively select an optimal component for substitution and transformation. This results in a new approach known as adaptive component substitution [52–54].

The Intensity-Hue-Saturation (IHS) pansharpening method [31, 55] is one of the classical techniques included in this family. It uses the IHS color space, which is often chosen due to the tendency of the human visual cognitive system to treat the intensity (I), hue (H) and saturation (S) components as roughly orthogonal perceptual axes. The IHS transform was originally applied to RGB true color images, but in remote sensing applications, and for display purposes only, arbitrary bands are assigned to the RGB channels to produce false color composites [14]. The ability of the IHS transform to effectively separate spatial information (band I) from spectral information (bands H and S) [20] makes it very applicable to pansharpening. There are different models of the IHS transform, differing in the method used to compute the intensity value. Smith's hexacone and triangular models are two of the most widely used ones [7]. An example of a pansharpened image using the IHS method is shown in Figure 2b.

Figure 2. Results of some classical pansharpening methods using SPOT 5 images.

The major limitation of this technique is that only three bands are involved. Tu et al.[47] proposed a generalized IHS transform that surpasses this dimensional limitation. In any case, since the spectral response of I, as synthesized from the MS bands, does not generally match the radiometry of the histogram-matched PAN image [50], when the fusion result is displayed in color composition, large spectral distortion may appear as color changes. In order to minimize the spectral distortion in IHS pansharpening, Tu et al.[56] proposed a new adaptive IHS method in which the intensity band approximates the PAN image for IKONOS images as closely as possible. This adaptive IHS has been extended by Rahmani et al.[52] to deal with any kind of image by determining the coefficients α_i that best approximate

$$\mathrm{PAN} = \sum_i \alpha_i\,\mathrm{MS}_i,$$
(2)

subject to the physical constraint of nonnegativity of the coefficients α_i. Note that, although this method reduces spectral distortion, local dissimilarities between MS and PAN images might remain [10].
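In the spirit of this adaptive scheme, the coefficients of Equation 2 can be estimated with a nonnegative least-squares fit. The sketch below is not Rahmani et al.'s exact procedure, just one way to enforce the constraint, using scipy's NNLS solver and hypothetical array shapes:

```python
import numpy as np
from scipy.optimize import nnls

def estimate_alpha(ms_up, pan):
    # Fit PAN ~ sum_i alpha_i * MS_i (Equation 2) with alpha_i >= 0.
    # ms_up: (bands, H, W) upsampled MS image; pan: (H, W) PAN image.
    A = ms_up.reshape(ms_up.shape[0], -1).T  # one column per MS band
    b = pan.ravel()
    alpha, _ = nnls(A, b)
    return alpha
```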

Another method in the CS family is principal component substitution (PCS), which relies on the principal component analysis (PCA) mathematical transformation. PCA, also known as the Karhunen-Loève transform or the Hotelling transform, is widely used in signal processing, statistics and many other areas. This transformation generates a new set of rotated axes, in which the new image spectral components are not correlated. The largest amount of the variance is mapped to the first component, with decreasing variance going to each of the following ones. The sum of the variances in all the components is equal to the total variance present in the original input images. PCA and the calculation of the transformation matrices can be performed following the steps specified in [20]. Theoretically, the first principal component, PC1, collects the information that is common to all bands used as input data to the PCA, i.e., the spatial information, while the spectral information that is specific to each band is captured in the other principal components [42, 33]. This makes PCS an adequate technique when merging MS and PAN images. PCS is similar to the IHS method, with the main advantage that an arbitrary number of bands can be considered. However, some spatial information may not be mapped to the first component, depending on the degree of correlation and spectral contrast existing among the MS bands [33], resulting in the same problems IHS had. To overcome this drawback, Shah et al.[53] proposed a new adaptive PCA-based pansharpening method that determines, using cross-correlation, the appropriate PC component to be substituted by the PAN image. By replacing this PC component with the high spatial resolution PAN component, the adaptive PCA method produces better results than the traditional one [53].
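A compact sketch of PCS (numpy only) follows Algorithm 1 with PCA as the transform and substitutes the first principal component after mean/variance matching; substituting PC1 is the common choice discussed above, not the only possible one:

```python
import numpy as np

def pcs_pansharpen(ms_up, pan):
    # Forward PCA on the upsampled MS bands, substitution of PC1 by the
    # matched PAN image, and inverse PCA.
    n_bands, h, w = ms_up.shape
    X = ms_up.reshape(n_bands, -1)
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc))
    E = eigvecs[:, np.argsort(eigvals)[::-1]]       # eigenvectors, PC1 first
    pcs = E.T @ Xc                                  # forward transform
    p = pan.astype(float).ravel()
    pcs[0] = (p - p.mean()) * (pcs[0].std() / p.std()) + pcs[0].mean()
    return (E @ pcs + mean).reshape(n_bands, h, w)  # backward transform
```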

A widespread CS technique is Gram-Schmidt (GS) spectral sharpening. This method was invented by Laben and Brower in 1998 and patented by Eastman Kodak [57]. The GS transformation, as described in [58], is a common technique used in linear algebra and multivariate statistics. GS is used to orthogonalize matrix data or bands of a digital image, removing the redundant (i.e., correlated) information that is contained in multiple bands. If there were perfect correlation between the input bands, the GS orthogonalization process would produce a final band with all its elements equal to zero. For its use in pansharpening, the GS transformation has been modified [57]. In the modified process, the mean of each band is subtracted from each pixel in the band before the orthogonalization is performed, to produce a more accurate outcome.

In GS-based pansharpening, a lower-resolution PAN band needs to be simulated and used as the first band of the input to the GS transformation, together with the MS image. Two methods are used in [57] to simulate this band. In the first method, the LRMS bands are combined into a single lower-resolution PAN (LR PAN) as the weighted mean of the MS image, where the weights depend on the spectral response of the MS bands and the high-resolution PAN (HR PAN) image and on the optical transmittance of the PAN band. The second method simulates the LR PAN image by blurring and subsampling the observed PAN image. The major difference in the results, mostly noticeable in a true color display, is that the first method exhibits outstanding spatial quality, but spectral distortions may occur. This distortion is due to the fact that the average of the MS spectral bands is not likely to have the same radiometry as the PAN image. The second method is unaffected by spectral distortion but generally suffers from lower sharpness and spatial enhancement. This is due to the injection mechanism of high-pass details taken from the PAN image, which is embedded into the inverse GS transformation, carried out by using the full-resolution PAN, while the forward transformation uses the low-resolution approximation of PAN obtained by resampling the decimated PAN image provided by the user. In order to avoid this drawback, Aiazzi et al.[54] proposed an enhanced GS method, where the LR PAN is generated by a weighted average of the MS bands and the weights are estimated to minimize the mean square error with respect to the downsampled PAN. GS is more general than PCA, which can be understood as a particular case of GS in which the LR PAN is the first principal component [15].
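GS sharpening admits a compact injection-gain formulation in which, after simulating the low-resolution PAN, each band receives the PAN detail scaled by its covariance with that simulated band. The sketch below assumes this formulation and uses a plain band average as the simulated LR PAN (the first simulation method of [57] uses sensor-derived weights instead); array shapes are hypothetical:

```python
import numpy as np

def gs_pansharpen(ms_up, pan):
    # Injection-gain form of GS fusion:
    # HRMS_b = MS_b + g_b * (PAN - I_L), g_b = cov(MS_b, I_L) / var(I_L).
    i_l = ms_up.mean(axis=0)                                  # simulated LR PAN
    pan_m = (pan - pan.mean()) * (i_l.std() / pan.std()) + i_l.mean()
    detail = pan_m - i_l                                      # PAN detail
    i_c = i_l - i_l.mean()
    out = np.empty_like(ms_up)
    for b in range(ms_up.shape[0]):
        g = np.mean((ms_up[b] - ms_up[b].mean()) * i_c) / i_l.var()
        out[b] = ms_up[b] + g * detail
    return out
```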

3.2 Relative Spectral Contribution (RSC) family

The RSC family can be considered as a variant of the CS pansharpening family, when a linear combination of the spectral bands, instead of substitution, is applied.

Let PAN^h be the high spatial resolution PAN image, MS_b^l the b-th low-resolution MS image band, h the original spatial resolution of PAN and l the original spatial resolution of MS_b (l < h), and let MS_b^h be the image MS_b^l resampled at resolution h. RSC works only on the spectral bands MS_b^l lying within the spectral range of the PAN^h image. The synthetic (pansharpened) bands HRMS_b^h are given at each pixel (i, j) by

$$\mathrm{HRMS}_b^h(i,j) = \mathrm{MS}_b^h(i,j)\,\frac{\mathrm{PAN}^h(i,j)}{\sum_b \mathrm{MS}_b^h(i,j)},$$
(3)

where b = 1, 2, ..., B and B is the number of MS bands. The process flow diagram of the RSC sharpening techniques is shown in Algorithm 2. This family does not specify what to do when MS_b^l lies outside the spectral range of PAN^h. In Equation 3, there is an influence of the other spectral bands on the assessment of HRMS_b^h, thus causing a spectral distortion. Furthermore, the method does not preserve the original spectral content once the pansharpened images HRMS_b^h are brought back to the original low spatial resolution [46]. These methods include the Brovey transform (BT) [32], the P+XS [59, 60] and the intensity modulation (IM) method [61].

Algorithm 2 Relative spectral contribution pansharpening

  1. Upsample the MS image to the size of the PAN image.

  2. Match the histogram of the PAN image with each MS band.

  3. Obtain the pansharpened image by applying Equation 3.

The Brovey transform (BT), named after its author, is a simple method to merge data from different sensors based on the chromaticity transform [32], with the limitation that only three bands are involved [42, 14]. A pansharpened image obtained using the BT method is shown in Figure 2c.

The Brovey transform provides excellent contrast in the image domain but greatly distorts the spectral characteristics [62]. The Brovey sharpened image is not suitable for pixel-based classification as the pixel values are changed drastically [7]. A variation of the BT method subtracts the intensity of the MS image from the PAN image before applying Equation 3 [14]. Although the first BT method injects more spatial details, the second one better preserves the spectral details.
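A direct transcription of Equation 3 for the Brovey transform (numpy assumed; the small constant guarding against division by zero is an implementation detail, not part of the original formulation):

```python
import numpy as np

def brovey_pansharpen(ms_up, pan, eps=1e-12):
    # Equation 3: each upsampled MS band is modulated by the ratio between
    # the PAN image and the sum of the MS bands at each pixel.
    total = ms_up.sum(axis=0) + eps
    return ms_up * (pan / total)[np.newaxis, :, :]
```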

The concept of intensity modulation (IM) was originally proposed by Wong et al.[24] in 1980 for integrating Landsat MSS and Seasat SAR images. Later, this method was used by Cliche et al.[28] for enhancing the spatial resolution of three-band SPOT MS (XS) images. As a method in the relative spectral contribution family, IM can be derived from Equation 3 by replacing the sum of all MS bands with the intensity component of the IHS transformation [6]. Note that the use of the IHS transformation limits to three the number of bands utilized by this method. Intensity modulation may cause color distortion if the spectral range of the intensity replacement (or modulation) image is different from the spectral range covered by the three bands used in the color composition [63]. In the literature, different versions based on the IM concept have been used [6, 28, 63].

The relations between the RSC and CS families have been studied in depth in [14, 47], where these families are considered particular cases of the GIHS and GIF protocols, respectively. The authors also found that RSC methods are closely related to CS methods, with the difference, as already commented, that the contribution of the PAN image varies locally.

3.3 High-frequency injection family

The high-frequency injection family methods were first proposed by Schowengerdt [64], working on full-resolution and spatially compressed Landsat MSS data. He demonstrated the use of a high-resolution band to "sharpen" or edge-enhance lower-resolution bands having the same approximate wavelength characteristics. Some years later, Chavez [65] proposed a project whose primary objective was to extract the spectral information from the Landsat TM and combine (inject) it with the spatial information from a data set having much higher spatial resolution. To extract the details from the high-resolution data set, he used a high-pass filter in order to "enhance the high-frequency/spatial information but, more important, suppress the low frequency/spectral information in the higher-resolution image" [31]. This was necessary so that simple addition of the images did not distort the spectral balance of the combined product.

A useful concept for understanding spatial filtering is that any image is made of spatial components at different kernel sizes. Suppose we process an image in such a way that the value at each output pixel is the average of a small neighborhood of input pixels (a box filter). The result is a blurred, low-pass version of the original image that will be denoted LP. Subtracting this image from the original one produces a high-pass (HP) image that represents the difference between each original pixel and the average of its neighborhood. This relation can be written as

$$\mathrm{image}(i,j) = \mathrm{LP}(i,j) + \mathrm{HP}(i,j),$$
(4)

which is valid for any neighborhood size (scale). As the neighborhood size is increased, the LP image hides successively larger and larger structures, while the HP image picks up the smaller structures lost in the LP image (see Equation 4) [8].

The idea behind this type of spatial domain fusion is to transfer the high-frequency content of the PAN image to the MS images by applying spatial filtering techniques [66]. However, the size of the filter kernels cannot be arbitrary because it has to reflect the radiometric normalization between the two images. Chavez et al.[34] suggested that the best kernel size is approximately twice the ratio of the spatial resolutions of the sensors, which produces edge-enhanced synthetic images with the least spectral distortion and edge noise. According to [67], pansharpening methods based on injecting high-frequency components into resampled versions of the MS data have demonstrated superior performance compared with many other pansharpening methods, such as those in the CS family. Several variations of high-frequency injection (HFI) pansharpening methods have been proposed, such as high-pass filtering (HPF) pansharpening and high-pass modulation (HPM).

As we have already mentioned, the main idea of the high-pass filtering (HPF) pansharpening method is to extract from the PAN image the high-frequency information, to later add or inject it into the MS image previously expanded to match the PAN pixel size. This spatial information extraction is performed by applying a low-pass spatial filter to the PAN image,

$$\mathrm{filteredPAN} = h_0 * \mathrm{PAN},$$
(5)

where h_0 is a low-pass filter and * the convolution operator. The spatial information injection is performed by adding to the MS image, pixel by pixel, the high-frequency image that results from subtracting filteredPAN from the original PAN image [31, 68]. There are many different filters that can be used: box filter, Gaussian, Laplacian, and so on. Recently, the use of the modulation transfer function (MTF) of the sensor as the low-pass filter has been proposed in [69]. The MTF is the amplitude spectrum of the system point spread function (PSF) [70]. In [69], the HP image is also multiplied by a weight selected to maximize the Quality Not requiring a Reference (QNR) criterion proposed in the paper.

As expected, HPF images present low spectral distortion. However, the ripple in the frequency response of the filter will have some negative impact [14]. The HPF method can be considered the predecessor of an extended group of image pansharpening procedures based on the same principle: to extract spatial detail information from the PAN image not present in the MS image and inject it into the latter in a multiresolution framework. This principle is known as the ARSIS concept [46].

In the High Pass Modulation (HPM), also known as High Frequency Modulation (HFM), algorithm [8], the PAN image is multiplied by each band of the LRMS image and normalized by a low-pass filtered version of the PAN image to estimate the enhanced MS image bands. The principle of HPM is to transfer the high-frequency information of the PAN image to the LRMS band b (LRMS_b) with a modulation coefficient k_b, which equals the ratio between LRMS_b and the low-pass filtered version of the PAN image [14]. Thus, the algorithm assumes that each pixel of the enhanced (sharpened) MS image in band b is simply proportional to the corresponding pixel of the higher-resolution image. This constant of proportionality is a spatially variable gain factor, calculated as

$$k_b(i,j) = \frac{\mathrm{LRMS}_b(i,j)}{\mathrm{filteredPAN}(i,j)},$$
(6)

where filteredPAN is a low-pass filtered version of the PAN image (see Equation 5) [8]. According to [14] (where HFI has also been formulated into the GIF framework and its relations with CS, RSC and some multiresolution family methods are explored), when the low-pass filter is chosen as in the HPF method, the HPM method will give slightly better performance than HPF because the color of the pixels is not biased toward gray.

The process flow diagram of the HFI sharpening techniques is shown in Algorithm 3, followed by a sketch of both variants. A pansharpened image obtained using the HPM method is shown in Figure 2d. Note that the HFI methods are closely related, as we will see later, to the multiresolution family. The main differences are the types of filter used, the fact that a single level of decomposition is applied to the images and the different origins of the approaches.

Algorithm 3 High-frequency injection pansharpening

  1. Upsample the MS image to the size of the PAN image.

  2. Apply a low-pass filter on the PAN image using Equation 5.

  3. Calculate the high-frequency image by subtracting the filtered PAN from the original PAN.

  4. Obtain the pansharpened image by adding the high-frequency image to each band of the MS image (modulated by the factor k_b(i, j) in Equation 6 in the case of HPM).
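Both variants of Algorithm 3 can be sketched as follows. A Gaussian kernel stands in for the low-pass filter h_0, where a box or MTF-matched kernel could equally be used; the filter width and array shapes are assumptions:

```python
import numpy as np
from scipy import ndimage

def hpf_pansharpen(ms_up, pan, sigma=2.0):
    # HPF: add the PAN high frequencies, PAN - filteredPAN (Equation 5),
    # to every upsampled MS band.
    filtered_pan = ndimage.gaussian_filter(pan, sigma)
    return ms_up + (pan - filtered_pan)[np.newaxis, :, :]

def hpm_pansharpen(ms_up, pan, sigma=2.0, eps=1e-12):
    # HPM: the same detail, modulated per pixel by the gain
    # k_b = LRMS_b / filteredPAN of Equation 6.
    filtered_pan = ndimage.gaussian_filter(pan, sigma)
    k = ms_up / (filtered_pan + eps)[np.newaxis, :, :]
    return ms_up + k * (pan - filtered_pan)[np.newaxis, :, :]
```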

3.4 Methods based on the statistics of the image

The methods based on the statistics of the image include a set of methods that exploit the statistical characteristics of the MS and PAN images in the pansharpening process. The first known method in this family was proposed by Price [35] to combine PAN and MS imagery from dual-resolution satellite instruments based on the substantial redundancy existing in the PAN data and the local correlation between the PAN and MS images. Later, the method was improved by Price [71] by computing the local statistics of the images and by Park et al.[36] in the so-called spatially adaptive algorithm.

Price's method [71] uses the statistical relationship between each band of the LRMS image and the HR images to sharpen the former. It models the relationship between the pixels of each band of the HRMS image, z_b, the PAN image, x, and the corresponding band of the LRMS image, y_b, linearly as

$$z_b - \hat{y}_b = \hat{a}\,(x - \hat{x}),$$
(7)

where ŷ_b is the band b of the LRMS image y upsampled to the size of the HRMS image by pixel replication, x̂ represents the panchromatic image downsampled to the size of the MS image, by averaging the pixels of x in the area covered by each pixel of y, and upsampled again to its original size by pixel replication, and â is a matrix defined as the upsampling, by pixel replication, of a weight matrix a whose elements are calculated from a 3 × 3 window around each LR image pixel.

Price's algorithm succeeds in preserving the low-resolution radiometry in the fusion process, but it sometimes produces blocking artifacts because it uses the same weight for all the HR pixels corresponding to one LR pixel. If the HR and LR images have little correlation, the blocking artifacts will be severe. A pansharpened image obtained using Price's method proposed in [71] is shown in Figure 3a.

Figure 3. Results of some statistical pansharpening methods using SPOT 5 images.

The spatially adaptive algorithm [36] starts from Price's method [71], but with a more general and improved mathematical model. It features adaptive insertion of information according to the local correlation between the two images, preventing spectral distortion as much as possible while sharpening the MS images. This algorithm also has the advantage that a number of high-resolution images, not only one PAN image, can be utilized as references of high-frequency information, which is not the case for most methods [36].

Besides those methods, most of the papers in this family have used the Bayesian framework to model the knowledge about the images and estimate the pansharpened image. Since the work of Mascarenhas [37], a number of pansharpening methods have been proposed using the Bayesian framework (see [72, 73] for instance).

Bayesian methods model the degradation suffered by the original HRMS image, z, as the conditional probability distribution of the observed LRMS image, y, and the PAN image, x, given the original z, called the likelihood and denoted as p(y, x|z). They take into account the available prior knowledge about the expected characteristics of the pansharpened image, modeled in the so-called prior distribution p(z), to determine the posterior probability distribution p(z|y, x) by using Bayes law,

$$p(z\,|\,y,x) = \frac{p(y,x\,|\,z)\,p(z)}{p(y,x)},$$
(8)

where p(y, x) is the joint probability distribution. Inference is performed from the posterior distribution to draw estimates of the HRMS image, z.

The main advantage of the Bayesian approach is to place the problem of pansharpening into a clear probabilistic framework [73], although assigning suitable distributions for the conditional and prior distributions and the selection of an inference method are critical points that lead to different Bayesian-based pansharpening methods.

As prior distribution, Fasbender et al.[73] assumed a noninformative prior p(z) ∝ 1, which gives equal probability to all possible solutions, that is, no solution is preferred since no clear information on the HRMS image is available. This prior has also been used by Hardie et al.[74]. In [37], the prior information is carried by an interpolation operator and its covariance matrix; both will be used as the mean vector and the covariance matrix, respectively, for a Bayesian synthesis process. In [75], the prior knowledge about the smoothness of the object luminosity distribution within each band makes it possible to model the distribution of z using a simultaneous autoregressive (SAR) model as

$$p(z) = \prod_{b=1}^{B} p(z_b) \propto \prod_{b=1}^{B} \exp\left\{-\frac{1}{2}\,\alpha_b\,\|C z_b\|^2\right\},$$
(9)

where C denotes the Laplacian operator and 1/α_b is the variance of the Gaussian distribution of z_b, b = 1, ..., B, with B being the number of bands in the MS image. More advanced models try to incorporate a smoothness constraint while preserving the edges in the image. Those models include the adaptive SAR model [38], Total Variation (TV) [76], Markov random field (MRF)-based models [77] and Stochastic Mixing Models (SMM) [78]. Note that the described models do not take into account the correlations between the MS bands. In [79], the authors propose a TV prior model to take into account spatial pixel relationships and a quadratic model to enforce similarity between the pixels in the same position in the different bands.

It is usual to model the LRMS and PAN images as degraded versions of the HRMS image by two different processes: one modeling the LRMS image and usually described as

$$y = g_s(z) + n_s,$$
(10)

where g_s(z) represents a function that relates z to y and n_s represents the noise of the LRMS image, and a second one that models how the PAN image is obtained from the HRMS image, which is written as

$$x = g_p(z) + n_p,$$
(11)

where g_p(z) represents a function that relates z to x and n_p represents the noise of the PAN image. Note that, since the success of the pansharpening algorithm will be limited by the accuracy of those models, the physics of the sensor should be considered. In particular, the MTF of the sensor and the sensor's spectral response should be taken into account.

The conditional distribution of the observed images given the original one, p(y, x|z), is usually defined as

$$p(y,x\,|\,z) = p(y\,|\,z)\,p(x\,|\,z)$$
(12)

by considering that the observed LRMS image and the PAN image are independent given the HRMS image. This allows an easier formulation of the degradation models. However, Fasbender et al.[73] took into account that y and x may carry information of quite different quality about z and defined p(y, x|z) = p(y|z)^{2(1-w)} p(x|z)^{2w}, where the parameter w ∈ [0, 1] can be interpreted as the weight to be given to the panchromatic information at the expense of the MS information. Note that w = 0.5 leads back to Equation 12, while a value of zero or one means that we are discarding the PAN or the MS image, respectively.

Different models have been proposed for the conditional distributions p(y|z) and p(x|z). The simplest model is to assume that g_s(z) = z, so that y = z + n_s [73], where n_s ~ N(0, Σ_s). Note that in this case, y has the same resolution as z, so an interpolation method has to be used to obtain y from the observed MS image. However, most of the authors consider the relation y = Hz + n_s, where H is a matrix representing the blurring (usually characterized by the sensor MTF), the sensor integration function and the spatial subsampling, and n_s is the capture noise, assumed to be Gaussian with zero mean and variance 1/β, leading to the distribution

$$p(y\,|\,z) \propto \exp\left\{-\frac{1}{2}\,\beta\,\|y - Hz\|^2\right\}.$$
(13)

This model has been extensively used [77, 78, 80], and it is the base for the so-called super-resolution-based methods [81] as the ones described, for instance, in [38, 76]. The degradation model in [37] can be also written in this way. A pansharpened image using the super-resolution method proposed in [76] is shown in Figure 3b.

On the other hand, g_p(z) has been defined as a linear regression model linking the MS pixels to the PAN ones, as estimated from both observed images, so that $g_p(z) = a + \sum_{b=1}^{B} \lambda_b z_b$, where a and λ_b, b = 1, 2, ..., B, are the regression parameters. Note that this model is used by the IHS, PCA and Brovey methods to relate the PAN and HRMS images. Mateos et al.[82] (and also [38, 76, 77], for instance) used a special case for g_p(z), where a = 0 and λ_b ≥ 0, b = 1, 2, ..., B, are known quantities that can be obtained from the sensor spectral characteristics (see Figure 4 for the Landsat 7 ETM+ spectral response) and represent the contribution of each MS band to the PAN image. In all those papers, the noise n_p is assumed to be Gaussian with zero mean and covariance matrix C_p and hence,

$$p(x\,|\,z) \propto \exp\left\{-\frac{1}{2}\,\bigl(x - g_p(z)\bigr)^{t}\,C_p^{-1}\,\bigl(x - g_p(z)\bigr)\right\}.$$
(14)

Figure 4. Landsat 7 ETM+ band spectral response.

Finally, we want to mention that a similar approach has been used in the problem of hyperspectral (HS) resolution enhancement, in which an HS image is sharpened by a higher-resolution MS or PAN image. In this context, Eismann and Hardie [80, 78], and other authors later (see for instance [83]), proposed to use the model x = S^t z + n, where z is the HR original HS image, x is the HRMS or PAN image used to sharpen the LRHS image, S is the spectral response matrix, and n is assumed to be a spatially independent zero-mean Gaussian noise with covariance matrix C. The spectral response matrix is a sparse matrix that contains in each column the spectral response of an MS band of x. Note that in the case of pansharpening, the image x has only one band and the matrix S will be a column vector with components λ_b, as in the model proposed in [82].

Once the prior and conditional distributions have been defined, Bayesian inference is performed to find an estimate of the original HRMS image. Different methods have been used in the literature to carry out the inference, depending on the form of the chosen distributions. Maximum likelihood (ML) [73], linear minimum mean square error (LMMSE) [83], maximum a posteriori (MAP) [74], the variational approach [38, 76] and simulated annealing [77] are some of the techniques used. Bayesian methods usually end up with an estimation of the HRMS image z that is a convex combination of the LRMS image, upsampled to the size of the HRMS image by inverting the degradation model, the PAN image, and the prior knowledge about the HRMS image. The combination factors usually are pixel adaptive and related to the spectral characteristics of the images.
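To make the inference step concrete, the following sketch computes a MAP estimate by gradient descent on the negative log-posterior built from Equations 9, 13 and 14. It is an illustration under stated assumptions rather than any specific published algorithm: a Gaussian blur stands in for the sensor MTF, decimation is plain subsampling, the spectral weights lam are taken as known, and the step size and iteration count are merely illustrative.

```python
import numpy as np
from scipy import ndimage

def map_pansharpen(y, x, lam, r=4, beta=1.0, gamma=1.0, alpha=0.01,
                   n_iter=200, step=0.2):
    # MAP estimate of the HRMS image z from the LRMS image y ((B, h, w)),
    # the PAN image x ((r*h, r*w)) and spectral weights lam ((B,)), using
    # y_b = H z_b + n_s (Eq. 13), x = sum_b lam_b z_b + n_p (Eq. 14) and
    # the SAR smoothness prior ||C z_b||^2 (Eq. 9).
    blur = lambda img: ndimage.gaussian_filter(img, sigma=1.0)  # self-adjoint B
    z = np.stack([ndimage.zoom(band, r, order=1) for band in y])  # initialization
    for _ in range(n_iter):
        pan_res = x - np.tensordot(lam, z, axes=1)      # PAN residual
        grad = np.empty_like(z)
        for b in range(z.shape[0]):
            ms_res = np.zeros(x.shape)                  # zero-upsampled residual
            ms_res[::r, ::r] = y[b] - blur(z[b])[::r, ::r]
            grad[b] = (-beta * blur(ms_res)             # data term (Eq. 13)
                       - gamma * lam[b] * pan_res       # PAN term (Eq. 14)
                       + alpha * ndimage.laplace(ndimage.laplace(z[b])))  # prior
        z -= step * grad                                # gradient descent step
    return z
```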

Although all the approaches already mentioned use the hypothesis of Gaussian additive noise for mathematical convenience, in practice, remote sensing imagery noise shows non-Gaussian characteristics [84]. In some applications, such as astronomical image restoration, Poisson noise is usually assumed, or a shaping filter [85] may be used in order to transform non-Gaussian noise into Gaussian noise. Recently, Niu et al.[84] proposed the use of a mixture of Gaussians (MoG) noise model for multisensor fusion problems.

3.5 Multiresolution Family

In order to extract or modify the spatial information in remote sensing images, spatial transforms also represent a very interesting tool. Some of these transforms use only local image information (i.e., within a relatively small neighborhood of a given pixel), such as convolution, while others use the frequency content, such as the Fourier transform. Besides these two extreme transformations, there is a need for a data representation allowing access to spatial information over a wide range of scales, from local to global [8]. This increasingly important category of scale-space filters utilizes multiscale decomposition techniques such as Laplacian pyramids [86], the wavelet transform [41], the contourlet transform [43] and the curvelet transform. These techniques are used in pansharpening to decompose the MS and PAN images into different levels in order to derive spatial details that are imported into finer scales of the MS images, highlight the relationship between the PAN and MS images in coarser scales and enhance spatial details [87]. This is the idea behind the methods based on the successful ARSIS ("Amélioration de la Résolution Spatiale par Injection de Structures") concept [46].

We will now describe each of the above multiresolution methods and their different types in detail.

Multiresolution analysis based on the Laplacian pyramid (LP), originally proposed in [86], is a bandpass image decomposition derived from the Gaussian pyramid (GP), which is a multiresolution (multiscale) image representation obtained through recursive reductions of the image. The LP is an oversampled transform that decomposes the image into nearly disjoint bandpass channels in the spatial frequency domain, without losing the spatial connectivity of its edges [88]. Figure 5 shows the concept of the GP and its relation to the LP. The generalized Laplacian pyramid (GLP) is an extension of the LP where a scale factor different from two is used [89]. An attractive characteristic of the GLP is that the low-pass reduction filter, used to analyze the PAN image, may be designed to match the MTF of the band into which the extracted details will be injected. The benefit is that the restoration of the spatial frequency content of the MS band is embedded into the enhancement procedure of the band itself, instead of being accomplished ahead of time.

Figure 5. Laplacian pyramid created from Gaussian pyramid by subtraction.

The steps for merging Landsat images using this GLP are described in Algorithm 4, where different injection methods can be used with the GLP [40, 90]. In this context, injection means adding the details from the GLP to each MS band, weighted by the coefficients obtained by the injection method. The Spectral Distortion Minimizing (SDM) injection model is a spatially and spectrally varying model in which the injected details at a pixel position must be parallel to the resampled MS vector at the same resolution; at the same time, the details are weighted to minimize the spectral distortion, measured by the Spectral Angle Mapper (SAM). In the Context-Based Decision (CBD) injection model, the weights are calculated locally between the MS band resampled to the scale of the PAN image and an approximation of the PAN image at the resolution of the MS bands. Details are injected only if the local correlation coefficient between those images, calculated on a window of size N × N, is larger than a given threshold. The CBD model is uniquely defined by the set of thresholds, generally different for each band, and by the window size N, which depends on the spatial resolutions and scale ratio of the images to be merged, as well as on the landscape characteristics, to avoid loss of local sensitivity [40]. A pansharpened image obtained with the GLP method is shown in Figure 6a.

Figure 6. Results of some multiresolution pansharpening methods using SPOT 5 images.

Algorithm 4 Generalized Laplacian pyramid-based pansharpening

  1. Upsample each MS band to the size of the PAN image.

  2. Apply the GLP to the PAN image.

  3. According to the injection model, select the weights for the details from the GLP at each level.

  4. Obtain the pansharpened image by adding the details from the GLP to each MS band, weighted by the coefficients obtained in the previous step.
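
To make the procedure concrete, a minimal single-level sketch is given below in Python with NumPy/SciPy. The Gaussian low-pass filter, the resolution ratio of 4 and the global standard-deviation gain are illustrative assumptions; in practice an MTF-matched filter and the SDM/CBD injection models described above would take their place:

```python
import numpy as np
from scipy import ndimage

def glp_details(pan, ratio=4, sigma=2.0):
    """Single-level GLP detail extraction from the PAN image.  The Gaussian
    low-pass is an illustrative stand-in for an MTF-matched filter."""
    low = ndimage.gaussian_filter(pan, sigma)        # low-pass (reduce step)
    low = low[::ratio, ::ratio]                      # decimate by the ratio
    exp = ndimage.zoom(low, ratio, order=3)          # expand back to PAN size
    exp = exp[:pan.shape[0], :pan.shape[1]]          # guard rounding effects
    return pan - exp                                 # bandpass detail plane

def glp_pansharpen(ms_bands, pan, ratio=4):
    """Algorithm 4 with a crude global injection gain per band; the SDM and
    CBD models compute spatially varying weights instead.  ms_bands is a
    list of LR bands; the PAN dimensions are assumed ratio times larger."""
    details = glp_details(pan, ratio)
    fused = []
    for band in ms_bands:
        up = ndimage.zoom(band, ratio, order=3)      # step 1: upsample band
        up = up[:pan.shape[0], :pan.shape[1]]
        gain = band.std() / (pan.std() + 1e-12)      # illustrative weight
        fused.append(up + gain * details)            # steps 3 and 4
    return np.stack(fused)
```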

The Ranchin-Wald-Mangolini (RWM) injection model [40], unlike the SDM and CBD models, is calculated on bandpass details instead of approximations. RWM models the MS details as a space- and spectrally-varying linear combination of the PAN image coefficients.

Another popular category of multiresolution pansharpening methods comprises those based on wavelets and contourlets. Wavelets provide a framework for the decomposition of images into a hierarchy with decreasing degrees of resolution, separating detailed spatial information between successive levels [91]. The discrete version of the wavelet transform, the discrete wavelet transform (DWT), can be computed using several different algorithms, probably the most popular ones for image pansharpening being Mallat's [42, 46, 92, 93] and the "à trous" [13, 94, 95] algorithms. Each has its particular mathematical properties and leads to a different image decomposition. The first is an orthogonal, dyadic, non-symmetric, decimated, non-redundant DWT algorithm, while "à trous" is a non-orthogonal, shift-invariant, dyadic, symmetric, undecimated, redundant DWT algorithm [91]. Redundant wavelet decompositions, like the GLP, have an attractive characteristic: the low-pass filter used to analyze the PAN image may easily be designed to match the MTF of the band to be enhanced. If the filters are correctly chosen, the high spatial frequency components extracted from the PAN image are properly retained, resulting in greater spatial enhancement. It is important to note that undecimated, shift-invariant decompositions, and specifically "à trous" wavelets, where sub-band and original image pixels correspond to the same locations, produce fewer artifacts and better preserve the linear continuity of features that do not have a horizontal or vertical orientation [96], and hence are better suited for image fusion.
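
As an illustration, the "à trous" decomposition can be sketched as follows in Python with NumPy/SciPy. The B3-spline scaling kernel is the customary choice in the cited literature; the rest of the structure is an illustrative assumption:

```python
import numpy as np
from scipy import ndimage

B3 = np.array([1., 4., 6., 4., 1.]) / 16.0   # B3-spline scaling kernel

def a_trous(image, levels=3):
    """Undecimated 'a trous' decomposition: at level j the kernel is dilated
    by inserting 2**j - 1 zeros between its taps ('with holes'); each
    wavelet plane is the difference of two successive smoothings."""
    planes, current = [], image.astype(float)
    for j in range(levels):
        kernel = np.zeros(4 * 2**j + 1)
        kernel[::2**j] = B3                  # dilated separable kernel
        smooth = ndimage.convolve1d(current, kernel, axis=0, mode='reflect')
        smooth = ndimage.convolve1d(smooth, kernel, axis=1, mode='reflect')
        planes.append(current - smooth)      # detail plane at scale j
        current = smooth
    return planes, current                   # detail planes and residual

# Exact reconstruction: image == sum(planes) + residual, which is what makes
# this redundant transform convenient for detail injection.
```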

Contourlets provide a new representation system for image analysis [43]. The contourlet transform is so called because of its ability to capture and link discontinuity points into linear structures (contours). The two-stage process used to derive the contourlet coefficients involves a multiscale transform and a local directional transform. First, a multiscale LP that detects discontinuities is applied. Then, a local directional filter bank is used to group these wavelet-like coefficients to obtain a smooth contour. Contourlets provide 2^l directions at each scale, where l is the number of directional decomposition levels. This flexibility of having different numbers of directions at each scale makes contourlets different from other available multiscale and directional image representations [53]. Similarly to wavelets, contourlets also have subsampled and non-subsampled [43] implementations of the transform.

Algorithm 5 Wavelet/contourlet-based pansharpening

  1. Forward transform the PAN and MS images using a sub-band and directional decomposition such as the subsampled or non-subsampled wavelet or contourlet transform.

  2. Apply a fusion rule to the transform coefficients.

  3. Obtain the pansharpened image by performing the inverse transform.

A number of pansharpening methods using the wavelet and, more recently, the contourlet transform have been proposed. In general, all transform-based pansharpening methods follow the process in Algorithm 5. In the wavelet/contourlet-based approach, the MS and PAN images are decomposed into multiple levels in step 1 of Algorithm 5.

Preliminary studies have shown that the quality of the pansharpened images produced by wavelet-based techniques is a function of the number of decomposition levels. If too few decomposition levels are applied, the spatial quality of the pansharpened images is unsatisfactory, whereas if an excessive number of levels is applied, the spectral similarity between the original MS and the pansharpened images decreases. Pradhan et al. [97] attempt to determine the number of decomposition levels that yields the optimal spatial and spectral quality in wavelet-based pansharpening.

The fusion rules in step 2 of the algorithm comprise, for instance, substituting the coefficients of the original MS bands with those of the PAN image, or adding the coefficients of the PAN image to those of the original MS bands, sometimes weighted by a factor that depends on the contribution of the PAN image to each MS band. These choices give rise to the different wavelet- and contourlet-based pansharpening methods described next.

The additive wavelet/contourlet method for fusing MS and PAN images uses the wavelet [91]/contourlet [44] transform in steps 1 and 3 of Algorithm 5; as the fusion rule in step 2, it adds the detail bands extracted from the PAN image to the corresponding bands of the MS image, the histogram of the MS image having previously been matched to that of the PAN image.
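
A minimal sketch of the additive "à trous" fusion is given below in Python with NumPy/SciPy. The B3-spline kernel is the usual choice; the mean/std adjustment is our simplification of the histogram-matching step (an assumption; full histogram matching is used in practice). Note that the sum of the detail planes equals the image minus its coarsest approximation, which keeps the code compact:

```python
import numpy as np
from scipy import ndimage

def atrous_smooth(img, levels=2):
    """Successive B3-spline smoothings of the 'a trous' scheme; the sum of
    the detail planes equals img minus this coarsest approximation."""
    b3 = np.array([1., 4., 6., 4., 1.]) / 16.0
    cur = img.astype(float)
    for j in range(levels):
        k = np.zeros(4 * 2**j + 1)
        k[::2**j] = b3                               # dilated kernel
        cur = ndimage.convolve1d(cur, k, axis=0, mode='reflect')
        cur = ndimage.convolve1d(cur, k, axis=1, mode='reflect')
    return cur

def additive_wavelet(ms_up_bands, pan, levels=2):
    """Additive 'a trous' fusion sketch: the PAN wavelet detail planes are
    added to each (already upsampled) MS band."""
    fused = []
    for band in ms_up_bands:
        # crude histogram matching of PAN to the band (mean/std adjustment)
        p = (pan - pan.mean()) / (pan.std() + 1e-12) * band.std() + band.mean()
        fused.append(band + (p - atrous_smooth(p, levels)))
    return np.stack(fused)
```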

The substitutive wavelet/contourlet methods are quite similar to the additive ones but, instead of adding the information from the PAN image to each band of the MS image, they simply replace the MS detail bands with the details obtained from the PAN image (see [94] for wavelet and [98] for contourlet decomposition).

A number of hybrid methods have been developed that attempt to combine the best aspects of the classical methods and the wavelet and contourlet transforms. Research has mainly focused on incorporating IHS, PCA and BT into wavelet and contourlet methods.

As we have seen, some of the most popular image pansharpening methods are based on the IHS transformation. The main drawback of these methods is the high distortion of the original spectral information in the resulting MS images. To avoid this problem, the IHS transformation is followed by the additive wavelet or contourlet method in the so-called wavelet [91] and contourlet [99, 100] additive IHS pansharpening methods. If the IHS transform is followed by the substitutive wavelet method, the wavelet substitutive IHS [101] pansharpening method is obtained. A sketch of the additive variant follows.
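
The sketch below illustrates the hybrid IHS + additive-detail idea in Python with NumPy: spatial detail is injected only into the intensity component, leaving hue and saturation untouched. It assumes the linear intensity I = (R+G+B)/3 of the fast IHS variants, for which substituting the new intensity and inverting the transform is equivalent to adding the detail to every band:

```python
import numpy as np

def ihs_additive_fusion(ms_rgb_up, pan_details):
    """Hybrid IHS + additive fusion sketch.  ms_rgb_up is a (3, H, W) array
    of upsampled MS bands; pan_details is the sum of PAN detail planes from
    any multiresolution decomposition (e.g. the a trous sketch above).
    With the linear intensity I = (R+G+B)/3, replacing I by I + d and
    inverting the IHS transform adds d to every band (the classical
    fast-IHS shortcut), so hue and saturation are preserved."""
    return ms_rgb_up + pan_details[None, :, :]
```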

Similarly to the IHS wavelet/contourlet methods, the PCA wavelet [68, 91]/contourlet [53] methods are based on applying the substitutive wavelet/contourlet method to the first principal component (PC1) instead of to the individual bands of the MS image. Adaptive PCA has also been applied in combination with contourlets [53].

The WiSpeR [102] method can be considered a generalization of different wavelet-based image fusion methods. It uses a modification of the non-subsampled additive wavelet algorithm in which the contribution of the PAN image to each of the fused bands depends on a factor generated from both the sensor spectral response and the physical properties of the observed object. A new contourlet pansharpening method named CiSper was proposed in [45] that, similarly to WiSpeR, weights the contribution of the PAN image to each MS band, but it uses a different method to calculate these weights and employs the non-subsampled contourlet transform instead of the wavelet transform. In order to take advantage of multiresolution analysis, pansharpening based on the statistics of the image in the wavelet/contourlet domain has also been suggested [103, 104]. Pansharpened images obtained with wavelet- and contourlet-based methods are shown in Figure 6b-f.

Some authors [41, 42] state that multisensor image fusion is a trade-off between the spectral information from an MS sensor and the spatial information from a PAN sensor, and that wavelet transform fusion methods easily control this trade-off. The trade-off idea, however, is just a convenient simplification, as discussed in [10]; ideal fusion methods must be able to reach both spectral and spatial quality simultaneously, not one at the expense of the other. To do so, the physics of the capture process has to be taken into account, and the methods have to adapt to the local properties of the images.

4 Quality assessment

In the previous section, a number of different pansharpening algorithms were described that produce images with both high spatial and spectral resolution. The suitability of these images for various applications depends on their spectral and spatial quality. Besides visual analysis, there is a need to quantitatively assess the quality of different pansharpened images. Quantitative assessment is not easy, since the images to be compared are at different spatial and spectral resolutions. Wald et al. [67] stated that the pansharpened image should have the following properties:

  1. Any pansharpened image, once downsampled to its original spatial resolution, should be as similar as possible to the original image.

  2. Any pansharpened image should be as similar as possible to the image that a corresponding sensor would observe with the same high spatial resolution.

  3. The MS set of pansharpened images should be as similar as possible to the MS set of images that a corresponding sensor would observe with the same high spatial resolution.

These three properties have been reduced to two: the consistency property and the synthesis property [105]. The consistency property is the same as the first property, while the synthesis property combines the second and third properties defined in [67]. The synthesis property emphasizes synthesis at the actual higher spatial and spectral resolution. Note that the reference image for the pansharpening process is the MS image at the resolution of the PAN image. Since this image is not available, Wald et al. [67] proposed a protocol for quality assessment and several quantitative measures for testing the three properties. The consistency property is verified by downsampling the fused image from the higher spatial resolution h to its original spatial resolution l using suitable filters. To verify the synthesis property, the original PAN image at resolution h and the MS image at resolution l are downsampled to lower resolutions l and v, respectively. Then, the PAN image at resolution l and the MS image at resolution v are fused to obtain a fused MS image at resolution l, which can then be compared with the original MS image. The quality assessed at resolution l is assumed to be close to the quality at resolution h. This reduces the problem of missing reference images; however, the quality at the higher resolution cannot, in general, be predicted from the quality at the lower resolution [106]. Recently, a set of methods has been proposed to assess the quality of the pansharpened image without requiring a reference image. Those methods aim at providing reliable quality measures at full scale following Wald's protocol. A sketch of the synthesis-property check is given below.
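
The following minimal sketch, in Python with NumPy/SciPy, illustrates the synthesis-property check; the Gaussian low-pass filter and the resolution ratio of 4 stand in for the sensor-matched filters of the actual protocol, and `fuse` is any pansharpening routine:

```python
import numpy as np
from scipy import ndimage

def synthesis_check(fuse, ms_bands, pan, ratio=4, sigma=2.0):
    """Degrade PAN and MS by the resolution ratio, fuse at the reduced
    scale, and compare against the original MS bands as reference.
    `fuse` takes (list of LR bands, PAN) and returns stacked fused bands
    at the scale of its PAN argument."""
    degrade = lambda im: ndimage.gaussian_filter(im, sigma)[::ratio, ::ratio]
    pan_l = degrade(pan)                      # PAN brought down to scale l
    ms_v = [degrade(b) for b in ms_bands]     # MS brought down to scale v
    fused_l = fuse(ms_v, pan_l)               # fused product at scale l
    ref = np.stack([np.asarray(b, float) for b in ms_bands])
    return np.sqrt(((fused_l - ref) ** 2).mean(axis=(1, 2)))  # per-band RMSE
```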

4.1 Visual analysis

Visual analysis is needed to check whether the objective of pansharpening has been met. The general visual quality measures are the global image quality (geometric shape, size of objects), the spatial details and the local contrast. Some visual quality parameters for testing the properties are [105]: (1) spectral preservation of features in each MS band, where the appearance of the objects in the pansharpened images is analyzed in each band based on the appearance of the same objects in the original MS images; (2) multispectral synthesis in the pansharpened images, where different color composites of the fused images are analyzed and compared with those of the original images to verify that the MS characteristics of objects at the higher spatial resolution are similar to those of the original images; and (3) synthesis of images close to the actual images at high resolution, as defined by the synthesis property, which cannot be directly verified but can be analyzed using our knowledge of the spectra of the objects present at the lower spatial resolution.

4.2 Quantitative analysis

A set of measures has been proposed to quantitatively assess the spectral and spatial quality of the images. In this section, we present the most commonly used measures for this purpose.

Spectral quality assessment

To measure the spectral distortion due to the pansharpening process, each merged image is compared to the reference MS image, using one or more of the following quantitative indicators:

  1. Spectral Angle Mapper (SAM): SAM denotes the absolute value of the angle between two vectors whose elements are the values of the pixels in the different bands of the HRMS image and the MS image at each image location (a computational sketch follows this list). A SAM value equal to zero denotes the absence of spectral distortion, although radiometric distortion may still be present (the two pixel vectors are parallel but have different lengths). SAM is measured either in degrees or in radians and is usually averaged over the whole image to yield a global measure of spectral distortion [107].

  2. Relative-shift mean (RM): The RM [108] of each band of the fused image helps to visualize the change in the histogram of the fused image and is defined in [108] as the percentage of variation between the means of the reference and pansharpened images.

  3. Correlation coefficient (CC): The CC between each band of the reference image and the pansharpened image indicates the spectral integrity of the pansharpened image [62]. However, CC is insensitive to a constant gain and bias between two images and does not allow for subtle discrimination of possible pansharpening artifacts [14]. CC should be as close to 1 as possible.

  4. Root mean square error (RMSE): The RMSE between each band of the reference and the pansharpened image measures the changes in radiance of the pixel values [67]. RMSE is a very good indicator of spectral quality when it is computed over homogeneous regions of the image [108]. RMSE should be as close to 0 as possible.

  5. Structure Similarity Index (SSIM): SSIM [109] is a perceptual measure that combines several factors related to the way humans perceive the quality of images. Besides luminance and contrast distortion, structural distortion is considered in the SSIM index, calculated locally in 8 × 8 square windows. The value varies between -1 and 1; values close to 1 indicate the highest correspondence with the original images.

The Universal Image Quality Index (UIQI), proposed in [110], can be considered a special case of the SSIM index. A computational sketch of SAM and UIQI is given below.
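
The following Python/NumPy sketch computes a per-pixel SAM map and a global UIQI value; the windowed averaging that is the usual practice for UIQI is omitted for brevity:

```python
import numpy as np

def sam_map(ref, fused, degrees=True):
    """Spectral Angle Mapper: angle between the spectral (band) vectors of
    the reference and fused images at every pixel; inputs are (bands, H, W)
    arrays.  Average the result for a global spectral distortion score."""
    dot = (ref * fused).sum(axis=0)
    norms = np.linalg.norm(ref, axis=0) * np.linalg.norm(fused, axis=0)
    ang = np.arccos(np.clip(dot / (norms + 1e-12), -1.0, 1.0))
    return np.degrees(ang) if degrees else ang

def uiqi(x, y):
    """Universal Image Quality Index of Wang and Bovik, computed globally
    here: Q = 4*cov*mean_x*mean_y / ((var_x + var_y)(mean_x^2 + mean_y^2))."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return 4 * cxy * mx * my / ((vx + vy) * (mx**2 + my**2) + 1e-12)
```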

While these parameters evaluate the difference in spectral information between each band of the merged image and the reference image, the following parameters are used to estimate the global spectral quality of the merged images:

  1. The Erreur relative globale adimensionnelle de synthèse (ERGAS) index, whose English translation is relative dimensionless global error in fusion [111], is a global quality index sensitive to mean shifting and dynamic range change [112]. The lower the ERGAS value, especially one lower than the number of bands, the higher the spectral quality of the merged images (see the sketch after this list).

  2. The Mean SSIM (MSSIM) index and the average quality index (Qavg): these indices [109, 102] evaluate the overall SSIM and UIQI image quality by averaging those measures. The higher the value (the closer to one), the higher the spectral and radiometric quality of the merged images.

  3. Another global measure, Q4, proposed in [113], depends on the individual UIQI of each band but also on the spectral distortion, embodied by the spectral angle SAM. A limitation of this index is that it cannot be extended to images with more than four bands.
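
A minimal sketch of the ERGAS computation follows; the `scale_ratio` argument is h/l, the ratio of PAN to MS pixel sizes (for example 0.25 for a 4:1 enhancement):

```python
import numpy as np

def ergas(ref, fused, scale_ratio):
    """ERGAS = 100 * (h/l) * sqrt(mean_k[(RMSE_k / mu_k)^2]), where mu_k is
    the mean of reference band k.  ref and fused are (bands, H, W) arrays
    at the same scale."""
    rmse = np.sqrt(((ref - fused) ** 2).mean(axis=(1, 2)))
    mu = ref.mean(axis=(1, 2))
    return 100.0 * scale_ratio * np.sqrt(((rmse / mu) ** 2).mean())
```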

Spatial quality assessment

To assess the spatial quality of a pansharpened image, its spatial detail information must be compared to that present in the reference HRMS image. Only a few quantitative measures have been proposed in the literature to evaluate the spatial quality of merged images. Zhou [42] proposed the following procedure to estimate the spatial quality of a merged image: the spatial information present in each of its bands is compared with the spatial information present in the PAN image. First, a Laplacian filter is applied to both images under comparison; second, the correlation between the two filtered images is calculated, yielding the spatial correlation coefficient (SCC). However, the use of the PAN image as a reference is incorrect, as demonstrated in [10, 114]; the HRMS image has to be used instead, as done by Otazu et al. [102]. A high SCC indicates that much of the spatial detail information of one image is present in the other. The ideal SCC value for each band of the merged image is 1.
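
A sketch of the SCC computation is given below in Python with NumPy/SciPy; the 3 × 3 Laplacian kernel is one common choice:

```python
import numpy as np
from scipy import ndimage

LAPLACIAN = np.array([[-1., -1., -1.],
                      [-1.,  8., -1.],
                      [-1., -1., -1.]])

def scc(band, reference):
    """Spatial correlation coefficient: high-pass both images with a
    Laplacian filter, then correlate the responses.  As noted above, the
    reference should be the HRMS band when available, not the PAN image."""
    hp_a = ndimage.convolve(band.astype(float), LAPLACIAN, mode='reflect')
    hp_b = ndimage.convolve(reference.astype(float), LAPLACIAN, mode='reflect')
    return np.corrcoef(hp_a.ravel(), hp_b.ravel())[0, 1]
```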

Recently, a new spatial quality measure, related to quantitative edge analysis, was suggested in [97]. The authors claim that a good pansharpening technique should retain in the sharpened image all the edges present in the PAN image [97]. Thus, a Sobel edge operator is applied to the image in order to detect its edges, which are then compared with the edges of the PAN image. However, the concept behind this index is flawed, since the reference image is not the PAN image but the HRMS image [114].

Additionally, some spectral quality measures have been adapted to spatial quality assessment. Pradhan et al. [97] suggested using the structural information of the SSIM measure between the panchromatic and pansharpened images as a spatial quality measure. Lillo-Saavedra et al. [115] proposed the spatial ERGAS index, which includes in its definition the spatial RMSE, calculated between each fused spectral band and the image obtained by adjusting the histogram of the original PAN image to the histogram of the corresponding band of the fused MS image.

Although an exhaustive comparison of all the aforementioned pansharpening methods is beyond the scope of this paper, for the sake of reference we have included in Table 1 the figures of merit for some of the pansharpened images, already presented in this paper, obtained from the observed multispectral image shown in Figure 2a. The two best values for each measure are highlighted in Table 1. From the obtained results, it is clear that the BT, HPF and Price methods, depicted in Figures 2c, d and 3a, respectively, suffer the highest spectral distortion, with the lowest SSIM and MSSIM values and the highest ERGAS value. In this case, the IHS method (Figure 2b) and IHS-wavelet (Figure 6c) produce good numerical results, but note that these results have been obtained considering only the first three bands, the ones involved in the IHS transform. Methods based on multiresolution approaches, GLP (Figure 6a), additive wavelets (Figure 6b) and the method described in [103] (Figure 6f), provide the best results, with lower spectral distortion and higher SCC values. These results are consistent with the ones reported in [116], where the multiresolution methods perform better than the other methods. Note that the lower SSIM values for the GLP and AW methods are due to the ratio between band 4, which has a spatial resolution of 20 m per pixel, and the PAN image, which has a spatial resolution of 5 m per pixel, while the first three bands have a spatial resolution of 10 m per pixel.

Table 1 Numerical results for the presented pansharpening methods using the multispectral image in Figure 2a

For a comparison between some of the reported methods, the reader is referred, for instance, to the results of the 2006 GRS-S Data-Fusion Contest [116], where a set of eight methods, mainly CS and MRA based, were tested on a common set of images, or to [10], where the authors discuss from a theoretical point of view the strengths and weaknesses of CS, RSC and ARSIS concept implementations (which include the HFI and multiresolution families) and perform a comparison based on the fusion rule and the effect of this rule on the spectral and spatial distortion. Another comparative analysis was developed in [14], where a general image fusion (GIF) framework for pansharpening IKONOS images is proposed and the performance of several image fusion methods is analyzed based on the way they compute, from the LRMS image, an LR approximation of the PAN image, and on how the modulation coefficients for the detail information are defined. This work has been extended in [15] to also consider context-adaptive methods in the so-called extended GIF protocol. Recently, a comparison of pansharpening methods was carried out in [117] based on their performance in automatic classification and visual interpretation applications.

4.3 Quality assessment without a reference

The quantitative quality of data fusion methods can be assessed using reference images, usually obtained by degrading all the available data to a coarser resolution and carrying out the fusion on these degraded data. A set of global indices capable of measuring the quality of pansharpened MS images at full scale, without performing any preliminary degradation of the data, has recently been proposed.

The Quality with No Reference (QNR) index [118] comprises two indices, one pertaining to spectral distortion and the other to spatial distortion. As proposed in [118] and [119], the two distortions may be combined to yield a unique global quality measure. While the QNR measure proposed in [118] is based on the UIQI index, the one proposed in [119] is based on the mutual information (MI) between the different images. The spectral distortion index defined in [118] is derived from the differences between the inter-band UIQI values calculated among all the fused MS bands and those calculated among the resampled LRMS bands. The spatial distortion index defined in [118] is based on the differences between the UIQI of each band of the fused image and the PAN image and the UIQI of each band of the LRMS image and a low-resolution version of the PAN image. This LR image is obtained by filtering the PAN image with a low-pass filter with normalized frequency cutoff at the resolution ratio between the MS and PAN images, followed by decimation. The QNR index combines the spectral and spatial distortion indices into one single measure varying from zero to one; the maximum value is obtained only when there is neither spectral nor spatial distortion between the images. The main advantage of this index is that, despite the lack of a reference data set, the global quality of a fused image can be assessed at the full scale of the PAN image. A sketch of its computation follows.
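
The following Python/NumPy sketch illustrates the QNR computation, using a global UIQI and the simplifying assumption that the inner exponents of the distortion indices equal one:

```python
import numpy as np
from itertools import combinations

def _uiqi(x, y):
    """Global UIQI (see the sketch in the previous subsection)."""
    mx, my = x.mean(), y.mean()
    cxy = ((x - mx) * (y - my)).mean()
    return 4 * cxy * mx * my / ((x.var() + y.var()) * (mx**2 + my**2) + 1e-12)

def qnr(fused, ms_lr, pan, pan_lr, alpha=1.0, beta=1.0):
    """QNR sketch: D_lambda compares inter-band UIQI values of the fused
    product against those of the LR MS bands; D_s compares each band's UIQI
    with PAN across the two scales.  Shapes: fused (B, H, W), ms_lr (B, h, w),
    pan (H, W), pan_lr (h, w).  Returns a score in [0, 1], 1 being best."""
    bands = range(fused.shape[0])
    d_lam = np.mean([abs(_uiqi(fused[i], fused[j]) - _uiqi(ms_lr[i], ms_lr[j]))
                     for i, j in combinations(bands, 2)])
    d_s = np.mean([abs(_uiqi(fused[i], pan) - _uiqi(ms_lr[i], pan_lr))
                   for i in bands])
    return (1 - d_lam) ** alpha * (1 - d_s) ** beta
```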

The QNR method proposed in [119] is based on the MI measure instead of UIQI. The mutual information between resampled original and fused MS bands is used to measure the spectral quality, while the mutual information between the PAN image and the fused bands yields a measure of spatial quality.

Another protocol was proposed by Khan et al. [69] to assess spectral and spatial quality at full scale. For assessing spectral quality, the MTF of each spectral channel is used to low-pass filter the fused image. This filtered image, once decimated, gives a degraded LRMS image. For comparing the degraded and original low-resolution MS images, the Q4 index [113] is used. Note that the MTF filters of each sensor are different, and the exact filter response is not usually provided by the instrument manufacturers. However, the filter gain at the Nyquist cutoff frequency may be derived from on-orbit measurements. Using this information and assuming that the frequency response of each filter is approximately Gaussian shaped, the MTF filters for each sensor of each satellite can be estimated. To assess the spatial quality of the fused image, the high-pass complement of the MTF filters is used to extract the high-frequency information from the MS images at both high (fused) and low (original) resolutions. In addition, the PAN image is downscaled to the resolution of the original MS image, and high-frequency information is extracted from the high- and low-resolution PAN images. The UIQI value is calculated between the details of each MS band and the details of the PAN image at the two resolutions. The average of the absolute differences in the UIQI values across scales for each band produces the spatial index.

5 Conclusion

In this paper, we have provided a complete overview of the different methods proposed in the literature to tackle the pansharpening problem and classified them into different categories according to the main technique they use. As described in Sections 3.1 and 3.2, the classical CS and RSC methods provide pansharpened images of adequate quality for some applications, but they usually introduce high spectral distortion. Their results highly depend on the correlation between each spectral band and the PAN image. A clear trend in the CS family, as explained in Section 3.1, is to use transformations of the MS image so that the transformed image mimics the PAN image. In this sense, a linear combination of the MS bands is usually used, weighting each band with coefficients obtained either from the spectral response of the sensor or by minimizing, in the MMSE sense, the difference between the PAN image and this linear combination. Using this technique, the spectral distortion can be significantly reduced. Another important research area, already mentioned, is the local analysis of the images, producing methods that inject structures into the pansharpened image depending on the local properties of the image.

The high-frequency injection family, described in Section 3.3, can be considered the predecessor of the methods based on the ARSIS concept. HFI methods extract the details to be injected by low-pass filtering the PAN image with different filters. As we have seen, the use of the MTF of the sensor as the low-pass filter is preferable since, in our opinion, introducing the sensor characteristics into the fusion rule makes the method more realistic.

The described methods based on the statistics of the image provide a flexible and powerful way to model the image capture system as well as to incorporate the available knowledge about the HRMS image. As explained in Section 3.4, these methods make it possible to accurately model the relationship between the HRMS image and the MS and PAN images, incorporating the physics of the sensors (MTF, spectral response or noise properties, for instance) and the conditions in which the images were taken. Still, the models used so far are not very sophisticated, which leaves an open research area in this family.

Multiresolution analysis, as already mentioned, is one of the most successful approaches to the pansharpening problem. Most of these techniques have previously been classified as techniques relevant to the ARSIS concept. Decomposing the images at different resolution levels makes it possible to inject the details of the PAN image into the MS image depending on the context. From the methods described in Section 3.5, we can see that the generalized Laplacian pyramid and the redundant shift-invariant wavelet and contourlet transforms are the most popular multiresolution techniques applied to this fusion problem. From our point of view, the combination of multiresolution analysis with techniques that take into account the physics of the capture process will provide prominent methods in the near future.

Finally, we want to stress, again, the importance of a good protocol to assess the quality of the pansharpened image. In this sense, Wald's protocol, described in Section 4, is the most suitable assessment procedure when no reference image is available. Besides visual inspection, numerical indices provide a way to rank different methods and give an idea of their performance. Recent advances in full-scale quality measures, such as the ones presented in Section 4.3, set the trend for new pansharpening-specific measures that will have to be considered.

References

  1. SPOT Web Page. [Online] [http://www.spotimage.fr]
  2. Landsat 7 Web Page. [Online] [http://geo.arc.nasa.gov/sge/landsat/]
  3. IKONOS Web Page. [Online] [http://www.geoeye.com/CorpSite/products-and-services/imagery-sources/]
  4. QuickBird Web Page. [Online] [http://www.digitalglobe.com]
  5. OrbView Web Page. [Online] [http://www.orbital.com/SatellitesSpace/ImagingDefense/]
  6. Liu JG: Smoothing filter-based intensity modulation: A spectral preserve image fusion technique for improving spatial details. Int J Remote Sens 2000, 21: 3461-3472.
  7. Vijayaraj V: A quantitative analysis of pansharpened images. Master's thesis, Mississippi State University; 2004.
  8. Schowengerdt RA: Remote Sensing: Models and Methods for Image Processing. 3rd edition. Orlando, FL: Academic Press; 1997.
  9. Pohl C, van Genderen JL: Multi-sensor image fusion in remote sensing: Concepts, methods, and applications. Int J Remote Sens 1998, 19(5): 823-854.
  10. Thomas C, Ranchin T, Wald L, Chanussot J: Synthesis of multispectral images to high spatial resolution: A critical review of fusion methods based on remote sensing physics. IEEE Trans Geosci Remote Sens 2008, 46: 1301-1312.
  11. Ehlers M, Klonus S, Astrand P: Quality assessment for multi-sensor multi-date image fusion. Int Arch Photogramm Remote Sens Spat Inf Sci 2008, XXXVII(Part B4): 499-505.
  12. Bretschneider T, Kao O: Image fusion in remote sensing. Technical University of Clausthal, Germany.
  13. Ranchin T, Aiazzi B, Alparone L, Baronti S, Wald L: Image fusion: The ARSIS concept and some successful implementation schemes. ISPRS J Photogramm Remote Sens 2003, 58: 4-18.
  14. Wang Z, Ziou D, Armenakis C, Li D, Li Q: A comparative analysis of image fusion methods. IEEE Trans Geosci Remote Sens 2005, 43(6): 1391-1402.
  15. Aiazzi B, Baronti S, Lotti F, Selva M: A comparison between global and context-adaptive pansharpening of multispectral images. IEEE Geosci Remote Sens Lett 2009, 6(2): 302-306.
  16. Richards JA, Jia X: Remote Sensing Digital Image Analysis: An Introduction. 4th edition. Secaucus, NJ: Springer-Verlag; 2005.
  17. Parkinson CL, Ward A, King MD (Eds): Earth Science Reference Handbook: A Guide to NASA's Earth Science Program and Earth Observing Satellite Missions. National Aeronautics and Space Administration; 2006.
  18. Yang Y, Gao X: Remote sensing image registration via active contour model. Int J Electron Commun 2009, 63: 227-234.
  19. Zitova B, Flusser J: Image registration methods: A survey. Image Vis Comput 2003, 21: 977-1000.
  20. Gonzalez RC, Woods RE: Digital Image Processing. 3rd edition. Prentice Hall; 2008.
  21. Takehana S, Kashimura M, Ozawa S: Predictive interpolation of remote sensing images using vector quantization. IEEE Pacific Rim Conference on Communications, Computers and Signal Processing 1993, 1: 51-54.
  22. Teoh KK, Ibrahim H, Bejo SK: Investigation on several basic interpolation methods for the use in remote sensing application. Proceedings of the 2008 IEEE Conference on Innovative Technologies 2008.
  23. Dou W, Chen Y: An improved IHS image fusion method with high spectral fidelity. Int Arch Photogramm Remote Sens Spat Inf Sci 2008, XXXVII(Part B7): 1253-1256.
  24. Wong FH, Orth R: Registration of SEASAT/LANDSAT composite images to UTM coordinates. Proceedings of the Sixth Canadian Symposium on Remote Sensing 1980, 161-164.
  25. Rebillard P, Nguyen PT: An exploration of co-registered SIR-A, SEASAT and Landsat images. Proceedings of the International Symposium on Remote Sensing of Environment, RS for Exploration Geology 1982, 109-118.
  26. Simard R: Improved spatial and altimetric information from SPOT composite imagery. Proceedings of the International Symposium of Photogrammetry and Remote Sensing 1982, 434-440.
  27. Crist EP: Comparison of coincident Landsat-4 MSS and TM data over an agricultural region. Proceedings of the 50th Annual Meeting ASP-ACSM Symposium, ASPRS 1984, 508-517.
  28. Cliche G, Bonn F, Teillet P: Integration of the SPOT panchromatic channel into its multispectral mode for image sharpness enhancement. Photogramm Eng Remote Sens 1985, 51(3): 311-316.
  29. Welch R, Ehlers M: Merging multiresolution SPOT HRV and Landsat TM data. Photogramm Eng Remote Sens 1987, 53(3): 301-303.
  30. Haydn R, Dalke GW, Henkel J: Application of the IHS color transform to the processing of multisensor data and image enhancement. International Symposium on Remote Sensing of Arid and Semi-Arid Lands 1982, 599-616.
  31. Chavez PS Jr, Bowell JA: Comparison of the spectral information content of Landsat Thematic Mapper and SPOT for three different sites in the Phoenix, Arizona region. Photogramm Eng Remote Sens 1988, 54(12): 1699-1708.
  32. Gillespie AR, Kahle AB, Walker RE: Color enhancement of highly correlated images. II. Channel ratio and "chromaticity" transformation techniques. Remote Sens Environ 1987, 22: 343-365.
  33. Chavez PS, Kwarteng AY: Extracting spectral contrast in Landsat Thematic Mapper image data using selective principal component analysis. Photogramm Eng Remote Sens 1989, 55(3): 339-348.
  34. Chavez P, Sides S, Anderson J: Comparison of three different methods to merge multiresolution and multispectral data: Landsat TM and SPOT panchromatic. Photogramm Eng Remote Sens 1991, 57(3): 295-303.
  35. Price JC: Combining panchromatic and multispectral imagery from dual resolution satellite instruments. Remote Sens Environ 1987, 21: 119-128.
  36. Park J, Kang M: Spatially adaptive multi-resolution multispectral image fusion. Int J Remote Sens 2004, 25(23): 5491-5508.
  37. Mascarenhas NDA, Banon GJF, Candeias ALB: Multispectral image data fusion under a Bayesian approach. Int J Remote Sens 1996, 17: 1457-1471.
  38. Vega M, Mateos J, Molina R, Katsaggelos A: Super-resolution of multispectral images. The Computer Journal 2008, 1: 1-15.
  39. Aiazzi B, Alparone L, Baronti S, Garzelli A: Context-driven fusion of high spatial and spectral resolution images based on oversampled multiresolution analysis. IEEE Trans Geosci Remote Sens 2002, 40(10): 2300-2312.
  40. Garzelli A, Nencini F: Interband structure modeling for pan-sharpening of very high-resolution multispectral images. Information Fusion 2005, 6: 213-224.
  41. Mallat SG: A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans Pattern Anal Mach Intell 1989, 11(7): 674-693.
  42. Zhou J, Civco DL, Silander JA: A wavelet transform method to merge Landsat TM and SPOT panchromatic data. Int J Remote Sens 1998, 19(4): 743-757.
  43. da Cunha AL, Zhou J, Do MN: The nonsubsampled contourlet transform: Theory, design, and applications. IEEE Trans Image Process 2006, 15(10): 3089-3101.
  44. Gonzalo C, Lillo-Saavedra M: Multispectral images fusion by a joint multidirectional and multiresolution representation. Int J Remote Sens 2007, 28(18): 4065-4079.
  45. Amro I, Mateos J: Multispectral image pansharpening based on the contourlet transform. J Phys Conf Ser 2010, 206(1): 1-3.
  46. Ranchin T, Wald L: Fusion of high spatial and spectral resolution images: The ARSIS concept and its implementation. Photogramm Eng Remote Sens 2000, 66: 49-61.
  47. Tu TM, Su SC, Shyu HC, Huang PS: A new look at IHS-like image fusion methods. Inf Fusion 2001, 2(3): 177-186.
  48. Shettigara VK: A generalized component substitution technique for spatial enhancement of multispectral images using a higher resolution data set. Photogramm Eng Remote Sens 1992, 58(5): 561-567.
  49. Dou W, Chen Y, Li X, Sui DZ: A general framework for component substitution image fusion: An implementation using the fast image fusion method. Comput Geosci 2007, 33: 219-228.
  50. Aiazzi B, Baronti S, Selva M: Improving component substitution pansharpening through multivariate regression of MS + Pan data. IEEE Trans Geosci Remote Sens 2007, 45(10): 3230-3239.
  51. Alparone L, Aiazzi B, Baronti S, Garzelli A, Nencini F: A critical review of fusion methods for true colour display of very high resolution images of urban areas. 1st EARSeL Workshop of the SIG Urban Remote Sensing, Humboldt-Universität zu Berlin 2006.
  52. Rahmani S, Strait M, Merkurjev D, Moeller M, Wittman T: An adaptive IHS pan-sharpening method. IEEE Geosci Remote Sens Lett 2010, 7(3): 746-750.
  53. Shah VP, Younan NH, King RL: An efficient pan-sharpening method via a combined adaptive PCA approach and contourlets. IEEE Trans Geosci Remote Sens 2008, 46(5): 1323-1335.
  54. Aiazzi B, Baronti S, Selva M, Alparone L: Enhanced Gram-Schmidt spectral sharpening based on multivariate regression of MS and pan data. IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2006) 2006, 3806-3809.
  55. Carper WJ, Lillesand TM, Kiefer RW: The use of intensity-hue-saturation transformations for merging SPOT panchromatic and multispectral image data. Photogramm Eng Remote Sens 1990, 56(4): 459-467.
  56. Tu TM, Huang PS, Hung CL, Chang CP: A fast intensity-hue-saturation fusion technique with spectral adjustment for IKONOS imagery. IEEE Geosci Remote Sens Lett 2004, 1(4): 309-312.
  57. Laben CA, Brower BV: Process for enhancing the spatial resolution of multispectral imagery using pan-sharpening. US Patent 6 011 875; 2000.
  58. Farebrother RW: Gram-Schmidt regression. Applied Statistics 1974, 23: 470-476.
  59. SPOT Users Handbook. Centre National d'Etudes Spatiales (CNES) and SPOT Image, Toulouse, France; 1988.
  60. Ballester C, Caselles V, Igual L, Verdera J: A variational model for P+XS image fusion. Int J Comput Vis 2006, 69: 43-58.
  61. Cliche G, Bonn F, Teillet P: Integration of the SPOT panchromatic channel into its multispectral mode for image sharpness enhancement. Photogramm Eng Remote Sens 1985, 51(3): 311-316.
  62. Vijayaraj V, O'Hara CG, Younan NH: Quality analysis of pansharpened images. Proc IEEE Int Geosci Remote Sens Symp (IGARSS '04) 2004, 1: 20-24.
  63. Alparone L, Facheris L, Baronti S, Garzelli A, Nencini F: Fusion of multispectral and SAR images by intensity modulation. Proceedings of the 7th International Conference on Information Fusion 2004, 637-643.
  64. Schowengerdt RA: Reconstruction of multispatial, multispectral image data using spatial frequency content. Photogramm Eng Remote Sens 1980, 46(10): 1325-1334.
  65. Chavez PS Jr: Digital merging of Landsat TM and digitized NHAP data for 1:24,000 scale image mapping. Photogramm Eng Remote Sens 1986, 52(10): 1637-1646.
  66. Tsai VJD: Frequency-based fusion of multiresolution images. 2003 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2003) 2003, 6: 3665-3667.
  67. Wald L, Ranchin T, Mangolini M: Fusion of satellite images of different spatial resolutions: Assessing the quality of resulting images. Photogramm Eng Remote Sens 1997, 63: 691-699.
  68. González-Audícana M, Saleta J, García Catalán R, García R: Fusion of multispectral and panchromatic images using improved IHS and PCA mergers based on wavelet decomposition. IEEE Trans Geosci Remote Sens 2004, 42(6): 1291-1298.
  69. Khan MM, Alparone L, Chanussot J: Pansharpening quality assessment using the modulation transfer functions of instruments. IEEE Trans Geosci Remote Sens 2009, 47(11): 3880-3891.
  70. Thomas C, Wald L: A MTF-based distance for the assessment of geometrical quality of fused products. 2006 9th International Conference on Information Fusion 2006, 1-7.
  71. Price JC: Combining multispectral data of differing spatial resolution. IEEE Trans Geosci Remote Sens 1999, 37(3): 1199-1203.
  72. Punska O: Bayesian approaches to multi-sensor data fusion. Master's thesis, University of Cambridge; 1999.
  73. Fasbender D, Radoux J, Bogaert P: Bayesian data fusion for adaptable image pansharpening. IEEE Trans Geosci Remote Sens 2008, 46: 1847-1857.
  74. Hardie RC, Eismann MT, Wilson GL: MAP estimation for hyperspectral image resolution enhancement using an auxiliary sensor. IEEE Trans Image Process 2004, 13(9): 1174-1184.
  75. Molina R, Vega M, Mateos J, Katsaggelos AK: Variational posterior distribution approximation in Bayesian super resolution reconstruction of multispectral images. Appl Comput Harmon Anal 2007, 12: 1-27.
  76. Mateos J, Vega M, Molina R, Katsaggelos AK: Super resolution of multispectral images using TV image models. 12th Int Conf on Knowledge-Based and Intelligent Information & Engineering Systems 2008, 408-415.
  77. Kitaw HG: Image pan-sharpening with Markov random field and simulated annealing. PhD dissertation, International Institute for Geo-information Science and Earth Observation, NL; 2007.
  78. Eismann MT, Hardie RC: Hyperspectral resolution enhancement using high-resolution multispectral imagery with arbitrary response functions. IEEE Trans Geosci Remote Sens 2005, 43(3): 455-465.
  79. Vega M, Mateos J, Molina R, Katsaggelos A: Super resolution of multispectral images using l1 image models and interband correlations. 2009 IEEE International Workshop on Machine Learning for Signal Processing 2009, 1-6.
  80. Eismann MT, Hardie RC: Application of the stochastic mixing model to hyperspectral resolution enhancement. IEEE Trans Geosci Remote Sens 2004, 42(9): 1924-1933.
  81. Katsaggelos A, Molina R, Mateos J: Super Resolution of Images and Video. Synthesis Lectures on Image, Video, and Multimedia Processing. Morgan & Claypool; 2007.
  82. Molina R, Mateos J, Katsaggelos AK, Milla RZ: A new super resolution Bayesian method for pansharpening Landsat ETM+ imagery. 9th International Symposium on Physical Measurements and Signatures in Remote Sensing (ISPMSRS) 2005, 280-283.
  83. Rong GZ, Bin W, Ming ZL: Remote sensing image fusion based on Bayesian linear estimation. Science in China Series F: Information Sciences 2007, 50(2): 227-240.
  84. Niu W, Zhu J, Gu W, Chu J: Four statistical approaches for multisensor data fusion under non-Gaussian noise. IITA International Conference on Control, Automation and Systems Engineering 2009.
  85. Brandenburg L, Meadows H: Shaping filter representation of nonstationary colored noise. IEEE Trans Inf Theory 1971, 17: 26-31.
  86. Burt PJ, Adelson EH: The Laplacian pyramid as a compact image code. IEEE Trans Commun 1983, COM-31(4): 532-540.
  87. Zhang J: Multi-source remote sensing data fusion: Status and trends. Int J Image Data Fusion 2010, 1(1): 5-24.
  88. Alparone L, Baronti S, Garzelli A: Assessment of image fusion algorithms based on noncritically-decimated pyramids and wavelets. Proc IEEE 2001 International Geoscience and Remote Sensing Symposium (IGARSS '01) 2001, 2: 852-854.
  89. Kim MG, Dinstein I, Shaw L: A prototype filter design approach to pyramid generation. IEEE Trans Pattern Anal Mach Intell 1993, 15(12): 1233-1240.
  90. Aiazzi B, Alparone L, Baronti S, Garzelli A: Spectral information extraction by means of MS+PAN fusion. Proceedings of ESA-EUSC 2004: Theory and Applications of Knowledge-Driven Image Information Mining with Focus on Earth Observation 2004, 20.1.
  91. González-Audícana M, Otazu X: Comparison between Mallat's and the 'à trous' discrete wavelet transform based algorithms for the fusion of multispectral and panchromatic images. Int J Remote Sens 2005, 26(3): 595-614.
  92. Garguet-Duport B, Girel J, Chassery JM, Pautou G: The use of multiresolution analysis and wavelets transform for merging SPOT panchromatic and multispectral image data. Photogramm Eng Remote Sens 1996, 62(9): 1057-1066.
  93. Yocky DA: Image merging and data fusion by means of the discrete two-dimensional wavelet transform. J Opt Soc Am A 1995, 12(9): 1834-1841.
  94. Nuñez J, Otazu X, Fors O, Prades A, Pala V, Arbiol R: Multiresolution-based image fusion with additive wavelet decomposition. IEEE Trans Geosci Remote Sens 1999, 37(3): 1204-1211.
  95. Chibani Y, Houacine A: The joint use of IHS transform and redundant wavelet decomposition for fusing multispectral and panchromatic images. Int J Remote Sens 2002, 23(18): 3821-3833.
  96. Amolins K, Zhang Y, Dare P: Wavelet based image fusion techniques: An introduction, review and comparison. ISPRS J Photogramm Remote Sens 2007, 249-263.
  97. Pradhan PS, King RL, Younan NH, Holcomb DW: Estimation of the number of decomposition levels for a wavelet-based multiresolution multisensor image fusion. IEEE Trans Geosci Remote Sens 2006, 44(12): 3674-3686.
  98. Song M, Chen X, Guo P: A fusion method for multispectral and panchromatic images based on HSI and contourlet transformation. Proc 10th Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS '09) 2009, 77-80.
  99. Alejaily AM, Rube IAE, Mangoud MA: Fusion of remote sensing images using contourlet transform. Springer Science 2008, 213-218.
  100. Yang S, Wang M, Lu YX, Qi W, Jiao L: Fusion of multiparametric SAR images based on SW-nonsubsampled contourlet and PCNN. Signal Processing 2009, 89: 2596-2608.
  101. Wu J, Huang H, Liu J, Tian J: Remote sensing image data fusion based on IHS and local deviation of wavelet transformation. Proc IEEE Int Conf on Robotics and Biomimetics (ROBIO 2004) 2004, 564-568.
  102. Otazu X, González-Audícana M, Fors O, Núñez J: Introduction of sensor spectral response into image fusion methods: Application to wavelet-based methods. IEEE Trans Geosci Remote Sens 2005, 43(10): 2376-2385.
  103. Amro I, Mateos J, Vega M: General contourlet pansharpening method using Bayesian inference. 2010 European Signal Processing Conference (EUSIPCO-2010) 2010, 294-298.
  104. Zhang Y, Backer SD, Scheunders P: Bayesian fusion of multispectral and hyperspectral image in wavelet domain. IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2008) 2008, 5: 69-72.
  105. Meenakshisundaram V: Quality assessment of IKONOS and QuickBird fused images for urban mapping. Master's thesis, University of Calgary; 2005.
  106. Wald L: Data Fusion: Definitions and Architectures. Fusion of Images of Different Spatial Resolutions. Les Presses de l'Ecole des Mines, Paris; 2002.
  107. Nencini F, Garzelli A, Baronti S, Alparone L: Remote sensing image fusion using the curvelet transform. Information Fusion 2007, 8: 143-156.
  108. Vijayaraj V, Younan NH, O'Hara CG: Quantitative analysis of pansharpened images. Optical Engineering 2006, 45(4): 046202.
  109. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP: Image quality assessment: From error visibility to structural similarity. IEEE Trans Image Process 2004, 13: 600-612.
  110. Wang Z, Bovik AC: A universal image quality index. IEEE Signal Process Lett 2002, 9(3): 81-84.
  111. Wald L: Quality of high resolution synthesized images: Is there a simple criterion? Proc Int Conf Fusion of Earth Data 2000, 1: 99-105.
  112. Du Q, Younan NH, King R, Shah VP: On the performance evaluation of pan-sharpening techniques. IEEE Geosci Remote Sens Lett 2007, 4(4): 518-522.
  113. Alparone L, Baronti S, Garzelli A, Nencini F: A global quality measurement of pan-sharpened multispectral imagery. IEEE Geosci Remote Sens Lett 2004, 1(4): 313-317.
  114. Thomas C, Wald L: Comparing distances for quality assessment of fused images. Proceedings of the 26th EARSeL Symposium "New Strategies for European Remote Sensing" 2007, 101-111.
  115. Lillo-Saavedra M, Gonzalo C, Arquero A, Martinez E: Fusion of multispectral and panchromatic satellite sensor imagery based on tailored filtering in the Fourier domain. Int J Remote Sens 2005, 26: 1263-1268.
  116. Alparone L, Wald L, Chanussot J, Gamba P, Bruce LM: Comparison of pansharpening algorithms: Outcome of the 2006 GRS-S data-fusion contest. IEEE Trans Geosci Remote Sens 2007, 45(10): 3012-3021.
  117. Jinghui Y, Jixian Z, Haitao L, Yushan S, Pengxian P: Pixel level fusion methods for remote sensing images: A current review. In ISPRS TC VII Symposium: 100 Years ISPRS. Edited by Wagner W, Székely B. 2010, 680-686.
  118. Alparone L, Aiazzi B, Baronti S, Garzelli A, Nencini F, Selva M: Multispectral and panchromatic data fusion assessment without reference. Photogramm Eng Remote Sens 2008, 74: 193-200.
  119. Alparone L, Aiazzi B, Baronti S, Garzelli A, Nencini F: A new method for MS + Pan image fusion assessment without reference. Proc IEEE Int Geosci Remote Sens Symp (IGARSS 2006) 2006, 3802-3805.


Acknowledgements

This work has been supported by the Consejería de Innovación, Ciencia y Empresa of the Junta de Andalucía under contract P07-TIC-02698 and the Spanish research programme Consolider Ingenio 2010: MIPRCV (CSD2007-00018).

Author information

Correspondence to Javier Mateos.

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite this article

Amro, I., Mateos, J., Vega, M. et al. A survey of classical methods and new trends in pansharpening of multispectral images. EURASIP J Adv Signal Process 2011, 79 (2011). https://doi.org/10.1186/1687-6180-2011-79