Multifocus image fusion using artificial neural networks

https://doi.org/10.1016/S0167-8655(02)00029-6

Abstract

Optical lenses, particularly those with long focal lengths, suffer from the problem of limited depth of field. Consequently, it is often difficult to obtain good focus for all objects in the picture. One possible solution is to take several pictures with different focus points and then combine them into a single image. This paper describes an application of artificial neural networks to this pixel-level multifocus image fusion problem, based on the use of image blocks. Experimental results show that the proposed method outperforms the discrete wavelet transform based approach, particularly when there is object movement or misregistration of the source images.

Introduction

Optical lenses, particularly those with long focal lengths, suffer from the problem of limited depth of field. Consequently, the image obtained will not be in focus everywhere, i.e., if one object in the scene is in focus, another one will be out of focus. A possible way to alleviate this problem is by image fusion (Zhang and Blum, 1999), in which several pictures with different focus points are combined to form a single image. This fused image will then hopefully contain all relevant objects in focus (Li et al., 1995; Seales and Dutta, 1996).

The simplest image fusion method just takes the pixel-by-pixel average of the source images. This, however, often leads to undesirable side effects such as reduced contrast. In recent years, various alternatives based on multiscale transforms have been proposed. The basic idea is to perform a multiresolution decomposition on each source image, then integrate all these decompositions to produce a composite representation. The fused image is finally reconstructed by performing an inverse multiresolution transform. Examples of this approach include the Laplacian pyramid (Burt and Adelson, 1983), the gradient pyramid (Burt and Kolczynski, 1993), the ratio-of-low-pass pyramid (Toet et al., 1989) and the morphological pyramid (Matsopoulos et al., 1994). More recently, the discrete wavelet transform (DWT) (Chipman et al., 1995; Koren et al., 1995; Li et al., 1995; Yocky, 1995; Zhang and Blum, 1999) has also been used. In general, the DWT is superior to the earlier pyramid-based methods (Li et al., 1995). First, the wavelet representation provides directional information while pyramids do not. Second, the wavelet basis functions can be chosen to be orthogonal, so that, unlike the pyramid-based methods, the DWT does not carry redundant information across different resolutions. When fusing the wavelet coefficients, the maximum selection rule is typically used, as large absolute wavelet coefficients often correspond to salient features in the images. Fig. 1 shows a schematic diagram of the image fusion process based on the DWT.
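As a concrete illustration of this scheme, the sketch below fuses two registered, equal-sized grayscale images with PyWavelets, averaging the coarse approximation band and applying the maximum selection rule to the detail bands. The wavelet ('db4'), the decomposition depth and the averaging of the approximation band are illustrative assumptions, not choices taken from this paper.

```python
# A minimal sketch of DWT-based fusion with the maximum selection rule.
# Assumes registered grayscale inputs of equal size.
import numpy as np
import pywt

def dwt_fuse(img_a: np.ndarray, img_b: np.ndarray,
             wavelet: str = "db4", levels: int = 3) -> np.ndarray:
    coeffs_a = pywt.wavedec2(img_a, wavelet, level=levels)
    coeffs_b = pywt.wavedec2(img_b, wavelet, level=levels)

    # Average the coarsest approximation band (a common, assumed choice).
    fused = [(coeffs_a[0] + coeffs_b[0]) / 2.0]
    for da, db in zip(coeffs_a[1:], coeffs_b[1:]):
        # For each detail band (horizontal, vertical, diagonal), keep the
        # coefficient with the larger magnitude -- the maximum selection rule.
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(da, db)))
    return pywt.waverec2(fused, wavelet)
```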

While these methods often perform satisfactorily, their multiresolution decompositions, and consequently the fusion results, are shift-variant because of an underlying down-sampling process. Their performance thus deteriorates quickly when there is a slight camera or object movement, or when the source images are misregistered. One possible remedy is to use the shift-invariant discrete wavelet frame transform (Unser, 1995). However, its implementation is more complicated and the algorithm is more demanding in terms of both memory and time.
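For reference, a shift-invariant variant can be sketched by swapping the decimated transform above for the stationary (undecimated) wavelet transform, keeping the fusion rule unchanged. This again uses PyWavelets as an assumed tool and is only a sketch, not the implementation evaluated in this paper; note that pywt.swt2 requires the image sides to be divisible by 2**level.

```python
# A minimal sketch of shift-invariant fusion via the stationary wavelet
# transform; image sides must be divisible by 2**level for pywt.swt2.
import numpy as np
import pywt

def swt_fuse(img_a, img_b, wavelet="db4", level=1):
    coeffs_a = pywt.swt2(img_a, wavelet, level=level)
    coeffs_b = pywt.swt2(img_b, wavelet, level=level)
    fused = []
    for (ca, da), (cb, db) in zip(coeffs_a, coeffs_b):
        approx = (ca + cb) / 2.0                      # average approximations
        details = tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                        for a, b in zip(da, db))      # max selection rule
        fused.append((approx, details))
    return pywt.iswt2(fused, wavelet)
```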

In this paper, we propose a pixel-level multifocus image fusion method based on the use of image blocks and artificial neural networks. The implementation is computationally simple and can be realized in real time. Experimental results show that it outperforms the DWT-based method. The rest of this paper is organized as follows. The proposed fusion scheme is described in Section 2. Experiments are presented in Section 3, and the last section gives some concluding remarks.

Neural network based multifocus image fusion

Fig. 2 shows a schematic diagram of the proposed multifocus image fusion method. Here, we consider the processing of just two source images, though the algorithm can be extended straightforwardly to handle more than two. Moreover, the source images are assumed to have been registered.
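To make the overall scheme concrete, the following is a minimal sketch of the block-based pipeline, with a user-supplied decision function standing in for the trained network; the block size and the variance-based rule in the demo are illustrative assumptions only, not the paper's trained PNN/RBFN classifier.

```python
# A schematic sketch of block-based fusion: for each pair of co-located
# blocks, keep the one judged to be in better focus.
import numpy as np

def block_fuse(img_a, img_b, block=32, decide=None):
    h, w = img_a.shape
    out = np.empty_like(img_a)
    for i in range(0, h, block):
        for j in range(0, w, block):
            a = img_a[i:i + block, j:j + block]
            b = img_b[i:i + block, j:j + block]
            # decide(a, b) returns True when block `a` is the clearer one;
            # in the paper this decision is made by a neural network fed
            # with clarity features of the two blocks.
            out[i:i + block, j:j + block] = a if decide(a, b) else b
    return out

# Demo with random arrays and a crude variance-based clarity proxy
# (illustrative stand-in only).
rng = np.random.default_rng(0)
img_a, img_b = rng.random((256, 256)), rng.random((256, 256))
fused = block_fuse(img_a, img_b, decide=lambda a, b: a.var() >= b.var())
```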

The basic fusion algorithm will be described in Section 2.1. The input features to the neural networks will be discussed in Section 2.2. Section 2.3 contains a brief introduction to the two neural network models used, namely the probabilistic neural network (PNN) and the radial basis function network (RBFN).

Demonstration of the effectiveness of the features

In this section, we first experimentally demonstrate the effectiveness of the three features proposed in Section 2.2 (namely SF, VI and EG) in representing the clarity level of an image. An image block of size 64×64 (Fig. 4(a)) is extracted from the “Lena” image. Fig. 4(b)–(e) show versions degraded by Gaussian blurring with radius 0.5, 0.8, 1.0 and 1.5, respectively. As can be seen from Table 1, as the image becomes more blurred, all three feature values diminish accordingly.
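Of the three features, the spatial frequency has a widely standardized form; a sketch following the definition of Eskicioglu and Fisher (1995) is given below, together with a toy check that blurring lowers the value, mirroring the trend in Table 1. The VI and EG features of Section 2.2 are not reproduced here, and the reuse of the radii as Gaussian sigmas is an assumption for illustration.

```python
# A sketch of the spatial frequency (SF) clarity measure, following the
# standard definition of Eskicioglu and Fisher (1995): SF^2 = RF^2 + CF^2,
# where RF and CF are the RMS row- and column-wise first differences.
import numpy as np
from scipy.ndimage import gaussian_filter

def spatial_frequency(block):
    f = np.asarray(block, dtype=np.float64)
    m, n = f.shape
    rf2 = np.sum(np.diff(f, axis=1) ** 2) / (m * n)  # squared row frequency
    cf2 = np.sum(np.diff(f, axis=0) ** 2) / (m * n)  # squared column frequency
    return float(np.sqrt(rf2 + cf2))

# Toy check: heavier Gaussian blurring should yield lower SF (cf. Table 1).
rng = np.random.default_rng(0)
img = rng.random((64, 64))
for sigma in (0.5, 0.8, 1.0, 1.5):
    print(sigma, spatial_frequency(gaussian_filter(img, sigma=sigma)))
```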

Conclusion

In this paper, we combine the idea of image blocks and artificial neural networks for pixel-level multifocus image fusion. Features indicating the clarity of an image block are extracted and fed into the neural network, which then learns to determine which source image is clearer at that particular physical location. Two neural network models, namely the PNN and the RBFN, have been used. Experimental results show that this method outperforms the DWT-based approach, particularly when there is object movement or misregistration of the source images.

References (19)

  • H. Li et al., Multisensor image fusion using the wavelet transform, Graphical Models Image Processing (1995)
  • R.E. Bellman, Adaptive Control Processes (1961)
  • C.M. Bishop, Neural Networks for Pattern Recognition (1995)
  • P.J. Burt et al., The Laplacian pyramid as a compact image code, IEEE Trans. Comm. (1983)
  • P.J. Burt et al., Enhanced image capture through fusion
  • J. Canny, A computational approach to edge detection, IEEE Trans. Pattern Anal. Machine Intell. (1986)
  • L.J. Chipman et al., Wavelets and image fusion
  • A.M. Eskicioglu et al., Image quality measures and their performance, IEEE Trans. Comm. (1995)
  • J. Hertz et al., Introduction to the Theory of Neural Computation (1991)