
Signal Processing

Volume 92, Issue 12, December 2012, Pages 2837-2847

Denoising by second order statistics

https://doi.org/10.1016/j.sigpro.2012.04.015

Abstract

A standard approach for deducing a variational denoising method is the maximum a posteriori strategy. Here, the denoising result is chosen in such a way that it maximizes the conditional distribution function of the reconstruction given its observed noisy version. Unfortunately, this approach does not imply that the empirical distribution of the reconstructed noise components follows the statistics of the assumed noise model. In this paper, we show for additive noise models how to overcome this drawback by applying an additional transformation to the random vector modeling the noise. This transformation is then incorporated into the standard denoising approach and leads to a more sophisticated data fidelity term, which forces the removed noise components to have the desired statistical properties. The good properties of our new approach are demonstrated by numerical examples for additive Gaussian noise. Our method proves to be especially well suited for data containing high-frequency structures, where other denoising methods that assume a certain smoothness of the signal fail to restore the small structures.

Highlights

► We present a new approach for deducing data fidelity terms for variational denoising methods.
► Our approach extends the classical MAP approach by an additional variable transformation.
► In this way, the removed noise is forced to follow the statistics of the assumed noise model.
► Examples for additive Gaussian noise show the good properties of our new approach.
► It is demonstrated to be particularly well suited for data containing high-frequency structures.

Introduction

Measured signals and images are often corrupted by noise, which makes their denoising and reconstruction a central aim in digital signal and image processing. Especially for data of low quality, reliable and robust reconstruction methods are required. In the last decades, many methods have been proposed for denoising corrupted data. A commonly applied approach is to solve a variational problem, where one has to minimize a functional consisting of a data fidelity term and a regularization term. The functional is usually deduced by a maximum a posteriori strategy, which requires some knowledge about the noise statistics and the distribution of the original data. In the literature, e.g., when considering detector noise or in the case of high photon counts, where the Poisson distribution can be well approximated by a Gaussian one, it is often assumed that the corrupted data follows an additive noise model. This means that our given noisy data $g \in \mathbb{R}^N$ is modeled as
$$g = f_0 + \varepsilon_0,$$
where $f_0 \in \mathbb{R}^N$ is the unknown noise-free data and the noise vector $\varepsilon_0 \in \mathbb{R}^N$ is a realization of a random vector $E: \Omega \to \mathbb{R}^N$ defined with respect to a continuous probability space $(\Omega, \mathcal{F}, P)$. As usual, $\Omega$ represents here the sample space, $\mathcal{F}$ denotes the $\sigma$-algebra and $P: \mathcal{F} \to [0,1]$ represents the probability measure. The vectors $g$ and $f_0$ are assumed to be realizations of independent $N$-dimensional random vectors $G: \Omega \to \mathbb{R}^N$ and $F: \Omega \to \mathbb{R}^N$, respectively, so that $G = F + E$.
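
To make the additive model concrete, the following Python sketch generates synthetic noisy data according to $g = f_0 + \varepsilon_0$; the test signal, noise level, and random seed are illustrative placeholders, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 512                                        # signal length
t = np.linspace(0.0, 1.0, N)
f0 = np.sign(np.sin(2.0 * np.pi * 5.0 * t))    # placeholder piecewise-constant signal
sigma = 0.3                                    # assumed noise standard deviation

eps0 = rng.normal(0.0, sigma, size=N)          # realization of E with E_i ~ N(0, sigma^2)
g = f0 + eps0                                  # observed noisy data
```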

To deduce an estimate $\hat{f}_{\mathrm{MAP}}$ of $f_0$ by a maximum a posteriori (MAP) strategy, one usually sets
$$\hat{f}_{\mathrm{MAP}} \in \operatorname*{argmin}_{f \in \mathbb{R}^N} \{-\log p_{F|G}(f|g)\}, \tag{1}$$
cf., e.g., [1], [2], [3], where $p_{F|G}(f|g)$ is the conditional probability density function for observing $f$ given $G = g$. By Bayes' theorem we know that
$$p_{F|G}(f|g) = \frac{p_{G|F}(g|f)\,p_F(f)}{p_G(g)}, \tag{2}$$
where $p_{G|F}$ is the so-called likelihood, which is usually closely related to the density of the noise, $p_F$ is some a priori density of $F$ and $p_G$ is the density of $G$. Since we consider additive noise, it holds that $p_{G|F}(g|f) = p_E(g-f) = p_E(\varepsilon)$, where $\varepsilon \coloneqq g-f$ and $p_E$ denotes the density of $E$. Moreover, inserting (2) in (1) yields
$$\hat{f}_{\mathrm{MAP}} = \operatorname*{argmin}_{f \in \mathbb{R}^N} \{-\log p_E(g-f) - \log p_F(f)\}. \tag{3}$$
Here, the terms $-\log p_E(g-f)$ and $-\log p_F(f)$ imply that we search for the most likely vectors $\hat{\varepsilon}_{\mathrm{MAP}} = g - \hat{f}_{\mathrm{MAP}}$ and $\hat{f}_{\mathrm{MAP}}$ under the condition that $g = \hat{f}_{\mathrm{MAP}} + \hat{\varepsilon}_{\mathrm{MAP}}$. If the components $E_i$ of the random vector $E$ are pairwise independent and identically distributed (i.i.d.), as is often assumed, then
$$-\log p_E(g-f) = -\log \prod_{i=1}^N p_{E_i}(g_i - f_i) = -\sum_{i=1}^N \log p_{E_i}(g_i - f_i). \tag{4}$$
For the special case that $E_i \sim \mathcal{N}(0, \sigma^2)$, $i = 1, \dots, N$, this leads to
$$-\log p_E(g-f) = -\sum_{i=1}^N \log\left(\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(g_i - f_i)^2}{2\sigma^2}\right)\right) = -N \log\frac{1}{\sqrt{2\pi}\,\sigma} + \frac{1}{2\sigma^2}\,\|g-f\|_2^2. \tag{5}$$
To determine $-\log p_F(f)$, at least some estimate of the a priori density $p_F$ is required. Assuming that $p_F(f) = \exp(-c\,J(f))$ for some constant $c > 0$ and a nonnegative function $J: \mathbb{R}^N \to \mathbb{R}$, the minimization problem (3) with (5) is finally equivalent to
$$\hat{f}_{\mathrm{MAP}} = \operatorname*{argmin}_{f \in \mathbb{R}^N} \left\{\frac{1}{2}\,\|g-f\|_2^2 + \lambda\,J(f)\right\} \quad\text{with}\quad \lambda \coloneqq c\,\sigma^2 > 0, \tag{6}$$
where the amount of filtering is controlled by the parameter $\lambda$, which steers the influence of the two terms within the functional. If $J$ is chosen as $J(f) \coloneqq \|Df\|_2^2$, where $D$ is a discrete first derivative operator, we obtain by this approach the regularization method proposed by Tikhonov and Miller (TM) in [4], which we will shortly call MAP-TM. By this choice of $J$ the initial signal is assumed to have small first derivatives, i.e., to be of a certain degree of smoothness (in $H^1$ for the continuous setting). Unfortunately, if the signal contains jumps, the TM regularization will oversmooth them. To overcome this drawback, $J$ is often set to $J(f) \coloneqq \|Df\|_1$, which is the discrete one-dimensional version of the total variation regularizer (TV). The corresponding denoising method (6) leads to the classical approach of Rudin et al. [5], which is well known for its discontinuity-preserving properties. In the following, we will refer to this method as MAP-TV and use it as well as the MAP-TM approach as reference methods for our numerical experiments.
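
For the quadratic choice $J(f) = \|Df\|_2^2$, problem (6) is smooth and its minimizer solves the linear normal equations $(I + 2\lambda D^{\top}D)f = g$. The following is a minimal dense-matrix sketch of MAP-TM denoising under this assumption; a practical implementation would use sparse matrices, and the value of `lam` is an arbitrary placeholder:

```python
import numpy as np

def denoise_map_tm(g, lam):
    """MAP-TM denoising: minimize 0.5*||g - f||^2 + lam*||D f||^2
    by solving the normal equations (I + 2*lam*D^T D) f = g."""
    N = len(g)
    D = np.diff(np.eye(N), axis=0)          # (N-1) x N forward-difference operator
    A = np.eye(N) + 2.0 * lam * (D.T @ D)
    return np.linalg.solve(A, g)

f_tm = denoise_map_tm(g, lam=1.0)           # reuses the noisy data g from above
```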

Now, if we forget about the regularization term for a moment and have again a closer look at our data fidelity term $-\log p_E(g-f)$ in (4), where $E$ is assumed to be i.i.d., we see that this data fidelity term is minimal whenever all components $\varepsilon_i = g_i - f_i$ maximize $p_{E_i}(\varepsilon_i)$. Consequently, without the regularization term, or equivalently for $\lambda = 0$, our reconstructed noise vector $\hat{\varepsilon}$ would be a constant vector of value $\operatorname{argmax}_e p_{E_i}(e)$ and thus $\hat{f} = g - \operatorname{argmax}_e p_{E_i}(e)$. These estimates may seem reasonable for a signal length $N$ close to one. However, since the vector $E$ is i.i.d., we may expect for larger $N$ that the empirical distribution of the components of our estimated noise vector $\hat{\varepsilon}$ resembles the distribution of the $E_i$, $i = 1, \dots, N$. In principle, to check how well a set of samples matches a given distribution, one could for example apply the Kolmogorov–Smirnov [6] or the Anderson–Darling test [7].
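
As a sketch of such a check, the removed noise $\hat{\varepsilon} = g - \hat{f}$ of any denoiser can be tested against the assumed model $\mathcal{N}(0, \sigma^2)$ with the Kolmogorov–Smirnov test from SciPy; the MAP-TM estimate from the previous snippet serves as the example input:

```python
from scipy import stats

eps_hat = g - f_tm    # reconstructed noise vector of the MAP-TM denoiser

# Two-sided KS test of eps_hat against N(0, sigma^2); a small p-value
# indicates that the removed "noise" does not follow the assumed model.
statistic, p_value = stats.kstest(eps_hat, "norm", args=(0.0, sigma))
print(f"KS statistic = {statistic:.3f}, p-value = {p_value:.3f}")
```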

Outline. In the following, we show that it is possible to modify the standard MAP approach so that the reconstructed noise vector is forced to resemble the statistical properties of the assumed noise model. For this purpose, a suitable variable transformation is applied to the random vector $E$ before computing the MAP estimates. In Section 2 our new approach is presented and two different transformations are investigated with respect to their benefits and shortcomings. These transformations incorporate estimates of higher moments of $E$ into the resulting minimization problems to force the reconstructed noise vector $\hat{\varepsilon}$ to have the desired statistics. In Section 3 we discuss a first implementation of our approach for one-dimensional data and present numerical results. Finally, we summarize our new findings and finish with concluding remarks in Section 4.

Related work. The idea of using higher-order statistics for restoring corrupted data can, for example, be found in blind source separation techniques, cf. [8], [9], [10]. Moreover, it has been used for wavelet-based denoising methods as, e.g., presented in [11], [12], [13], and in approaches combining the empirical mode decomposition with higher-order statistical estimates, see, e.g., [14]. In contrast to the works of Hofinger [15], [16], we introduce here a representation of the noise distribution that depends both on moments and especially on the correlation of the random variables $E_i$. We also embed noise correlation, cf. [17], in a concise formalism that makes it possible to obtain decorrelated estimates $\varepsilon_i$ of the original noise components if the $E_i$ are independent, a result that, according to empirical studies, non-local means [18], [19] achieves only to some extent and in an ad hoc manner.


A new denoising approach

For simplicity, we assume in the following that the random variables $E_i$ are again i.i.d. with expectation value $\mathbb{E}(E_i) = \mu$ and variance $\mathrm{Var}(E_i) = \sigma^2$. Hence, the components of the vector $\varepsilon$ can be considered as samples of the same random variable. Computing the MAP estimator $\hat{f}_{\mathrm{MAP}}$ and the corresponding noise vector $\hat{\varepsilon}_{\mathrm{MAP}} = g - \hat{f}_{\mathrm{MAP}}$ from Eq. (3) is equivalent to solving the minimization problem
$$\operatorname*{argmin}_{f \in \mathbb{R}^N,\, \varepsilon \in \mathbb{R}^N} \{-\log p_E(\varepsilon) - \log p_F(f)\} \quad\text{subject to}\quad g = f + \varepsilon$$
for the given noisy data $g \in \mathbb{R}^N$. Since the term $-\log p_E(\varepsilon)$ does not …
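
The equivalence stated above follows by eliminating the constraint: substituting $\varepsilon = g - f$ turns the constrained problem into the unconstrained MAP problem (3),
$$\operatorname*{argmin}_{\substack{f,\,\varepsilon \in \mathbb{R}^N \\ g = f + \varepsilon}} \{-\log p_E(\varepsilon) - \log p_F(f)\} \;=\; \operatorname*{argmin}_{f \in \mathbb{R}^N} \{-\log p_E(g-f) - \log p_F(f)\}.$$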

Minimization problem

To demonstrate the capability of our new denoising approach introduced in (8), (14), we proceed with numerical examples. In the following, we want to denoise signals corrupted by additive white Gaussian noise by minimizing the second order statistics functional
$$J(\varepsilon) = J^{(1)}_{\mathrm{mean}}(\varepsilon) + J^{(2)}_{\mathrm{var}}(\varepsilon) + 2 \sum_{k=1}^{K-1} J^{(2)}_{\mathrm{cov},k}(\varepsilon) + \lambda\,\|g - \varepsilon\|_2^2, \quad \lambda > 0,$$
with respect to $\varepsilon$, so that $\hat{f} = g - \hat{\varepsilon}$ is our reconstruction of the original signal $f_0$. Here, the prior term $-\log p_F(f)$ from (8) is set to $\lambda\,\|g - \varepsilon\|_2^2$ and guarantees that the reconstructed …
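
The concrete definitions of $J^{(1)}_{\mathrm{mean}}$, $J^{(2)}_{\mathrm{var}}$ and $J^{(2)}_{\mathrm{cov},k}$ appear in the full text and are not reproduced in this snippet; the sketch below therefore substitutes plausible quadratic penalties that drive the sample mean of $\varepsilon$ toward 0, its sample variance toward $\sigma^2$, and its lag-$k$ autocovariances toward 0, and minimizes the functional with a generic quasi-Newton method. All names and parameter values here are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def make_second_order_functional(g, sigma, K, lam):
    """Assumed form of the second order statistics functional: quadratic
    penalties on the sample mean, the sample variance, and the lag-k
    autocovariances of eps, plus the data term lam*||g - eps||^2."""
    def J(eps):
        m = eps.mean()
        j_mean = m ** 2                                      # sample mean -> 0
        j_var = (((eps - m) ** 2).mean() - sigma ** 2) ** 2  # sample variance -> sigma^2
        j_cov = sum((((eps[:-k] - m) * (eps[k:] - m)).mean()) ** 2
                    for k in range(1, K))                    # lag-k autocovariances -> 0
        return j_mean + j_var + 2.0 * j_cov + lam * np.sum((g - eps) ** 2)
    return J

J = make_second_order_functional(g, sigma, K=5, lam=0.05)
res = minimize(J, x0=np.zeros(len(g)), method="L-BFGS-B")   # local minimizer
f_hat = g - res.x                                           # reconstruction f_hat = g - eps_hat
```

Since the functional is nonconvex, the quality of the returned local minimizer depends on the initialization; the minimization scheme actually used in Section 3 of the paper may differ from this generic solver.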

Conclusions

We have shown that the standard maximum likelihood estimation approach for denoising signals can be generalized by introducing an additional transformation of the random variables modeling the noise. This transformation makes it possible to also take pixel correlations within the noise vectors into account and helps to obtain a reconstructed noise vector which resembles the statistical properties of the assumed noise model. The transformation of our choice leads to a nonconvex minimization problem. A local …

References (30)

  • D. Boes et al.

    Probability and Statistical Inference

    (1974)
  • W. Press et al.

    Numerical Recipes in FORTRAN: The Art of Scientific Computing

    (1992)
  • A. Belouchrani et al.

    A blind source separation technique using second-order statistics

    IEEE Transactions on Signal Processing

    (1997)
  • Y. Li, Y. Yang, Wavelet thresholding method using higher-order statistics for seismic signal de-noising, in: Second...
  • G. Tsolis et al.

    Signal denoising using empirical mode decomposition and higher order statistics

    International Journal of Signal Processing, Image Processing and Pattern Recognition

    (2011)