Separation of statistically dependent sources using an L2-distance non-Gaussianity measure

doi:10.1016/j.sigpro.2006.02.032

Signal Processing

Volume 86, Issue 11, November 2006, Pages 3404-3420

https://doi.org/10.1016/j.sigpro.2006.02.032

Abstract

We provide a solution to the blind source separation (BSS) problem for the special case of statistically dependent sources. We propose the MaxNG algorithm, based on the maximization of a non-Gaussianity (NG) measure, which is equivalent to minimizing the Shannon entropy of the source estimates. We compare our algorithm against a commonly used strategy based on the minimization of mutual information (MinMI). It is shown that, for uncorrelated sources, both strategies arrive at similar solutions, but when the sources are dependent (correlated), better results are obtained using MaxNG. To measure NG, we use a non-parametric density estimation technique, namely Parzen windows, together with the $L^{2}$-Euclidean distance in the space of density functions. A wide set of simulations based on real-world data with complex dependence structures is presented, showing that our MaxNG algorithm successfully separates the sources even when the original sources are strongly dependent, a case in which traditional MinMI algorithms, such as ICA, usually fail. Experimental results are provided to evaluate the performance of our algorithm for two and six sources, together with comparisons of MaxNG with some popular BSS algorithms. The main conclusion of the present work is that our NG measure provides a useful tool for separating dependent signals, since the original sources usually represent local maxima of this measure.

Introduction

The classical problem of obtaining the best estimates of $M$ input signals from their $M$ observed linear mixtures is commonly referred to as blind source separation (BSS) in the literature. This problem has been studied comprehensively in recent years, especially for the case of independent sources, which leads to the so-called independent component analysis (ICA) [1]. A precise mathematical framework for ICA was stated by Comon in [2], and many algorithms have been developed using the concept of contrast functions (objective functions to be minimized), mainly based on approximations of mutual information (MI) [2], [3], [4], [5].

On the other hand, very few works exist for the case of dependent sources, for which a mathematical foundation is still lacking. Yet a large variety of applications require considering sources that usually exhibit slight or strong dependence.

Cardoso [6] has shown that a strong relationship exists among MI, correlation and non-Gaussianity (NG) of source estimates. He showed that, if the source estimates are not restricted to be uncorrelated, the minimization of MI is not equivalent to the maximization of NG (his formula is reproduced in Appendix A). The present work is inspired by this theoretical result as well as by earlier results obtained with minimum entropy methods, which are mentioned in Section 2.

In this paper, we focus on the separation of dependent sources and propose an algorithm based on a local maximization of an NG measure defined through the $L^{2}$-Euclidean distance. Using a non-parametric technique with a Gaussian kernel for the estimation of densities, we build an objective function to be maximized locally in order to separate the signals.

This paper is organized as follows: in Section 1, a brief review of the classical BSS model and previous work is presented; in Section 2, the maximum NG (minimum entropy) method is outlined and justified; in Section 3, a new and practical way of computing an NG measure using the Parzen windows technique is presented; in Section 4, the MaxNG algorithm for separating $M$ dependent sources is formulated, and an illustrative example with two sources is presented to show how the algorithm works. In Section 5, extensive simulation results are presented using real-world signals with complex dependence structures and distributions: the MaxNG strategy is compared with MinMI, the separation efficiency of MaxNG is evaluated as the data sample size $N$ is increased, comparisons with other algorithms are provided, and the performance of our algorithm for six sources is evaluated. In Section 6, our main conclusions are given.

Section snippets

Classic BSS problem and previous results

The mathematical framework of BSS is based on assuming the existence of $M$ input signals $s_{0}, s_{1}, \dots, s_{M-1}$ with zero mean ($E(s_{i}) = 0$) and unit variance ($E(s_{i}^{2}) = 1$). A set of $M$ linear mixtures (outputs) $x_{0}, x_{1}, \dots, x_{M-1}$ can be written as $x_{i}(t) = \sum_{j=0}^{M-1} a_{ij} s_{j}(t)$, whose matrix representation is $x(t) = A s(t)$, where $s(t) = [s_{0} \; s_{1} \; \dots \; s_{M-1}]^{T}$ and $x(t) = [x_{0} \; x_{1} \; \dots \; x_{M-1}]^{T}$ are $M \times 1$ column vectors and $A$ is an $M \times M$ invertible matrix which describes the mixing of signals and is called the mixing matrix. When the only available
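The mixing model above can be made concrete with a small numerical sketch. Everything specific in it is an assumption chosen for illustration and does not come from the paper: the uniform test sources, the sizes `M` and `N`, the entries of `A`, and the variable names.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 2, 512  # number of sources and samples (illustrative values)

# Hypothetical sources: uniform noise, standardized to zero mean and unit variance
# so that E(s_i) = 0 and E(s_i^2) = 1 hold on the sample.
s = rng.uniform(-1.0, 1.0, size=(M, N))
s = (s - s.mean(axis=1, keepdims=True)) / s.std(axis=1, keepdims=True)

# An arbitrary invertible mixing matrix A (assumed for illustration).
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])

x = A @ s                     # x(t) = A s(t); each column is one time index t
s_rec = np.linalg.inv(A) @ x  # with A known, the sources are recovered exactly
```

In the blind setting `A` is unknown, so the inverse in the last line is not available and has to be estimated from `x` alone.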

Relaxing independence of sources: the maximization of NG approach

The Gaussian distribution has the maximum Shannon differential entropy (maximum uncertainty) over all the continuous distributions defined on the real line with the same variance [13]. This fact makes the Gaussianity measure a very useful tool for the characterization of data. The more Gaussian the data are, the less structured they are and the less information they are able to reveal. The idea of measuring the relevance of a projection through a NG measure of the projected data dates back to
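For reference, the maximum-entropy property invoked above can be written explicitly. The notation ($h(y)$ for the differential entropy, $\sigma^{2}$ for the variance) is assumed here rather than taken from the paper. For any continuous random variable $y$ with variance $\sigma^{2}$,

$$h(y) = - \int p_{y}(y) \ln p_{y}(y) \, dy \; \le \; \tfrac{1}{2} \ln (2 \pi e \sigma^{2}),$$

with equality if and only if $y$ is Gaussian. At fixed variance, the gap between the two sides is non-negative and vanishes only in the Gaussian case, so a projection with lower entropy is, in this sense, farther from Gaussian.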

A measure of NG based on the $L^{2}$-Euclidean distance

Let us now introduce a natural measure of NG based on the $L^{2}$-Euclidean distance of an estimated pdf to the normal (Gaussian) pdf. Considering a continuous random variable $y$ with zero mean and unit variance, we define our NG measure of a pdf $p_{y}$, denoted by $\Gamma(p_{y})$, as follows: $\Gamma(p_{y}) = \int [\Phi(y) - p_{y}(y)]^{2} \, dy,$ where the integral is defined in the Lebesgue sense and is taken over the whole range of the variable $y$, and $\Phi(y)$ is the Gaussian pdf: $\Phi(y) = N(0, 1) = \frac{1}{\sqrt{2 \pi}} \exp(-\frac{1}{2} y^{2}).$ Clearly, Eq. (2) is the square of the distance between
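One way to evaluate this measure from samples is sketched below. It assumes a Parzen estimate with a Gaussian kernel, in which case the integral expands into three terms that each have a closed form, because the product of two Gaussians integrates to a Gaussian. The function names `gauss` and `ng_measure`, the default bandwidth `h` (a common rule of thumb), and the test signals are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gauss(u, var):
    """Zero-mean Gaussian pdf with variance `var`, evaluated at u."""
    return np.exp(-0.5 * u**2 / var) / np.sqrt(2.0 * np.pi * var)

def ng_measure(y, h=None):
    """L2 distance between N(0,1) and a Gaussian-kernel Parzen estimate of y.

    Uses int (Phi - p)^2 = int Phi^2 - 2 int Phi p + int p^2, where
    p(y) = (1/N) sum_i gauss(y - y_i, h^2).
    """
    y = np.asarray(y, dtype=float)
    y = (y - y.mean()) / y.std()          # enforce zero mean, unit variance
    n = y.size
    if h is None:
        h = 1.06 * n ** (-0.2)            # rule-of-thumb bandwidth (assumption)
    phi_sq = 1.0 / (2.0 * np.sqrt(np.pi))            # int Phi^2
    cross = gauss(y, 1.0 + h**2).mean()              # int Phi * p
    diffs = y[:, None] - y[None, :]
    p_sq = gauss(diffs, 2.0 * h**2).mean()           # int p^2
    return phi_sq - 2.0 * cross + p_sq

rng = np.random.default_rng(0)
ng_gaussian = ng_measure(rng.standard_normal(512))
ng_uniform = ng_measure(rng.uniform(-1.0, 1.0, 512))
print(ng_gaussian, ng_uniform)  # the uniform sample should score clearly higher
```

The pairwise term costs $O(N^{2})$ operations, which is acceptable for signals of a few hundred samples but would need a faster scheme for long signals.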

Separation of dependent sources

In this section we present the details of our MaxNG algorithm, which is based on the maximum NG method defined in Section 2, and include an illustrative example for two dependent sources.
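The snippet above does not give the algorithm itself, so the following is only a rough two-source illustration of the local-maximization idea, not the method as specified in the paper. It assumes the mixtures are whitened first, so that every unit-norm projection has unit variance, and then scans the projection angle for the largest value of an $L^{2}$ Parzen-based NG measure. The helper names, the bandwidth rule, the mixing matrix and the test sources are all hypothetical. The demo uses independent uniform sources purely because the expected answer is easy to check; with dependent sources one would inspect every local maximum of the profile rather than only the global one.

```python
import numpy as np

def gauss(u, var):
    """Zero-mean Gaussian pdf with variance `var`, evaluated at u."""
    return np.exp(-0.5 * u**2 / var) / np.sqrt(2.0 * np.pi * var)

def ng_measure(y, h):
    """L2 distance between N(0,1) and a Gaussian-kernel Parzen estimate of y."""
    y = (y - y.mean()) / y.std()
    cross = gauss(y, 1.0 + h**2).mean()
    self_term = gauss(y[:, None] - y[None, :], 2.0 * h**2).mean()
    return 1.0 / (2.0 * np.sqrt(np.pi)) - 2.0 * cross + self_term

def whiten(x):
    """Decorrelate the mixtures (rows of x) and scale them to unit variance."""
    x = x - x.mean(axis=1, keepdims=True)
    eigval, eigvec = np.linalg.eigh(np.cov(x))
    return (eigvec / np.sqrt(eigval)).T @ x

def ng_profile(z, thetas, h):
    """NG of the projection cos(theta)*z0 + sin(theta)*z1 for each angle."""
    return np.array([ng_measure(np.cos(t) * z[0] + np.sin(t) * z[1], h)
                     for t in thetas])

rng = np.random.default_rng(1)
N = 400
s = rng.uniform(-1.0, 1.0, size=(2, N))   # independent test sources (assumption)
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                # arbitrary mixing matrix (assumption)
z = whiten(A @ s)

h = 1.06 * N ** (-0.2)                    # rule-of-thumb bandwidth (assumption)
thetas = np.deg2rad(np.arange(0.0, 180.0, 2.0))
profile = ng_profile(z, thetas, h)

# Projection with the largest NG, compared against the true sources.
best = thetas[np.argmax(profile)]
y_best = np.cos(best) * z[0] + np.sin(best) * z[1]
match = max(abs(np.corrcoef(y_best, s[i])[0, 1]) for i in range(2))
print(np.rad2deg(best), match)
```

A grid scan is used instead of a gradient method only to keep the sketch short; it also makes the shape of the NG profile easy to plot and inspect.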

Minimum MI versus maximum NG strategies comparison

In this section we provide a comparison between the MinMI and MaxNG criteria for the separation of “real world” dependent sources. We performed a total of $300$ simulations for different sources and different levels of dependence. Original sources of length $N = 512$ were extracted from pixel columns of various satellite images. By selecting different column offsets between signals, we can control the level of dependence. Using known mixing matrices $A$, mixtures were generated using Eq. (1) and

Conclusions and discussion

The experimental results presented in this paper show that maximizing NG, which is equivalent to searching minimum entropy solutions, represents a powerful tool for separating sources from their linear mixtures, even when the classical independence constraint is relaxed.

In particular, we have shown that, when the original sources are dependent, MinMI algorithms, like most traditional ICA algorithms, may fail to separate the signals, and MaxNG is a better strategy. We have provided a new and practical way

Acknowledgments

C. Caiafa acknowledges financial support from Facultad de Ingenieria, Universidad de Buenos Aires, Argentina (Beca Peruilh). A. N. Proto thanks the Comision Nacional de Actividades Espaciales and the International Centre for Theoretical Physics (Trieste, Italy) for their hospitality. We especially thank Dr. Ercan Kuruoglu, who proofread this work and provided valuable comments on it. This research is supported by Grant BID-ANPCyT PICT 02-13533.

References (32)

P. Comon, Independent component analysis, a new concept?, Signal Processing (1994)

A. Hyvärinen et al., Independent component analysis: algorithms and applications, Neural Networks (2000)

R.A. Wiggins, Minimum entropy deconvolution, Geoexploration (1978)

T.C. Christofides et al., Central limit theorem for dependent multidimensionally indexed random variables, Statist. Probab. Lett. (2003)

J.P. Romano et al., A more general central limit theorem for m-dependent random variables with unbounded m, Statist. Probab. Lett. (2000)

M. Dudzinski, A note on the almost sure central limit theorem for some dependent random variables, Statist. Probab. Lett. (2003)

A. Hyvärinen et al., Independent Component Analysis (2001)

A.J. Bell et al., An information-maximization approach to blind separation and blind deconvolution, Neural Computation (1995)

A. Hyvärinen, Fast and robust fixed-point algorithms for independent component analysis, IEEE Trans. Neural Networks (1999)

R. Boscolo et al., Independent component analysis based on nonparametric density estimation, IEEE Trans. Neural Networks (2004)

J.F. Cardoso, Dependence, correlation and Gaussianity in independent component analysis, J. Machine Learning Res. (2003)

F.R. Bach et al., Kernel independent component analysis, J. Machine Learning Res. (2002)

A. Cichocki, S. Amari, K. Siwek, T. Tanaka, et al., ICALAB toolboxes,...

A. Cichocki et al., Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications (2002)

A. Hyvärinen, Beyond independent components

A. Hyvärinen et al., Topographic independent component analysis, Neural Computation (2001)

