About this book

This book constitutes the proceedings of the 10th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2012, held in Tel Aviv, Israel, in March 2012. The 20 revised full papers, presented together with 42 revised poster papers, 1 keynote lecture, and 2 overview papers for the regular and special sessions, were carefully reviewed and selected from numerous submissions. The topics addressed range from theoretical issues such as causality analysis and measures, through novel methods for employing the well-established concepts of sparsity and non-negativity for matrix and tensor factorization, to a variety of related applications ranging from audio and biomedical signals to precipitation analysis.



General LVA/ICA Theory, Methods and Extensions

Block Component Analysis, a New Concept for Blind Source Separation

The fact that the decomposition of a matrix into a minimal number of rank-1 terms is not unique leads to a basic indeterminacy in factor analysis. Factors and loadings are only unique under certain assumptions. Working in a multilinear framework has the advantage that the decomposition of a higher-order tensor into a minimal number of rank-1 terms (its Canonical Polyadic Decomposition (CPD)) is unique under mild conditions. We have recently introduced Block Term Decompositions (BTD) of a higher-order tensor. BTDs write a given tensor as a sum of terms that have low multilinear rank, without having to be rank-1. In this paper we explain how BTDs can be used for factor analysis and blind source separation. We discuss links with Canonical Polyadic Analysis (CPA) and Independent Component Analysis (ICA). Different variants of the approach are illustrated with examples.

Lieven De Lathauwer
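
The matrix indeterminacy mentioned in the abstract above can be checked numerically. The following is a minimal sketch and is not taken from the paper: the matrices A, B and Q are arbitrary illustrative choices. It only shows that two different sets of rank-1 terms can sum to the same matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# A rank-2 matrix written as a sum of two rank-1 terms: M = A @ B.T
A = rng.standard_normal((4, 2))
B = rng.standard_normal((5, 2))
M = A @ B.T

# Any invertible 2x2 matrix Q (arbitrary choice here) gives different factors
# with exactly the same product, because Q @ inv(Q) cancels.
Q = np.array([[2.0, 1.0],
              [0.5, 3.0]])
A2 = A @ Q
B2 = B @ np.linalg.inv(Q).T

same_product = np.allclose(M, A2 @ B2.T)
same_factors = np.allclose(A, A2)
print(same_product, same_factors)
```

The product is unchanged while the factors differ, which is why additional assumptions are needed before factors and loadings of a matrix can be interpreted.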

Partially Linear Estimation with Application to Image Deblurring Using Blurred/Noisy Image Pairs

We address the problem of estimating a random vector X from two sets of measurements, Y and Z, such that the estimator is linear in Y. We show that the partially linear minimum mean squared error (PLMMSE) estimator requires knowing only the second-order moments of X and Y, making it of potential interest in various applications. We demonstrate the utility of PLMMSE estimation in recovering a signal, which is sparse in a unitary dictionary, from noisy observations of it and of a filtered version of it. We apply the method to the problem of image enhancement from blurred/noisy image pairs. In this setting the PLMMSE estimator outperforms state-of-the-art algorithms for denoising alone or deblurring alone. Its performance is slightly worse than that of joint denoising/deblurring methods, but it runs an order of magnitude faster.

Tomer Michaeli, Daniel Sigalov, Yonina C. Eldar

Causal Discovery for Linear Non-Gaussian Acyclic Models in the Presence of Latent Gaussian Confounders

LiNGAM has been successfully applied to causal inference in some real-world problems. Nevertheless, basic LiNGAM assumes that there is no latent confounder of the observed variables, which may not hold, as confounding effects are quite common in the real world. Causal discovery for LiNGAM in the presence of latent confounders is a more significant and challenging problem. In this paper, we propose a cumulant-based approach to pairwise causal discovery for LiNGAM in the presence of latent confounders. The method assumes that the latent confounder is Gaussian distributed and statistically independent of the disturbances. We give a theoretical proof that, in the presence of latent Gaussian confounders, the causal direction of the observed variables is identifiable under the mild condition that the disturbances are both super-Gaussian or both sub-Gaussian. Experiments on synthetic data and real-world data have been conducted to show the effectiveness of our proposed method.

Zhitang Chen, Laiwan Chan

Alleviating the Influence of Weak Data Asymmetries on Granger-Causal Analyses

We introduce the concepts of weak and strong asymmetries in multivariate time series in the context of causal modeling. Weak asymmetries are by definition differences in univariate properties of the data, which are not necessarily related to causal relationships between time series. Nevertheless, they might still mislead (in particular Granger-) causal analyses. We propose two general strategies to overcome the negative influence of weak asymmetries in causal modeling. One is to assess the confidence of causal predictions using the antisymmetry-symmetry ratio, while the other one is based on comparing the result of a causal analysis to that of an equivalent analysis of time-reversed data. We demonstrate that Granger Causality applied to the SiSEC challenge on causal analysis of simulated EEG data greatly benefits from our suggestions.

Stefan Haufe, Vadim V. Nikulin, Guido Nolte

Online PLCA for Real-Time Semi-supervised Source Separation

Non-negative spectrogram factorization algorithms such as probabilistic latent component analysis (PLCA) have been shown to be quite powerful for source separation. When training data for all of the sources are available, it is trivial to learn their dictionaries beforehand and perform supervised source separation in an online fashion. However, in many real-world scenarios (e.g. speech denoising), training data for one of the sources can be hard to obtain beforehand (e.g. speech). In these cases, we need to perform semi-supervised source separation and learn a dictionary for that source during the separation process. Existing semi-supervised separation approaches are generally offline, i.e. they need to access the entire mixture when updating the dictionary. In this paper, we propose an online approach to adaptively learn this dictionary and separate the mixture over time. This enables us to perform online semi-supervised separation for real-time applications. We demonstrate this approach on real-time speech denoising.

Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis
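
The semi-supervised setting in the abstract above (one dictionary known in advance, the other learned from the mixture) can be sketched with multiplicative KL-divergence NMF updates, a factorization closely related to PLCA. This is a batch illustration under stated assumptions and does not reproduce the paper's online algorithm: the function names, the `fixed` mask and the random test data are all hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)

def kl_divergence(V, V_hat):
    """Generalized KL divergence between nonnegative arrays V and V_hat."""
    return np.sum(V * np.log((V + EPS) / (V_hat + EPS)) - V + V_hat)

def semi_supervised_kl_nmf(V, W, H, fixed, n_iter=100):
    """Multiplicative KL-NMF updates; columns of W flagged in `fixed` are kept."""
    W, H = W.copy(), H.copy()
    free = ~np.asarray(fixed, dtype=bool)
    for _ in range(n_iter):
        # Activation update: H <- H * (W^T (V / WH)) / (W^T 1)
        H *= (W.T @ (V / (W @ H + EPS))) / (W.sum(axis=0)[:, None] + EPS)
        # Dictionary update, applied only to the columns that are being learned.
        W_new = W * ((V / (W @ H + EPS)) @ H.T) / (H.sum(axis=1)[None, :] + EPS)
        W[:, free] = W_new[:, free]
    return W, H

# Toy mixture: two "noise" atoms known beforehand, two "speech" atoms unknown.
rng = np.random.default_rng(0)
F, N = 20, 40
W_noise = rng.random((F, 2))
V = np.hstack([W_noise, rng.random((F, 2))]) @ rng.random((4, N))

W0 = np.hstack([W_noise, rng.random((F, 2))])  # random init for the unknown atoms
H0 = rng.random((4, N))
fixed = [True, True, False, False]

W, H = semi_supervised_kl_nmf(V, W0, H0, fixed)
d0 = kl_divergence(V, W0 @ H0)
d1 = kl_divergence(V, W @ H)
print(d0, d1)
```

Because the whole mixture V is revisited at every iteration, this sketch is offline in the sense described in the abstract; an online variant would update the free atoms from the incoming frames only.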

Cramér-Rao Bound for Circular Complex Independent Component Analysis

Despite an increased interest in complex independent component analysis (ICA) during the last two decades, a closed-form expression for the Cramér-Rao bound (CRB) of the complex ICA problem has not yet been established. In this paper, we fill this gap for the noiseless case and circular sources. The CRB depends on the distributions of the sources only through two characteristic values which can be easily calculated. In addition, we study the CRB for the family of circular complex generalized Gaussian distributions (GGD) in more detail and compare it to simulation results using several ICA estimators.

Benedikt Loesch, Bin Yang

Complex Non-Orthogonal Joint Diagonalization Based on LU and LQ Decompositions

In this paper, we propose a class of complex non-orthogonal joint diagonalization (NOJD) algorithms with successive rotations. The proposed methods consider LU or LQ decompositions of the mixing matrices, and propose to solve the NOJD problem via two successive stages: L-stage and U (or Q)-stage. Moreover, as the manifolds of target matrices in these stages could be appropriately parameterized by a sequence of simple elementary triangular or unitary matrices, which depend on only one or two parameters, the high-dimensional minimization problems could be replaced by a sequence of lower-dimensional ones. As such, the proposed algorithms are of simple closed-form in each iteration, and do not require the target matrices to be Hermitian nor positive definite. Simulations are provided to compare the proposed methods to other complex NOJD methods.

Ke Wang, Xiao-Feng Gong, Qiu-Hua Lin

Exact and Approximate Quantum Independent Component Analysis for Qubit Uncoupling

As a tool for solving the Blind Quantum Source Separation problem introduced in our previous papers, we here propose the concept of Quantum Independent Component Analysis (QICA). Starting from quantum bits (qubits) with cylindrical-symmetry Heisenberg coupling, quantum-to-classical conversion yields an original nonlinear mixing model, which leads us to develop QICA methods dedicated to this model. Our first method consists in minimizing the mutual information of the outputs of our nonlinear separating system. It is attractive because it yields an exact solution, without any spurious points thanks to the (Q)ICA separability of the considered model. The second proposed method is a simpler approximation of the first one. It is based on a truncated expansion of differential entropy (or negentropy), derived from the Edgeworth expansion of probability density functions.

Yannick Deville, Alain Deville

A Matrix Joint Diagonalization Approach for Complex Independent Vector Analysis

Independent Vector Analysis (IVA) is a special form of Independent Component Analysis (ICA) in terms of group signals. Most IVA algorithms are developed via optimizing certain contrast functions. The main difficulty of these contrast function based approaches lies in estimating the unknown distribution of sources. On the other hand, tensorial approaches are efficient and richly available to the standard ICA problem, but unfortunately have not been explored considerably for IVA. In this paper, we propose a matrix joint diagonalization approach to solve the complex IVA problem. A conjugate gradient algorithm on an appropriate manifold setting is developed and investigated by several numerical experiments.

Hao Shen, Martin Kleinsteuber

Algebraic Solutions to Complex Blind Source Separation

The linear BSS problem can be solved under certain conditions via a joint diagonalization approach of only two matrices. Algebraic solutions, i.e. solutions that only involve eigenvalue decompositions or singular value decompositions, are of particular interest as efficient eigensolvers exist. Success of these methods depends significantly on particular properties of the sources, such as non-stationarity, non-whiteness, non-Gaussianity, and non-circularity. In this work, we propose alternative algebraic solutions to solve the complex BSS problem, which generalize the existing approaches. For example, applicability of SUT is limited to the positive definiteness of the covariance matrix, whereas our approach allows exploiting alternative information, such as autocorrelation and pseudo-autocorrelation, to solve the complex BSS problem.

Hao Shen, Martin Kleinsteuber
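
The idea of solving BSS from only two matrices with a single eigenvalue decomposition can be illustrated in a simplified real-valued setting. The sketch below is not the method of the paper: it assumes two exactly diagonalizable matrices C1 = A D1 Aᵀ and C2 = A D2 Aᵀ with distinct ratios of diagonal entries, and the function name and test values are hypothetical.

```python
import numpy as np

def separate_from_two_matrices(C1, C2):
    """Return W such that W @ C1 @ W.T and W @ C2 @ W.T are both diagonal.

    Assumes C1 = A D1 A^T (invertible) and C2 = A D2 A^T with distinct
    ratios D2/D1, so that C2 @ inv(C1) = A (D2/D1) inv(A).
    """
    _, V = np.linalg.eig(C2 @ np.linalg.inv(C1))
    # Eigenvectors equal the columns of A up to scaling and permutation;
    # they are real under the stated assumptions.
    return np.linalg.inv(np.real(V))

# Illustrative mixing matrix and diagonal "source statistics".
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
C1 = A @ np.diag([1.0, 1.0, 1.0]) @ A.T    # e.g. a zero-lag covariance
C2 = A @ np.diag([0.9, 0.3, -0.5]) @ A.T   # e.g. a lagged covariance

W = separate_from_two_matrices(C1, C2)
G = np.abs(W @ A)   # should be a scaled permutation matrix
print(np.round(G, 6))
```

Each row of G has a single non-zero entry, so the mixing matrix is recovered up to the usual scaling and permutation ambiguities. With estimated rather than exact matrices, the result is only approximate.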

On the Separation Performance of the Strong Uncorrelating Transformation When Applied to Generalized Covariance and Pseudo-covariance Matrices

Traditionally, the strong uncorrelating transformation (SUT) is applied to the zero-lag sample autocovariance and pseudo-autocovariance matrices of the observed mixtures for separating complex-valued stationary sources. The performance of the SUT in that context has been recently analyzed. In this work we extend the analysis to the case where the SUT is applied to “generalized” covariance and pseudo-covariance matrices, which are prescribed by an arbitrary symmetric, positive definite matrix, termed an “association matrix”. The analysis applies not only to stationary sources, but also to sources with arbitrary complex-valued temporal covariance and pseudo-covariance. We show, and demonstrate in simulation, that the use of generalized covariance and pseudo-covariance matrices for the SUT offers potential for significant improvement in the resulting separation performance.

Arie Yeredor
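
For readers unfamiliar with the two matrices the SUT operates on, the sketch below computes the ordinary zero-lag sample covariance and pseudo-covariance of complex data. It does not implement the SUT or the generalized matrices of the paper. The function name and the two synthetic signals are illustrative assumptions.

```python
import numpy as np

def cov_and_pseudo_cov(X):
    """Sample covariance E[x x^H] and pseudo-covariance E[x x^T]; samples in columns."""
    Xc = X - X.mean(axis=1, keepdims=True)
    n = Xc.shape[1]
    return Xc @ Xc.conj().T / n, Xc @ Xc.T / n

rng = np.random.default_rng(0)
n = 20000
a, b = rng.standard_normal((2, n))

circular = ((a + 1j * b) / np.sqrt(2))[None, :]   # equal-power real and imaginary parts
noncircular = (a + 0.2j * b)[None, :]             # mostly real-valued signal

_, P_circ = cov_and_pseudo_cov(circular)
_, P_nonc = cov_and_pseudo_cov(noncircular)
print(abs(P_circ[0, 0]), abs(P_nonc[0, 0]))
```

The pseudo-covariance is close to zero for the circular signal and clearly non-zero for the non-circular one, which is the kind of information exploited for separating non-circular complex sources.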

A Canonical Correlation Analysis Based Method for Improving BSS of Two Related Data Sets

We consider an extension of ICA and BSS for separating mutually dependent and independent components from two related data sets. We propose a new method which first uses canonical correlation analysis to detect subspaces of independent and dependent components. Different ICA and BSS methods can then be used for the final separation of these components. Our method has a sound theoretical basis, is straightforward to implement, and is computationally undemanding. Experimental results on synthetic and real-world fMRI data sets demonstrate its good performance.

Juha Karhunen, Tele Hao, Jarkko Ylipaavalniemi
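
The first stage described in the abstract above relies on canonical correlation analysis. A minimal sketch of how canonical correlations can indicate a shared component is given below. It uses the textbook SVD formulation rather than the authors' implementation, and the function name and toy data are assumptions made for illustration.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between X (p x n) and Y (q x n), samples in columns."""
    X = X - X.mean(axis=1, keepdims=True)
    Y = Y - Y.mean(axis=1, keepdims=True)
    n = X.shape[1]
    Cxx, Cyy, Cxy = X @ X.T / n, Y @ Y.T / n, X @ Y.T / n

    def inv_sqrt(C):
        w, U = np.linalg.eigh(C)
        return U @ np.diag(w ** -0.5) @ U.T

    # Singular values of the whitened cross-covariance are the canonical correlations.
    return np.linalg.svd(inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy), compute_uv=False)

# Two data sets sharing one component s; u1 and u2 are specific to each set.
rng = np.random.default_rng(0)
n = 5000
s, u1, u2 = rng.standard_normal((3, n))
X = np.array([[1.0, 0.5], [0.2, 1.0]]) @ np.vstack([s, u1])
Y = np.array([[0.7, -0.3], [0.4, 1.0]]) @ np.vstack([s, u2])

rho = canonical_correlations(X, Y)
print(np.round(rho, 3))
```

A canonical correlation near one points to a component common to both data sets, while values near zero correspond to components present in only one of them. The subspaces found this way could then be passed to any ICA or BSS method.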

A Probability-Based Combination Method for Unsupervised Clustering with Application to Blind Source Separation

Unsupervised clustering algorithms can be combined to improve the robustness and the quality of the results, e.g. in blind source separation. Before combining the results of these clustering methods, the corresponding clusters have to be aligned, but usually it is not known which clusters of the employed methods correspond to each other. In this paper, we present a method to avoid this correspondence problem using probability theory. We also present an application of our method in blind source separation. Our approach is more easily extended than other state-of-the-art separation algorithms, while leading to slightly better results.

Julian Mathias Becker, Martin Spiertz, Volker Gnann

Charrelation Matrix Based ICA

Charrelation matrices are a generalization of the covariance matrix, encompassing statistical information beyond second order while maintaining a convenient 2-dimensional structure. In the context of ICA, charrelation-matrix-based separation was recently shown to potentially attain superior performance over commonly used methods. However, this approach is strongly dependent on proper selection of the parameters (termed processing points) which parameterize the charrelation matrices. In this work we derive a data-driven criterion for proper selection of the set of processing points. The proposed criterion uses the available mixture samples to quantify the resulting separation errors’ covariance matrix in terms of the processing points. Minimizing the trace of this matrix with respect to the processing points makes it possible to optimize (asymptotically) the selection of these points, thereby yielding better separation results than other methods, as we demonstrate in simulation.

Alon Slapak, Arie Yeredor

Contrast Functions for Independent Subspace Analysis

We consider the Independent Subspace Analysis problem from the point of view of contrast functions, showing that contrast functions are able to partially solve the ISA problem. That is, basic ICA can solve the ISA problem up to within-subspace separation/analysis. We define sub- and super-Gaussian subspaces and extend to ISA a previous result on freedom of ICA from local optima. We also consider new types of dependent densities that satisfy or violate the entropy power inequality (EPI) condition.

Jason A. Palmer, Scott Makeig

Distributional Convergence of Subspace Estimates in FastICA: A Bootstrap Study

Independent component analysis (ICA) is possibly the most widespread approach to solve the blind source separation (BSS) problem. Many different algorithms have been proposed, together with an extensive body of work on the theoretical foundations and limits of the methods.

One practical concern about the use of ICA with real-world data is the reliability of its estimates. Variations of the estimates may stem from the inherent stochastic nature of the algorithm, or deviations from the theoretical assumptions. To overcome this problem, some approaches use bootstrapped estimates. The bootstrapping also allows identification of subspaces, since multiple separated components can share a common pattern of variation, when they belong to the same subspace. This is a desired ability, since real-world data often violates the strict independence assumption.

Based on empirical process theory, it can be shown that FastICA and bootstrapped FastICA are consistent and asymptotically normal. In the context of subspace analysis, the normal convergence is not satisfied. This paper demonstrates this limitation and shows how to circumvent it when one can estimate the canonical directions within the subspace.

Jarkko Ylipaavalniemi, Nima Reyhani, Ricardo Vigário

New Online EM Algorithms for General Hidden Markov Models. Application to the SLAM Problem

In this contribution, new online EM algorithms are proposed to perform inference in general hidden Markov models. These algorithms update the parameter at some deterministic times and use Sequential Monte Carlo methods to compute approximations of filtering distributions. Their convergence properties are addressed in [9] and [10]. In this paper, the performance of these algorithms is highlighted in the challenging framework of Simultaneous Localization and Mapping.

Sylvain Le Corff, Gersende Fort, Eric Moulines

The Role of Whitening for Separation of Synchronous Sources

The separation of synchronous sources (SSS) is a relevant problem in the analysis of electroencephalogram (EEG) and magnetoencephalogram (MEG) synchrony. Previous experimental results, using pseudo-real MEG data, showed empirically that prewhitening improves the conditioning of the SSS problem. Simulations with synthetic data also suggest that the mixing matrix is much better conditioned after whitening is performed. Unlike in Independent Component Analysis (ICA), synchronous sources can be correlated. Thus, the reasoning used to motivate whitening in ICA is not directly extendable to SSS. In this paper, we analytically derive a tight upper bound for the condition number of the equivalent mixing matrix after whitening. We also present examples with simulated data, showing the correctness of this bound on sources with sub- and super-gaussian amplitudes. These examples further illustrate the large improvements in the condition number of the mixing matrix obtained through prewhitening, thus motivating the use of prewhitening in real applications.

Miguel Almeida, Ricardo Vigário, José Bioucas-Dias
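
The effect of prewhitening on the conditioning of the mixing matrix can be illustrated with a small simulation. The sketch below covers only the simpler ICA-like case of uncorrelated sources, where the equivalent mixing matrix becomes nearly orthogonal. It does not reproduce the bound derived in the paper for synchronous (possibly correlated) sources. The function name and test values are assumptions.

```python
import numpy as np

def whiten(X):
    """Return (Z, V) with Z = V @ (X - mean) having identity sample covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    C = Xc @ Xc.T / Xc.shape[1]
    w, U = np.linalg.eigh(C)
    V = U @ np.diag(w ** -0.5) @ U.T   # symmetric inverse square root of C
    return V @ Xc, V

rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(2, 10000))   # uncorrelated toy sources
A = np.array([[1.0, 0.99],
              [0.99, 1.0]])                   # badly conditioned mixing matrix

Z, V = whiten(A @ S)
cond_before = np.linalg.cond(A)
cond_after = np.linalg.cond(V @ A)            # equivalent mixing matrix after whitening
print(cond_before, cond_after)
```

In this uncorrelated case the condition number drops from about 200 to a value close to 1. For correlated sources the equivalent mixing matrix is no longer close to orthogonal, which is why a dedicated bound is of interest.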

Simultaneous Diagonalization of Skew-Symmetric Matrices in the Symplectic Group

Many source separation algorithms rely on the approximate simultaneous diagonalization of matrices. While there exist very efficient algorithms for symmetric matrices, the skew-symmetric case turned out to be more difficult. Here we show how the often used whitening/rotation approach for symmetric matrices can be translated to this case. While the former leads to orthogonal transformations in Euclidean space, the latter leads to symplectic transformations in symplectic space. It is demonstrated that the resulting algorithm is more stable than a naïve diagonalization that does not respect the symplectic structure of the problem.

Frank C. Meinecke

Joint Block Diagonalization Algorithms for Optimal Separation of Multidimensional Components

This paper deals with non-orthogonal joint block diagonalization. Two algorithms which minimize the Kullback-Leibler divergence between a set of real positive-definite matrices and a block-diagonal transformation thereof are suggested. One algorithm is based on the relative gradient, and the other is based on a quasi-Newton method. These algorithms allow for the optimal, in the mean square error sense, blind separation of multidimensional Gaussian components. Simulations demonstrate the convergence properties of the suggested algorithms, as well as the dependence of the criterion on some of the model parameters.

Dana Lahat, Jean-François Cardoso, Hagit Messer
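
A criterion of the kind described in the abstract above can be written down compactly for zero-mean Gaussian models. The sketch below is an assumption about its general form, not the authors' code: it sums, over the set of matrices, the KL divergence between N(0, C) and N(0, bdiag(C)), which reduces to half the difference of log-determinants. The function names and test matrices are hypothetical.

```python
import numpy as np

def block_diag_part(C, sizes):
    """Keep only the diagonal blocks of C with the given block sizes."""
    B = np.zeros_like(C)
    start = 0
    for k in sizes:
        B[start:start + k, start:start + k] = C[start:start + k, start:start + k]
        start += k
    return B

def kl_to_block_diagonal(C, sizes):
    """KL( N(0, C) || N(0, bdiag(C)) ) = 0.5 * (log det bdiag(C) - log det C)."""
    B = block_diag_part(C, sizes)
    return 0.5 * (np.linalg.slogdet(B)[1] - np.linalg.slogdet(C)[1])

def jbd_criterion(W, mats, sizes):
    """Sum of the divergences after transforming every matrix by W."""
    return sum(kl_to_block_diagonal(W @ C @ W.T, sizes) for C in mats)

# Matrices that are exactly block diagonalizable by inv(A), blocks of size 2 and 1.
rng = np.random.default_rng(0)
sizes = [2, 1]
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.1, 0.2, 1.0]])
mats = []
for _ in range(4):
    M = rng.standard_normal((2, 2))
    B = np.zeros((3, 3))
    B[:2, :2] = M @ M.T + np.eye(2)
    B[2, 2] = rng.uniform(1.0, 2.0)
    mats.append(A @ B @ A.T)

crit_identity = jbd_criterion(np.eye(3), mats, sizes)
crit_true = jbd_criterion(np.linalg.inv(A), mats, sizes)
print(crit_identity, crit_true)
```

The criterion is non-negative and vanishes when every transformed matrix is block diagonal, so a gradient or quasi-Newton method can be used to minimize it over W.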

On Computation of Approximate Joint Block-Diagonalization Using Ordinary AJD

Approximate joint block diagonalization (AJBD) of a set of matrices has applications in blind source separation, e.g., when the signal mixtures contain mutually independent subspaces of dimension higher than one. The main message of this paper is that certain ordinary approximate joint diagonalization (AJD) methods (which were originally derived for “degenerate” subspaces of dimension 1) can also be used successfully for AJBD, but not all are suitable equally well. In particular, we prove that when the set is exactly jointly block-diagonalizable, perfect block-diagonalization is attainable by the recently proposed AJD algorithm “U-WEDGE” (uniformly weighted exhaustive diagonalization with Gaussian iteration) - but this basic consistency property is not shared by some other popular AJD algorithms. In addition, we show using simulation, that in the more general noisy case, the subspace identification accuracy of U-WEDGE compares favorably to competitors.

Petr Tichavský, Arie Yeredor, Zbyněk Koldovský

Joint Diagonalization of Several Scatter Matrices for ICA

Procedures such as FOBI that jointly diagonalize two matrices with the independence property have a long tradition in ICA. These procedures have well-known statistical properties; for example, they are prone to failure if the sources have multiple identical values on the diagonal. In this paper we suggest jointly diagonalizing k ≥ 2 scatter matrices having the independence property. For the joint diagonalization we suggest a novel algorithm which finds the correct directions in a deflation-based manner, one after another. The method is demonstrated in a small simulation study.

Klaus Nordhausen, Harold W. Gutch, Hannu Oja, Fabian J. Theis
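
FOBI, the two-matrix procedure named in the abstract above, is short enough to sketch. The code below is the classical FOBI recipe (whitening followed by an eigendecomposition of a fourth-moment scatter matrix), not the deflation-based algorithm proposed in the paper. The source distributions and the mixing matrix are illustrative assumptions.

```python
import numpy as np

def fobi(X):
    """FOBI: jointly diagonalize the covariance and a fourth-moment scatter matrix."""
    Xc = X - X.mean(axis=1, keepdims=True)
    n = Xc.shape[1]
    w, U = np.linalg.eigh(Xc @ Xc.T / n)
    V = U @ np.diag(w ** -0.5) @ U.T            # whitening (first scatter: covariance)
    Z = V @ Xc
    # Second scatter: E[||z||^2 z z^T], estimated on the whitened data.
    S2 = (Z * np.sum(Z ** 2, axis=0)) @ Z.T / n
    _, R = np.linalg.eigh(S2)
    return R.T @ V                               # unmixing matrix estimate

rng = np.random.default_rng(0)
n = 20000
S = np.vstack([rng.uniform(-np.sqrt(3), np.sqrt(3), n),    # sub-Gaussian source
               rng.laplace(0.0, 1 / np.sqrt(2), n)])       # super-Gaussian source
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])

W = fobi(A @ S)
G = np.abs(W @ A)   # ideally a scaled permutation matrix
print(np.round(G / G.max(axis=1, keepdims=True), 2))
```

The two sources here have clearly different kurtosis values, so the eigenvalues of the second scatter matrix are distinct. With equal kurtosis values the eigenvectors are not identifiable, which is the failure mode mentioned in the abstract.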

To Infinity and Beyond: On ICA over Hilbert Spaces

The original Independent Component Analysis (ICA) problem of blindly separating a mixture of a finite number of real-valued statistically independent one-dimensional sources has been extended in a number of ways in recent years. These include dropping the assumption that all sources are one-dimensional and some extensions to the case where the sources are not real-valued. We introduce an extension in a further direction, no longer assuming only a finite number of sources, but instead allowing infinitely many. We define a notion of independent sources for this case and show separability of ICA in this framework.

Harold W. Gutch, Fabian J. Theis

Sparsity, Sparse Coding and Dictionary Learning

Regularized Sparse Representation for Spectrometric Pulse Separation and Counting Rate Estimation

One of the objectives of nuclear spectroscopy is to estimate the varying counting rate activity of unknown radioactive sources. When this activity is high, however, nonparalyzable detectors suffer from a type of distortion called the pile-up effect, in which pulses created from different sources tend to overlap. This distortion leads to an underestimation of the activity, which explains the interest in methods for individual pulse separation. We suggest in this paper a two-step method for a better counting rate estimation: the signal is first approximated using a block-sparse regression method, which separates individual pulses quite well. We then estimate their arrival times and plug them into a known activity estimator. Results on simulations and real data illustrate the efficiency of the proposed approach.

Tom Trigano, Yann Sepulcre

Some Uniqueness Results in Sparse Convolutive Source Separation

The fundamental problems in the traditional frequency domain approaches to convolutive blind source separation are 1) arbitrary permutations and 2) arbitrary scaling in each frequency bin of the estimated filters or sources. These ambiguities are corrected by taking into account some specific properties of the filters or sources, or both. This paper focusses on the filter permutation problem, assuming the absence of the scaling ambiguity, investigating the use of temporal sparsity of the filters as a property to aid permutation correction. Theoretical and experimental results bring out the potential as well as the extent to which sparsity can be used as a hypothesis to formulate a well posed permutation problem.

Alexis Benichoux, Prasad Sudhakar, Fréderic Bimbot, Rémi Gribonval

Ternary Sparse Coding

We study a novel sparse coding model with discrete and symmetric prior distribution. Instead of using continuous latent variables distributed according to heavy tail distributions, the latent variables of our approach are discrete. In contrast to approaches using binary latents, we use latents with three states (-1, 0, and 1) following a symmetric and zero-mean distribution. While using discrete latents, the model thus maintains important properties of standard sparse coding models and of its recent variants. To efficiently train the parameters of our probabilistic generative model, we apply a truncated variational EM approach (Expectation Truncation). The resulting learning algorithm infers all model parameters including the variance of data noise and data sparsity. In numerical experiments on artificial data, we show that the algorithm efficiently recovers the generating parameters, and we find that the applied variational approach helps in avoiding local optima. Using experiments on natural image patches, we demonstrate large-scale applicability of the approach and study the obtained Gabor-like basis functions.

Georgios Exarchakis, Marc Henniges, Julian Eggert, Jörg Lücke
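
The generative side of a ternary sparse coding model can be sketched by sampling from it. The parameterization below (a single sparsity parameter `pi` shared by all latents, and isotropic Gaussian noise with standard deviation `sigma`) is an assumption made for illustration and may differ from the paper. The learning algorithm (Expectation Truncation) is not shown.

```python
import numpy as np

def sample_ternary_model(W, pi, sigma, n, rng):
    """Draw n data points y = W s + noise with latents s in {-1, 0, 1}."""
    h = W.shape[1]
    # Symmetric, zero-mean prior: P(s=-1) = P(s=+1) = pi / 2, P(s=0) = 1 - pi.
    S = rng.choice([-1, 0, 1], size=(h, n), p=[pi / 2, 1 - pi, pi / 2])
    Y = W @ S + sigma * rng.standard_normal((W.shape[0], n))
    return Y, S

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 5))      # 5 basis functions for 16-dimensional data
Y, S = sample_ternary_model(W, pi=0.2, sigma=0.1, n=10000, rng=rng)
print(Y.shape, np.unique(S), S.mean(), np.mean(S != 0))
```

The latents take only three values, are zero most of the time, and have zero mean, which mirrors the symmetry of standard sparse coding priors while keeping the hidden space discrete.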

Closed-Form EM for Sparse Coding and Its Application to Source Separation

We define and discuss a novel sparse coding algorithm based on closed-form EM updates and continuous latent variables. The underlying generative model consists of a standard ‘spike-and-slab’ prior and a Gaussian noise model. Closed-form solutions for E- and M-step equations are derived by generalizing probabilistic PCA. The resulting EM algorithm can take all modes of a potentially multimodal posterior into account. The computational cost of the algorithm scales exponentially with the number of hidden dimensions. However, with current computational resources, it is still possible to efficiently learn model parameters for medium-scale problems. Thus, the algorithm can be applied to the typical range of source separation tasks. In numerical experiments on artificial data we verify likelihood maximization and show that the derived algorithm recovers the sparse directions of standard sparse coding distributions. On source separation benchmarks comprised of realistic data we show that the algorithm is competitive with other recent methods.

Jörg Lücke, Abdul-Saboor Sheikh

Convolutive Underdetermined Source Separation through Weighted Interleaved ICA and Spatio-temporal Source Correlation

This paper presents a novel method for underdetermined acoustic source separation of convolutive mixtures. Multiple complex-valued Independent Component Analysis adaptations jointly estimate the mixing matrix and the temporal activities of multiple sources in each frequency. A structure based on a recursive temporal weighting of the gradient enforces each ICA adaptation to estimate mixing parameters related to sources having a disjoint temporal activity. Permutation problem is reduced imposing a multiresolution spatio-temporal correlation of the narrow-band components. Finally, aligned mixing parameters are used to recover the sources through norm minimization and a post-processing based on a single channel Wiener filtering. Promising results obtained over a public dataset show that the proposed method is an effective solution to the underdetermined source separation problem.

Francesco Nesta, Maurizio Omologo

Dictionary Learning with Large Step Gradient Descent for Sparse Representations

This work presents a new algorithm for dictionary learning. Existing algorithms such as MOD and K-SVD often fail to find the best dictionary because they get trapped in a local minimum. Olshausen and Field’s Sparsenet algorithm relies on a fixed step projected gradient descent. With the right step, it can avoid local minima and converge towards the global minimum. The problem then becomes to find the right step size. In this work we provide the expression of the optimal step for the gradient descent but the step we use is twice as large as the optimal step. That large step allows the descent to bypass local minima and yields significantly better results than existing algorithms. The algorithms are compared on synthetic data. Our method outperforms existing algorithms both in approximation quality and in perfect recovery rate if an oracle support for the sparse representation is provided.

Boris Mailhé, Mark D. Plumbley
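
The role of the step size can be illustrated on the plain least-squares objective ||X − DA||² with the coefficients A held fixed. The sketch below is a simplified illustration and not the expression from the paper: it uses the textbook exact line search for a quadratic, updates the whole dictionary at once, omits the atom normalization (projection) step, and uses dense random coefficients for brevity. All names are hypothetical.

```python
import numpy as np

def objective(X, D, A):
    """Squared Frobenius approximation error ||X - D A||_F^2."""
    return np.linalg.norm(X - D @ A) ** 2

def gradient_and_optimal_step(X, D, A):
    """Gradient of ||X - D A||_F^2 w.r.t. D and the exact line-search step."""
    R = X - D @ A
    G = -2.0 * R @ A.T
    # Along D - t*G the objective is ||R||^2 - t ||G||^2 + t^2 ||G A||^2,
    # which is minimized at t = ||G||^2 / (2 ||G A||^2).
    step = np.linalg.norm(G) ** 2 / (2.0 * np.linalg.norm(G @ A) ** 2)
    return G, step

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 50))
D = rng.standard_normal((8, 12))
A = rng.standard_normal((12, 50))

G, step = gradient_and_optimal_step(X, D, A)
f0 = objective(X, D, A)
f_opt = objective(X, D - step * G, A)
f_double = objective(X, D - 2 * step * G, A)
print(f0, f_opt, f_double)
```

In this unprojected quadratic setting, doubling the exact step lands on the opposite side of the one-dimensional minimum at the same objective value as the starting point. The step is therefore as large as it can be without increasing the error, which gives some intuition for how a large step might move the dictionary out of a poor basin.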

Separation of Sparse Signals in Overdetermined Linear-Quadratic Mixtures

In this work, we deal with the problem of nonlinear blind source separation (BSS). We propose a new method for BSS in overdetermined linear-quadratic (LQ) mixtures. By exploiting the assumption that the sources are sparse in a transformed domain, we define a framework for canceling the nonlinear part of the mixing process. After that, separation can be conducted by linear BSS algorithms. Experiments with synthetic data are performed to assess the viability of our proposal.

Leonardo T. Duarte, Rafael A. Ando, Romis Attux, Yannick Deville, Christian Jutten

Collaborative Filtering via Group-Structured Dictionary Learning

Structured sparse coding and the related structured dictionary learning problems are novel research areas in machine learning. In this paper we present a new application of structured dictionary learning for collaborative filtering based recommender systems. Our extensive numerical experiments demonstrate that the presented method outperforms its state-of-the-art competitors and has several advantages over approaches that do not put structured constraints on the dictionary elements.

Zoltán Szabó, Barnabás Póczos, András Lőrincz

Group Polytope Faces Pursuit for Recovery of Block-Sparse Signals

Polytope Faces Pursuit is an algorithm that solves the standard sparse recovery problem. In this paper, we consider the case of block structured sparsity, and propose a novel algorithm based on the Polytope Faces Pursuit which incorporates this prior knowledge. The so-called Group Polytope Faces Pursuit is a greedy algorithm that adds one group of dictionary atoms at a time and adopts a path following approach based on the geometry of the polar polytope associated with the dual linear program. The complexity of the algorithm is of similar order to Group Orthogonal Matching Pursuit. Numerical experiments demonstrate the validity of the algorithm and illustrate that in certain cases the proposed algorithm outperforms the Group Orthogonal Matching Pursuit algorithm.

Aris Gretsistas, Mark D. Plumbley

Non-negative and Other Factorizations

Nonnegative Matrix Factorization via Generalized Product Rule and Its Application for Classification

Nonnegative Matrix Factorization (NMF) is broadly used as a mathematical tool for processing tabulated data. In this paper, an extension of NMF based on a generalized product rule, defined with a nonlinear one-parameter function and its inverse, is proposed. From the viewpoint of subspace methods, the extended NMF constructs flexible subspaces, which play an important role in classification tasks. Experimental results on benchmark datasets show that the proposed extension improves classification accuracies.

Yu Fujimoto, Noboru Murata

An Algebraic Method for Approximate Rank One Factorization of Rank Deficient Matrices

In this paper we consider the problem of finding approximate common rank one factors for a set of matrices. Instead of jointly diagonalizing the matrices, we perform calculations directly in the problem intrinsic domain: we present an algorithm, AROFAC, which searches the approximate linear span of the matrices using an indicator function for the rank one factors, finding specific single sources. We evaluate the feasibility of this approach by discussing simulations on generated data and a neurophysiological dataset. Note however that our contribution is intended to be mainly conceptual in nature.

Franz J. Király, Andreas Ziehe, Klaus-Robert Müller

Bayesian Non-negative Matrix Factorization with Learned Temporal Smoothness Priors

We combine the use of a Bayesian NMF framework to add temporal smoothness priors, with a supervised prior learning of the smoothness parameters on a database of solo musical instruments. The goal is to separate main instruments from realistic mono musical mixtures. The proposed learning step allows a better initialization of the spectral dictionaries and of the smoothness parameters. This approach is shown to yield better separation results than the unsupervised version.

Mathieu Coïc, Juan José Burred

On Connection between the Convolutive and Ordinary Nonnegative Matrix Factorizations

A connection between convolutive nonnegative matrix factorization (CNMF) and conventional NMF is established. As a result, arbitrary alternating update rules for NMF can be carried over to update rules for CNMF. To illustrate the novel derivation method, a multiplicative algorithm and a new ALS algorithm for CNMF are derived. The experiments confirm the validity and high performance of our method and of the proposed algorithm.

Anh Huy Phan, Andrzej Cichocki, Petr Tichavský, Zbyněk Koldovský

On Revealing Replicating Structures in Multiway Data: A Novel Tensor Decomposition Approach

A novel tensor decomposition is proposed to make it possible to identify replicating structures in complex data, such as textures and patterns in music spectrograms. In order to establish a computational framework for this paradigm, we adopt a multiway (tensor) approach. To this end, a novel tensor product is introduced, and the subsequent analysis of its properties shows a perfect match to the task of identification of recurrent structures present in the data. Out of a whole class of possible algorithms, we illuminate those derived so as to cater for orthogonal and nonnegative patterns. Simulations on texture images and a complex music sequence confirm the benefits of the proposed model and of the associated learning algorithms.

Anh Huy Phan, Andrzej Cichocki, Petr Tichavský, Danilo P. Mandic, Kiyotoshi Matsuoka

Audio Separation and Analysis

An On-Line NMF Model for Temporal Pattern Learning: Theory with Application to Automatic Speech Recognition

Convolutional non-negative matrix factorization (CNMF) can be used to discover recurring temporal (sequential) patterns in sequential vector non-negative data such as spectrograms or posteriorgrams. Drawbacks of this approach are the rigidity of the patterns and that it is intrinsically a batch method. However, in speech processing, like in many other applications, the patterns show a great deal of time warping variation and recognition should be on-line (possibly with some processing delay). Therefore, time-coded NMF (TC-NMF) is proposed as an alternative to CNMF to locate temporal patterns in time. TC-NMF is motivated by findings in neuroscience. The sequential data are first processed by a bank of filters such as leaky integrators with different time constants. The responses of these filters are modeled jointly by a constrained NMF. Algorithms for learning, decoding and locating patterns in time are proposed and verified with preliminary ASR experiments.

Hugo Van Hamme
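The filter-bank front end mentioned in the abstract above can be illustrated with a minimal sketch. It assumes first-order leaky integrators of the form y[t] = a*y[t-1] + (1-a)*x[t]; this normalization, the function name `leaky_integrator_bank`, and the decay values are illustrative assumptions and are not taken from the paper. Because every coefficient is non-negative, the filtered responses stay non-negative and could be passed to an NMF.

```python
import numpy as np

def leaky_integrator_bank(X, decays):
    """Filter each row of X with several leaky integrators.

    X      : array of shape (n_features, n_frames), non-negative data
    decays : iterable of decay factors a in (0, 1); a larger a means a
             longer time constant (hypothetical parameterization)
    Returns the responses of all filters stacked along the feature axis.
    """
    X = np.asarray(X, dtype=float)
    outputs = []
    for a in decays:
        Y = np.zeros_like(X)
        state = np.zeros(X.shape[0])
        for t in range(X.shape[1]):
            # First-order recursion: decay the old state, add the new input.
            state = a * state + (1.0 - a) * X[:, t]
            Y[:, t] = state
        outputs.append(Y)
    return np.vstack(outputs)

if __name__ == "__main__":
    impulse = np.array([[1.0, 0.0, 0.0, 0.0]])
    print(leaky_integrator_bank(impulse, decays=[0.5, 0.9]))
```

Each decay factor yields one smoothed copy of the input, so a pattern's position in time is encoded in the relative magnitudes of the slow and fast responses.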

Low-Latency Instrument Separation in Polyphonic Audio Using Timbre Models

This research focuses on the removal of the singing voice in polyphonic audio recordings under real-time constraints. It is based on time-frequency binary masks resulting from the combination of azimuth, phase difference and absolute frequency spectral bin classification and harmonic-derived masks. For the harmonic-derived masks, a pitch likelihood estimation technique based on Tikhonov regularization is proposed. A method for target instrument pitch tracking makes use of supervised timbre models. This approach runs in real-time on off-the-shelf computers with latency below 250ms. The method was compared to a state of the art Non-negative Matrix Factorization (NMF) offline technique and to the ideal binary mask separation. For the evaluation we used a dataset of multi-track versions of professional audio recordings.

Ricard Marxer, Jordi Janer, Jordi Bonada

Real-Time Speech Separation by Semi-supervised Nonnegative Matrix Factorization

In this paper, we present an on-line semi-supervised algorithm for real-time separation of speech and background noise. The proposed system is based on Nonnegative Matrix Factorization (NMF), where fixed speech bases are learned from training data whereas the noise components are estimated in real-time on the recent past.

Experiments with spontaneous conversational speech and real-life non-stationary noise show that this system performs as well as a supervised NMF algorithm exploiting noise components learned from the same noise environment as the test sample. Furthermore, it outperforms a supervised system trained on different noise conditions.

Cyril Joder, Felix Weninger, Florian Eyben, David Virette, Björn Schuller
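The idea of keeping pre-trained speech bases fixed while adapting the noise bases can be sketched as follows. This is a batch illustration under stated assumptions, not the authors' on-line algorithm: it uses the standard multiplicative updates for the Euclidean cost, and the function names, the number of noise components, and the Wiener-like mask are hypothetical choices.

```python
import numpy as np

def semi_supervised_nmf(V, W_speech, n_noise, n_iter=200, eps=1e-9, seed=0):
    """Approximate V by [W_speech, W_noise] @ H with W_speech held fixed.

    V        : non-negative magnitude spectrogram, shape (n_freq, n_frames)
    W_speech : pre-trained speech bases, shape (n_freq, n_speech)
    n_noise  : number of noise bases to estimate (assumed, user-chosen)
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_speech = W_speech.shape[1]
    W_noise = rng.random((n_freq, n_noise)) + eps
    H = rng.random((n_speech + n_noise, n_frames)) + eps
    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])
        # Update all activations (speech and noise rows).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the noise bases; the speech bases are never touched.
        H_noise = H[n_speech:]
        W_noise *= (V @ H_noise.T) / (W @ H @ H_noise.T + eps)
    return W_noise, H

def speech_estimate(V, W_speech, W_noise, H, eps=1e-9):
    """Apply a Wiener-like soft mask built from the speech part of the model."""
    n_speech = W_speech.shape[1]
    speech_part = W_speech @ H[:n_speech]
    full_model = speech_part + W_noise @ H[n_speech:]
    return V * speech_part / (full_model + eps)
```

In an on-line setting, one would presumably run such updates on a sliding window of recent frames; that part is omitted here.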

An Audio-Video Based IVA Algorithm for Source Separation and Evaluation on the AV16.3 Corpus

The machine cocktail party problem has been researched for several decades. Although many blind source separation schemes have been proposed to address this problem, few of them are tested by using a real room audio video recording. In this paper, we propose an audio video based independent vector analysis (AVIVA) method, and test it with other independent vector analysis methods by using a real room recording dataset, i.e. the AV16.3 corpus. Moreover, we also use a new method based on pitch difference detection for objective evaluation of the separation performance of the algorithms when applied on the real dataset which confirms advantages of using the visual modality with IVA.

Yanfeng Liang, Jonathon Chambers

Non-negative Matrix Factorization Based Noise Reduction for Noise Robust Automatic Speech Recognition

In this paper, we propose a noise reduction method based on non-negative matrix factorization (NMF) for noise-robust automatic speech recognition (ASR). Most noise reduction methods applied to ASR front-ends have been developed for suppressing background noise that is assumed to be stationary rather than non-stationary. Instead, the proposed method attenuates non-target noise by a hybrid approach that combines a Wiener filtering and an NMF technique. This is motivated by the fact that Wiener filtering and NMF are suitable for reduction of stationary and non-stationary noise, respectively. It is shown from ASR experiments that an ASR system employing the proposed approach improves the average word error rate by 11.9%, 22.4%, and 5.2%, compared to systems employing the two-stage mel-warped Wiener filter, the minimum mean square error log-spectral amplitude estimator, and NMF with a Wiener post-filter, respectively.

Seon Man Kim, Ji Hun Park, Hong Kook Kim, Sung Joo Lee, Yun Keun Lee

Audio Imputation Using the Non-negative Hidden Markov Model

Missing data in corrupted audio recordings poses a challenging problem for audio signal processing. In this paper we present an approach that allows us to estimate missing values in the time-frequency domain of audio signals. The proposed approach, based on the Non-negative Hidden Markov Model, enables more temporally coherent estimation for the missing data by taking into account both the spectral and temporal information of the audio signal. This approach is able to reconstruct highly corrupted audio signals with large parts of the spectrogram missing. We demonstrate this approach on real-world polyphonic music signals. The initial experimental results show that our approach has advantages over a previous missing data imputation method.

Jinyu Han, Gautham J. Mysore, Bryan Pardo

A Non-negative Approach to Language Informed Speech Separation

The use of high level information in source separation algorithms can greatly constrain the problem and lead to improved results by limiting the solution space to semantically plausible results. The automatic speech recognition community has shown that the use of high level information in the form of language models is crucial to obtaining high quality recognition results. In this paper, we apply language models in the context of speech separation. Specifically, we use language models to constrain the recently proposed non-negative factorial hidden Markov model. We compare the proposed method to non-negative spectrogram factorization using standard source separation metrics and show improved results in all metrics.

Gautham J. Mysore, Paris Smaragdis

Temporally-Constrained Convolutive Probabilistic Latent Component Analysis for Multi-pitch Detection

In this paper, a method for multi-pitch detection which exploits the temporal evolution of musical sounds is presented. The proposed method extends the shift-invariant probabilistic latent component analysis algorithm by introducing temporal constraints using multiple Hidden Markov Models, while supporting multiple-instrument spectral templates. Thus, this model can support the representation of sound states such as attack, sustain, and decay, while the shift-invariance across log-frequency can be utilized for multi-pitch detection in music signals that contain frequency modulations or tuning changes. For note tracking, pitch-specific Hidden Markov Models are also employed in a postprocessing step. The proposed system was tested on recordings from the RWC database, the MIREX multi-F0 dataset, and on recordings from a Disklavier piano. Experimental results using a variety of error metrics show that the proposed system outperforms a non-temporally constrained model. The proposed system also outperforms state-of-the-art transcription algorithms for the RWC and Disklavier datasets.

Emmanouil Benetos, Simon Dixon

A Latently Constrained Mixture Model for Audio Source Separation and Localization

We present a method for audio source separation and localization from binaural recordings. The method combines a new generative probabilistic model with time-frequency masking. We suggest that device-dependent relationships between point-source positions and interaural spectral cues may be learnt in order to constrain a mixture model. This makes it possible to capture subtle separation and localization features embedded in the auditory data. We illustrate our method with data composed of two and three mixed speech signals in the presence of reverberations. Using standard evaluation metrics, we compare our method with a recent binaural-based source separation-localization algorithm.

Antoine Deleforge, Radu Horaud

Multiple Instrument Mixtures Source Separation Evaluation Using Instrument-Dependent NMF Models

This work makes use of instrument-dependent models to separate the different sources of multiple instrument mixtures. Three different models are applied: (a) a basic spectral model with harmonic constraint, (b) a source-filter model with harmonic-comb excitation and (c) a source-filter model with multi-excitation per instrument. The parameters of the models are optimized by an augmented NMF algorithm and learnt in a training stage. The models are presented in [1]; here, the experimental setting for their application to source separation is explained. The instrument-dependent NMF models are first trained and then a test stage is performed. A comparison with other state-of-the-art software is presented. Results show that the source-filter model with multi-excitation per instrument outperforms the other compared models.

Francisco J. Rodriguez-Serrano, Julio J. Carabias-Orti, Pedro Vera-Candeas, Tuomas Virtanen, Nicolas Ruiz-Reyes

Complex Extension of Infinite Sparse Factor Analysis for Blind Speech Separation

We present a method of blind source separation (BSS) for speech signals using a complex extension of infinite sparse factor analysis (ISFA) in the frequency domain. Our method is robust against delayed signals that usually occur in real environments, such as reflections, short-time reverberations, and time lags of signals arriving at microphones. ISFA is a conventional non-parametric Bayesian method of BSS, which has only been applied to time domain signals because it can only deal with real signals. Our method uses complex normal distributions to estimate source signals and mixing matrix. Experimental results indicate that our method outperforms the conventional ISFA in the average signal-to-distortion ratio (SDR).

Kohei Nagira, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno

A General Framework for Online Audio Source Separation

We consider the problem of online audio source separation. Existing algorithms adopt either a sliding block approach or a stochastic gradient approach, which is faster but less accurate. Also, they rely either on spatial cues or on spectral cues and cannot separate certain mixtures. In this paper, we design a general online audio source separation framework that combines both approaches and both types of cues. The model parameters are estimated in the Maximum Likelihood (ML) sense using a Generalised Expectation Maximisation (GEM) algorithm with multiplicative updates. The separation performance is evaluated as a function of the block size and the step size and compared to that of an offline algorithm.

Laurent S. R. Simon, Emmanuel Vincent

Sound Recognition in Mixtures

In this paper, we describe a method for recognizing sound sources in a mixture. While many audio-based content analysis methods focus on detecting or classifying target sounds in a discriminative manner, we approach this as a regression problem, in which we estimate the relative proportions of sound sources in the given mixture. Using source separation ideas based on probabilistic latent component analysis, we directly estimate these proportions from the mixture without actually separating the sources. We also introduce a method for learning a transition matrix to temporally constrain the problem. We demonstrate the proposed method on a mixture of five classes of sounds and show that it is quite effective in correctly estimating the relative proportions of the sounds in the mixture.

Juhan Nam, Gautham J. Mysore, Paris Smaragdis

SiSEC 2011 Evaluation Campaign

The 2011 Signal Separation Evaluation Campaign (SiSEC2011): - Audio Source Separation -

This paper summarizes the audio part of the 2011 community-based Signal Separation Evaluation Campaign (SiSEC2011). Four speech and music datasets were contributed, including datasets recorded in noisy or dynamic environments and a subset of the SiSEC2010 datasets. The participants addressed one or more tasks out of four source separation tasks, and the results for each task were evaluated using different objective performance criteria. We provide an overview of the audio datasets, tasks and criteria. We also report the results achieved with the submitted systems, and discuss organization strategies for future campaigns.

Shoko Araki, Francesco Nesta, Emmanuel Vincent, Zbyněk Koldovský, Guido Nolte, Andreas Ziehe, Alexis Benichoux

The 2011 Signal Separation Evaluation Campaign (SiSEC2011): - Biomedical Data Analysis -

This paper summarizes the bio part of the 2011 community-based Signal Separation Evaluation Campaign (SiSEC2011). Two different data sets were given. In the first task, participants were asked to estimate the causal relations of underlying sources from simulated bivariate EEG data. In the second task, participants were asked to reconstruct signaling pathways or parts of them from microarray expression profiles. The results for each task were evaluated using different objective performance criteria. We provide an overview of the biomedical datasets, tasks and criteria, and we report on the achieved results.

Guido Nolte, Dominik Lutter, Andreas Ziehe, Francesco Nesta, Emmanuel Vincent, Zbyněk Koldovský, Alexis Benichoux, Shoko Araki

Improved Perceptual Metrics for the Evaluation of Audio Source Separation

We aim to predict the perceived quality of estimated source signals in the context of audio source separation. Recently, we proposed a set of metrics called PEASS that consist of three computation steps: decomposition of the estimation error into three components, measurement of the salience of each component via the PEMO-Q auditory-motivated measure, and combination of these saliences via a nonlinear mapping trained on subjective opinion scores. The parameters of the decomposition were shown to have little influence on the prediction performance. In this paper, we evaluate the impact of the parameters of PEMO-Q and the nonlinear mapping on the prediction performance. By selecting the optimal parameters, we improve the average correlation with mean opinion scores (MOS) from 0.738 to 0.909 in a cross-validation setting. The resulting improved metrics are used in the context of the 2011 Signal Separation Evaluation Campaign (SiSEC).

Emmanuel Vincent

Musical Audio Source Separation Based on User-Selected F0 Track

A system for user-guided audio source separation is presented in this article. Following previous works on time-frequency music representations, the proposed User Interface allows the user to select the desired audio source, by means of the assumed fundamental frequency (F0) track of that source. The system then automatically refines the selected F0 tracks, estimates and separates the corresponding source from the mixture. The interface was tested and the separation results compare positively to the results of a fully automatic system, showing that the F0 track selection improves the separation performance.

Jean-Louis Durrieu, Jean-Philippe Thiran

A GMM Sound Source Model for Blind Speech Separation in Under-determined Conditions

This paper focuses on blind speech separation in under-determined conditions, that is, in the case when there are more sound sources than microphones. We introduce a sound source model based on the Gaussian mixture model (GMM) to represent a speech signal in the time-frequency domain, and derive rules for updating the model parameters using the auxiliary function method. Our GMM sound source model consists of two kinds of Gaussians: sharp ones representing harmonic parts and smooth ones representing nonharmonic parts. Experimental results reveal that our method outperforms the method based on non-negative matrix factorization (NMF) by 0.7dB in the signal-to-distortion ratio (SDR), and by 1.7dB in the signal-to-interference ratio (SIR). This means that our method effectively removes interference coming from other talkers.

Yasuharu Hirasawa, Naoki Yasuraoka, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno

Model-Driven Speech Enhancement for Multisource Reverberant Environment (Signal Separation Evaluation Campaign (SiSEC) 2011)

We present a low-complexity speech enhancement technique for real-life multi-source environments. Assuming that the speaker identity is known a priori, we incorporate a speaker model to enhance a target signal corrupted by non-stationary noise in a reverberant scenario. Experiments indicate that this helps to overcome the limited performance of noise-tracking-based speech enhancement methods under unpredictable and non-stationary noise scenarios. The pre-trained speaker model captures a constrained subspace for the target speech and can provide an enhanced speech estimate by rejecting the non-stationary noise sources. Experimental results from the Signal Separation Evaluation Campaign (SiSEC) showed that the proposed approach is successful in canceling the interference signal in the noisy input and providing an enhanced output signal.

Pejman Mowlaee, Rahim Saeidi, Rainer Martin

Semi-blind Source Separation Based on ICA and Overlapped Speech Detection

We propose a semi-blind method for separation of stereo recordings of several sources. The method begins with computation of a set of cancellation filters for potential fixed positions of the sources. These filters are computed from one-source-only intervals selected upon cross-talk detection. Each source in some of the fixed positions is canceled by the corresponding filter, by which the other sources are separated. The former source can be then separated by adaptive suppression of the separated sources. To select the appropriate cancellation filter, we use Independent Component Analysis. The performance of the proposed method is verified on real-world SiSEC data with two fixed and/or moving sources.

Jiří Málek, Zbyněk Koldovský, Petr Tichavský

Nonparametric Modelling of ECG: Applications to Denoising and to Single Sensor Fetal ECG Extraction

In this work, we tackle the problem of fetal electrocardiogram (ECG) extraction from a single sensor. The proposed method is based on non-parametric modelling of the ECG signal, described by its second-order statistics. Each assumed source in the mixture is thus modelled as a second-order process characterized by its covariance function. This modelling makes it possible to reconstruct each source by maximizing the related posterior distribution. The proposed method is tested on synthetic data to evaluate its ECG denoising performance. It is then applied to real data to extract the fetal ECG from a single maternal abdominal sensor.

Bertrand Rivet, Mohammad Niknazar, Christian Jutten

Other Applications

Nesterov’s Iterations for NMF-Based Supervised Classification of Texture Patterns

Nonnegative Matrix Factorization (NMF) is an efficient tool for a supervised classification of various objects such as text documents, gene expressions, spectrograms, facial images, and texture patterns. In this paper, we consider the projected Nesterov’s method for estimating nonnegative factors in NMF, especially for classification of texture patterns. This method belongs to a class of gradient (first-order) methods but its convergence rate is determined by $O(1/k^2)$. The classification experiments for the selected images taken from the UIUC database demonstrate a high efficiency of the discussed approach.

Rafal Zdunek, Zhaoshui He
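A projected Nesterov iteration of the kind named in the abstract above can be sketched on a single non-negative least-squares subproblem, minimizing 0.5*||Ax - b||^2 subject to x >= 0. This is a generic accelerated projected-gradient scheme with the usual momentum sequence, not the authors' exact algorithm; the function name, the fixed step size 1/L, and the iteration count are assumptions for illustration. In an NMF setting, such a routine would be applied alternately to the columns or rows of each factor.

```python
import numpy as np

def nesterov_nnls(A, b, n_iter=500):
    """Accelerated projected gradient for min 0.5*||Ax - b||^2 with x >= 0."""
    # Lipschitz constant of the gradient: squared largest singular value of A.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    y = x.copy()          # extrapolated point
    t = 1.0               # momentum parameter
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        # Gradient step from the extrapolated point, then projection onto x >= 0.
        x_new = np.maximum(y - grad / L, 0.0)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Nesterov extrapolation using the previous two iterates.
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

if __name__ == "__main__":
    # The unconstrained solution would be [1, -1]; the projection clips it.
    print(nesterov_nnls(np.eye(2), np.array([1.0, -1.0]), n_iter=50))
```

The only difference from plain projected gradient descent is the extrapolation step, which adds no extra gradient evaluations per iteration.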

Detection of Aliasing in Image Sequences Using Nonlinear Factor Analysis

In computational imaging, reconstructing a single high-resolution scene from multiple low-resolution aliased images is most efficient if done only over those regions where significant aliasing occurs. This paper presents a framework for detecting pixel locations exhibiting the most-prominent effects of aliasing in a sequence of subpixel-shifted images. The process employs nonlinear factor analysis of the image sequence, in which the latent variables are the relative position offsets for each image in the sequence, followed by outlier detection on the error residuals from the joint estimation process. Numerical examples illustrate the capabilities of the methodology.

Scott C. Douglas

Geometrical Method Using Simplicial Cones for Overdetermined Nonnegative Blind Source Separation: Application to Real PET Images

This paper presents a geometrical method for solving the overdetermined Nonnegative Blind Source Separation (N-BSS) problem. Considering each column of the mixed data as a point in the data space, we develop a Simplicial Cone Shrinking Algorithm for Unmixing Nonnegative Sources (SCSA-UNS). The proposed method estimates the mixing matrix and the sources by fitting a simplicial cone to the scatter plot of the mixed data. It requires only weak assumptions on the source distributions; in particular, independence of the different sources is not necessary. Simulations on synthetic data show that SCSA-UNS outperforms other existing geometrical methods in the noiseless case. An experiment on real dynamic Positron Emission Tomography (PET) images illustrates the efficiency of the proposed method.

Wendyam S. B. Ouedraogo, Antoine Souloumiac, Meriem Jaidane, Christian Jutten

Multi-domain Feature of Event-Related Potential Extracted by Nonnegative Tensor Factorization: 5 vs. 14 Electrodes EEG Data

Since nonnegative tensor factorization (NTF) is particularly useful for underdetermined linear transform models, we performed NTF on EEG data recorded from 14 electrodes to extract the multi-domain feature of N170, a visual event-related potential (ERP). We also performed NTF on data from 5 typical electrodes in occipital-temporal sites for N170 and in frontal-central sites for the vertex positive potential (VPP), the counterpart of N170. We found that the multi-domain feature of N170 from 5 electrodes was very similar to that from 14 electrodes and more discriminative for different groups of participants than that of VPP from 5 electrodes. Hence, we conclude that when the data of typical electrodes for an ERP are decomposed by NTF, the estimated multi-domain feature of this ERP remains identical to its counterpart extracted from the data of all electrodes used in one ERP experiment.

Fengyu Cong, Anh Huy Phan, Piia Astikainen, Qibin Zhao, Jari K. Hietanen, Tapani Ristaniemi, Andrzej Cichocki

The Use of Linear Feature Projection for Precipitation Classification Using Measurements from Commercial Microwave Links

High-frequency electromagnetic waves are highly influenced by atmospheric conditions, so wireless microwave links with carrier frequencies of tens of GHz can be used for precipitation monitoring. In this paper we present a novel detection/classification system capable of detecting wet periods and of classifying the precipitation type as rain or sleet, given an attenuation signal from spatially distributed commercial wireless microwave links. Fade (attenuation) dynamics was selected as a discriminating feature providing the data for classification. A Linear Feature Extraction method is formulated; thereafter, its efficiency is evaluated on real data. The detection/classification system is based on Fisher’s linear discriminant and the likelihood ratio test. Its performance is demonstrated using actual Received Signal Level measurements from a cellular backhaul network in the northern part of Israel. In particular, the use of the raw data as well as its derivatives to achieve better classification performance is suggested.

Dani Cherkassky, Jonatan Ostrometzky, Hagit Messer
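The linear projection step mentioned in the abstract above can be illustrated with a two-class Fisher linear discriminant. This is a textbook sketch, not the authors' system: the two-dimensional toy features, the function name `fisher_direction`, and the midpoint threshold are hypothetical, and the likelihood ratio test is not shown.

```python
import numpy as np

def fisher_direction(X0, X1):
    """Return the Fisher projection vector w and a midpoint threshold.

    X0, X1 : arrays of shape (n_samples, n_features), one per class
             (for example, hypothetical 'rain' and 'sleet' feature vectors)
    """
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # w maximizes between-class separation relative to within-class scatter.
    w = np.linalg.solve(Sw, m1 - m0)
    # Simple decision threshold halfway between the projected class means.
    threshold = w @ (m0 + m1) / 2.0
    return w, threshold

if __name__ == "__main__":
    X0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    X1 = X0 + 5.0
    w, thr = fisher_direction(X0, X1)
    print(w, thr)
    print(X0 @ w < thr, X1 @ w > thr)
```

A sample x is assigned to the second class when `x @ w` exceeds the threshold; in practice the threshold would be tuned to the desired detection and false-alarm rates.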

Bayesian Inference of Latent Causes in Gene Regulatory Dynamics

In the study of gene regulatory networks, more and more quantitative data is becoming available. However, only a few of the players in such networks are observed; the others are latent. Focusing on the inference of multiple such latent causes, we arrive at a blind source separation problem. Under the assumptions of independent sources and Gaussian noise, this condenses to a Bayesian independent component analysis problem with a natural dynamic structure. We present a method for inference in networks with linear dynamics, with a straightforward extension to the nonlinear case. The proposed method uses a maximum a posteriori estimate of the latent causes, with additional prior information guaranteeing independence. We illustrate the feasibility of our method on a toy example and compare the results with standard approaches.

Sabine Hug, Fabian J. Theis

Bayesian Fuzzy Clustering of Colored Graphs

With the increasing availability of interaction data stemming from fields as diverse as systems biology, telecommunication or social sciences, the task of mining and understanding the underlying graph structures becomes more and more important. Here we focus on data with different types of nodes; we subsume this meta information in the color of a node. An important first step is the unsupervised clustering of nodes into communities, which are of the same color and highly connected within but sparsely connected to the rest of the graph. Recently we have proposed a fuzzy extension of this clustering concept, which allows a node to have membership in multiple clusters. The resulting gradient descent algorithm shared many similarities with the multiplicative update rules from nonnegative matrix factorization. Two issues left open were the determination of the number of clusters of each color and the undefined integration of additional prior information. In this contribution we resolve these issues by reinterpreting the factorization in a Bayesian framework, which allows the ready inclusion of priors. We integrate automatic relevance determination to automatically estimate group sizes. We derive a maximum-a-posteriori estimator, and illustrate the feasibility of the approach on a toy example as well as on a protein-complex hypergraph, where the resulting fuzzy clusters show significant enrichment of distinct gene ontology categories.

Fabian J. Theis

