2017 | Book

Latent Variable Analysis and Signal Separation

13th International Conference, LVA/ICA 2017, Grenoble, France, February 21-23, 2017, Proceedings

Edited by: Petr Tichavský, Massoud Babaie-Zadeh, Olivier J.J. Michel, Nadège Thirion-Moreau

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science

About this book

This book constitutes the proceedings of the 13th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2017, held in Grenoble, France, in February 2017. The 53 papers presented in this volume were carefully reviewed and selected from 60 submissions. They were organized in topical sections named: tensor approaches; from source positions to room properties: learning methods for audio scene geometry estimation; tensors and audio; audio signal processing; theoretical developments; physics and bio signal processing; latent variable analysis in observation sciences; ICA theory and applications; and sparsity-aware signal processing.

Table of Contents

Frontmatter

Tensor Approaches

Frontmatter
Higher-Order Block Term Decomposition for Spatially Folded fMRI Data

The growing use of neuroimaging technologies generates a massive amount of biomedical data that exhibit high dimensionality. Tensor-based analysis of brain imaging data has been proved quite effective in exploiting their multiway nature. The advantages of tensorial methods over matrix-based approaches have also been demonstrated in the context of functional magnetic resonance imaging (fMRI) data analysis. However, such methods can become ineffective in demanding scenarios, involving, e.g., strong noise and/or significant overlapping of activated regions. This paper aims at investigating the possible gains that can be obtained from a better exploitation of the spatial dimension, through a higher (than 3)-order tensor modeling of the fMRI signals. In this context, a higher-order Block Term Decomposition (BTD) is applied, for the first time in fMRI analysis. Its effectiveness in handling strong instances of noise is demonstrated via extensive simulation results.

Christos Chatzichristos, Eleftherios Kofidis, Yiannis Kopsinis, Manuel Morante Moreno, Sergios Theodoridis
Modeling Parallel Wiener-Hammerstein Systems Using Tensor Decomposition of Volterra Kernels

Providing flexibility and user-interpretability in nonlinear system identification can be achieved by means of block-oriented methods. One such block-oriented system structure is the parallel Wiener-Hammerstein system, which is a sum of Wiener-Hammerstein branches, consisting of static nonlinearities sandwiched between linear dynamical blocks. Parallel Wiener-Hammerstein models have more descriptive power than their single-branch counterparts, but their identification is a non-trivial task that requires tailored system identification methods. In this work, we tackle the identification problem by performing a tensor decomposition of the Volterra kernels obtained from the nonlinear system. We illustrate how the parallel Wiener-Hammerstein block-structure gives rise to a joint tensor decomposition of the Volterra kernels with block-circulant structured factors. The combination of Volterra kernels and tensor methods is a fruitful way to tackle the parallel Wiener-Hammerstein system identification task. In simulation experiments, we were able to reconstruct the underlying blocks very accurately under noisy conditions.

Philippe Dreesen, David T. Westwick, Johan Schoukens, Mariya Ishteva
Fast Nonnegative Matrix Factorization and Completion Using Nesterov Iterations

In this paper, we aim to extend Nonnegative Matrix Factorization with Nesterov iterations (Ne-NMF)—well-suited to large-scale problems—to the situation when some entries are missing in the observed matrix. In particular, we investigate the Weighted and Expectation-Maximization strategies which both provide a way to process missing data. We derive their associated extensions named W-NeNMF and EM-W-NeNMF, respectively. The proposed approaches are then tested on simulated nonnegative low-rank matrix completion problems where the EM-W-NeNMF is shown to outperform state-of-the-art methods and the W-NeNMF technique.

Clément Dorffer, Matthieu Puigt, Gilles Delmaire, Gilles Roussel
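
The weighted strategy mentioned in the abstract above can be illustrated with a minimal sketch. It uses classical multiplicative updates on a masked Frobenius cost, not the Nesterov-accelerated solver of the paper; the function names, the toy matrix, and the binary mask are assumptions made for illustration only.

```python
import random

def matmul(A, B):
    # Plain matrix product for lists of lists.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def hadamard(A, B):
    # Entry-wise product.
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def masked_cost(V, M, W, H):
    # Squared error counted on observed entries only (M[i][j] == 1).
    WH = matmul(W, H)
    return sum(m * (v - x) ** 2
               for rv, rm, rx in zip(V, M, WH)
               for v, m, x in zip(rv, rm, rx))

def weighted_nmf(V, M, rank, n_iter=200, eps=1e-12, seed=0):
    # Weighted multiplicative updates (illustrative stand-in, NOT Ne-NMF).
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    MV = hadamard(M, V)
    for _ in range(n_iter):
        Wt = transpose(W)
        num = matmul(Wt, MV)
        den = matmul(Wt, hadamard(M, matmul(W, H)))
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        Ht = transpose(H)
        num = matmul(MV, Ht)
        den = matmul(hadamard(M, matmul(W, H)), Ht)
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

# Toy rank-1 matrix; entry (0, 0) is treated as missing (true value 1.0).
true_w, true_h = [1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0]
V = [[wi * hj for hj in true_h] for wi in true_w]
M = [[1.0] * 4 for _ in range(3)]
V[0][0], M[0][0] = 0.0, 0.0

W, H = weighted_nmf(V, M, rank=1)
completed = matmul(W, H)[0][0]  # estimate of the missing entry
```

The mask simply removes missing entries from both the numerator and the denominator of each update, so the unobserved value never influences the factors. The EM variant described in the abstract would instead fill the missing entries with the current estimate before each factorization step.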
Blind Source Separation of Single Channel Mixture Using Tensorization and Tensor Diagonalization

This paper deals with the estimation of structured signals such as damped sinusoids, exponentials, polynomials, and their products from single-channel data. It is shown that building tensors from this kind of data results in tensors with a hidden block structure, which can be recovered through tensor diagonalization. Tensor diagonalization means multiplying a tensor by several matrices along its modes so that the outcome is an approximately diagonal or block-diagonal third-order tensor. The proposed method can be applied to the estimation of the parameters of multiple damped sinusoids and of their products with polynomials.

Anh-Huy Phan, Petr Tichavský, Andrzej Cichocki
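
As a rough illustration of why structured signals lead to structured arrays, the sketch below builds a Hankel matrix (a two-way analogue of tensorization, chosen here for simplicity and not the construction used in the paper) from one damped cosine. Because a damped cosine is a sum of two complex exponentials, the Hankel matrix has rank 2, so every 3×3 determinant vanishes. The signal parameters and helper names are hypothetical.

```python
import math

def hankel(x, rows, cols):
    # Hankel matrix with entries H[i][j] = x[i + j].
    return [[x[i + j] for j in range(cols)] for i in range(rows)]

def det3(m):
    # Determinant of a 3x3 matrix by cofactor expansion.
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

rho, omega = 0.9, 0.7  # assumed damping factor and angular frequency
signal = [rho ** n * math.cos(omega * n) for n in range(5)]

H = hankel(signal, 3, 3)
minor2 = H[0][0] * H[1][1] - H[0][1] * H[1][0]  # nonzero: rank is at least 2
det_H = det3(H)                                  # numerically zero: rank is below 3
```

With several damped sinusoids the rank grows by two per component, which is the kind of low-rank structure a decomposition can then exploit.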
High-Resolution Subspace-Based Methods: Eigenvalue- or Eigenvector-Based Estimation?

In subspace-based methods for multidimensional harmonic retrieval, the modes can be estimated either from eigenvalues or from eigenvectors. The purpose of this study is to find out which approach is better. We compare the state-of-the-art methods N-D ESPRIT and IMDF, propose a modification of IMDF based on a least-squares criterion, and derive expressions of the first-order perturbations for these methods. The theoretical expressions are confirmed by computer experiments.

Konstantin Usevich, Souleymen Sahnoun, Pierre Comon

From Source Positions to Room Properties: Learning Methods for Audio Scene Geometry Estimation

Frontmatter
Speaker Tracking on Multiple-Manifolds with Distributed Microphones

Speaker tracking in a reverberant enclosure with an ad hoc network of multiple distributed microphones is addressed in this paper. A set of prerecorded measurements in the enclosure of interest is used to construct a data-driven statistical model. The function mapping the measurement-based features to the corresponding source position represents complex unknown relations, hence it is modelled as a random Gaussian process. The process is defined by a covariance function which encapsulates the relations among the available measurements and the different views presented by the distributed microphones. This model is intertwined with a Kalman filter to capture both the smoothness of the source movement in the time-domain and the smoothness with respect to patterns identified in the set of available prerecorded measurements. Simulation results demonstrate the ability of the proposed method to localize a moving source in reverberant conditions.

Bracha Laufer-Goldshtein, Ronen Talmon, Sharon Gannot
VAST: The Virtual Acoustic Space Traveler Dataset

This paper introduces a new paradigm for sound source localization referred to as virtual acoustic space traveling (VAST) and presents a first dataset designed for this purpose. Existing sound source localization methods are either based on an approximate physical model (physics-driven) or on a specific-purpose calibration set (data-driven). With VAST, the idea is to learn a mapping from audio features to desired audio properties using a massive dataset of simulated room impulse responses. This virtual dataset is designed to be maximally representative of the potential audio scenes that the considered system may be evolving in, while remaining reasonably compact. We show that virtually-learned mappings on this dataset generalize to real data, overcoming some intrinsic limitations of traditional binaural sound localization methods based on time differences of arrival.

Clément Gaultier, Saurabh Kataria, Antoine Deleforge
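
The abstract above does not say which limitations of time-difference-of-arrival (TDOA) methods are meant. One well-known example is front-back ambiguity, sketched below under a free-field, far-field, two-microphone model. The microphone spacing and the function name are assumptions for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature

def far_field_tdoa(azimuth_deg, mic_distance=0.18):
    # Azimuth is measured from the frontal direction; the two microphones
    # sit on the left-right axis, mic_distance metres apart (assumed value).
    return mic_distance * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND

front = far_field_tdoa(30.0)   # source in front, 30 degrees to one side
back = far_field_tdoa(150.0)   # mirrored source behind the microphones
```

Both directions yield the same delay, so a TDOA alone cannot tell them apart. A mapping learned from richer audio features is one way to resolve such cases.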
Sketching for Nearfield Acoustic Imaging of Heavy-Tailed Sources

We propose a probabilistic model for acoustic source localization with known but arbitrary geometry of the microphone array. The approach has several features. First, it relies on a simple nearfield acoustic model for wave propagation. Second, it does not require the number of active sources to be known. Instead, it produces a heat map representing the energy of a large set of candidate locations, thus imaging the acoustic field. Third, it relies on a heavy-tail $$\alpha $$-stable probabilistic model, whose most important feature is to yield an estimation strategy where the multichannel signals need to be processed only once in a simple online procedure, called sketching. This sketching produces a fixed-size representation of the data that is then analyzed for localization. The resulting algorithm has a small computational complexity, and in this paper we demonstrate that it compares favorably with the state of the art for localization in realistic simulations of reverberant environments.

Mathieu Fontaine, Charles Vanwynsberghe, Antoine Liutkus, Roland Badeau
Acoustic DoA Estimation by One Unsophisticated Sensor

We show how introducing known scattering can be used in direction of arrival estimation by a single sensor. We first present an analysis of the geometry of the underlying measurement space and show how it enables localizing white sources. Then, we extend the solution to more challenging non-white sources like speech by including a source model and considering convex relaxations with group sparsity penalties. We conclude with numerical simulations using an unsophisticated sensing device to validate the theory.

Dalia El Badawy, Ivan Dokmanić, Martin Vetterli
Acoustic Source Localization by Combination of Supervised Direction-of-Arrival Estimation with Disjoint Component Analysis

Analysis and processing in reverberant, multi-source acoustic environments encompasses a multitude of techniques that estimate from sensor signals a spatially resolved “image” of acoustic space, a high-level representation of physical sources that consolidates several source components into a single sound object, and the estimation of filter parameters that would permit enhancement of target and attenuation of interfering signal components. The contribution of the present manuscript is the introduction of a combination of different algorithms from the field of supervised learning, unsupervised subspace decomposition and multi-channel signal enhancement to accomplish these goals. Specifically, we propose a system that (1) uses a bank of trained support vector machine classifiers to estimate source activity probability for each spatial position and (2) employs disjoint component analysis (DCA) to obtain from this probabilistic spatial source activity map those components that pertain to individual sound objects. We conclude with a brief outline for (3) estimation of multi-channel filter parameters based on DCA components in order to perform target source enhancement. We illustrate the proposed method with decomposition results obtained with a four-channel hearing aid geometry setup that comprises two localized sources plus isotropic background noise in an anechoic environment.

Jörn Anemüller, Hendrik Kayser

Tensors and Audio

Frontmatter
An Initialization Method for Nonlinear Model Reduction Using the CP Decomposition

Every parametric model lies on the trade-off line between accuracy and interpretability. Increasing the interpretability of a model, while keeping the accuracy as good as possible, is of great importance for every existing model today. Currently, some nonlinear models in the field of block-oriented modeling are hard to interpret, and need to be simplified. Therefore, we designed a model-reduction technique based on the Canonical Polyadic tensor Decomposition, which can be used for a special type of static nonlinear multiple-input-multiple-output models. We analyzed how the quality of the model varies as the model order is reduced. This paper introduces a special initialization and compares it with a randomly chosen initialization point. Using the method based on tensor decompositions ensures smaller errors than when using the brute-force optimization method. The resulting simplified model is thus able to keep its accuracy as high as possible.

Gabriel Hollander, Philippe Dreesen, Mariya Ishteva, Johan Schoukens
Audio Zoom for Smartphones Based on Multiple Adaptive Beamformers

Some recent smartphones offer the so-called audio zoom feature, which focuses sound capture in the front direction while progressively attenuating surrounding sounds along with the video zoom. This paper proposes a complete implementation of such a function involving two major steps. First, the targeted sound source is extracted by a novel approach that combines multiple adaptive beamformers having different look directions with a post-processing algorithm. Second, a spatial zooming effect is created by leveraging the microphone signals and the enhanced target source. A subjective test with real-world audio recordings, using a mock-up simulating a usual smartphone shape, confirms the rich user experience obtained by the proposed system.

Ngoc Q. K. Duong, Pierre Berthet, Sidkièta Zabre, Michel Kerdranvat, Alexey Ozerov, Louis Chevallier
Complex Valued Robust Multidimensional SOBI

Complex valued random variables and time series are common in various applications, for example in wireless communications, radar applications and magnetic resonance imaging. These applications often involve the famous blind source separation problem. However, the observations rarely fully follow specific models and robust methods that allow deviations from the model assumptions and endure outliers are required. We propose a new algorithm, robust multidimensional eSAM-SOBI, for complex valued blind source separation. The algorithm takes into account possible multidimensional spatial or temporal dependencies, whereas traditional SOBI-like procedures only consider dependencies in a single direction. In applications like functional magnetic resonance imaging, the dependencies are indeed not only one-dimensional. We provide a simulation study with complex valued data to illustrate the better performance of the methods that utilize multidimensional autocovariance in the presence of two-dimensional dependency. Moreover, we also examine the performance of the multidimensional eSAM-SOBI in the presence of outliers.

Niko Lietzén, Klaus Nordhausen, Pauliina Ilmonen
Ego Noise Reduction for Hose-Shaped Rescue Robot Combining Independent Low-Rank Matrix Analysis and Multichannel Noise Cancellation

In this paper, we present an ego noise reduction method for a hose-shaped rescue robot, developed for search and rescue operations in large-scale disasters. It is used to search for victims in disaster sites by capturing their voices with its microphone array. However, ego noises are mixed with voices, and it is difficult to differentiate them from a call for help from a disaster victim. To solve this problem, we here propose a two-step noise reduction method involving the following: (1) the estimation of both speech and ego noise signals from observed multichannel signals by multichannel nonnegative matrix factorization (NMF) with the rank-1 spatial constraint, and (2) the application of multichannel noise cancellation to the estimated speech signal using reference signals. Our evaluations show that this approach is effective for suppressing ego noise.

Narumi Mae, Masaru Ishimura, Shoji Makino, Daichi Kitamura, Nobutaka Ono, Takeshi Yamada, Hiroshi Saruwatari
Some Theory on Non-negative Tucker Decomposition

Some theoretical difficulties that arise from dimensionality reduction for tensors with non-negative coefficients are discussed in this paper. A necessary and sufficient condition is derived for a low non-negative rank tensor to admit a non-negative Tucker decomposition with a core of the same non-negative rank. Moreover, we provide evidence that the only algorithm operating mode-wise, minimizing the dimensions of the feature spaces, and that can guarantee the non-negative core to have low non-negative rank requires identifying on each mode a cone with possibly a very large number of extreme rays. To illustrate our observations, some existing algorithms that compute the non-negative Tucker decomposition are described and tested on synthetic data.

Jeremy E. Cohen, Pierre Comon, Nicolas Gillis
A New Algorithm for Multimodal Soft Coupling

In this paper, the problem of multimodal soft coupling under the Bayesian framework is investigated for the case where the variance of the probabilistic model is unknown. The similarity of the shared factors resulting from Nonnegative Matrix Factorization (NMF) of multimodal data sets is controlled in a soft manner by using a probabilistic model. In previous works, it is assumed that the probabilistic model and its parameters are known. However, this assumption does not always hold. In this paper, it is assumed that the probabilistic model is known but its variance is not. The proposed algorithm therefore estimates the variance of the probabilistic model along with the other parameters during the factorization procedure. Simulation results with synthetic data confirm the effectiveness of the proposed algorithm.

Farnaz Sedighin, Massoud Babaie-Zadeh, Bertrand Rivet, Christian Jutten
Adaptive Blind Separation of Instantaneous Linear Mixtures of Independent Sources

In many applications, there is a need to blindly separate independent sources from their linear instantaneous mixtures while the mixing matrix or source properties are slowly or abruptly changing in time. The easiest way to separate the data is to repeat an off-line estimation of the model parameters in a time-shifting window. Another popular method is the stochastic natural gradient algorithm, which relies on non-Gaussianity of the separated signals and is adaptive by its nature. In this paper, we propose adaptive versions of two blind source separation algorithms which exploit non-stationarity of the original signals. The results indicate that the proposed algorithms slightly outperform the natural gradient in the trade-off between the algorithm’s ability to quickly adapt to changes in the mixing matrix and the variance of the estimate when the mixing is stationary.

Ondřej Šembera, Petr Tichavský, Zbyněk Koldovský
Source Separation, Dereverberation and Noise Reduction Using LCMV Beamformer and Postfilter

The problem of source separation, dereverberation and noise reduction using a microphone array is addressed in this paper. The observed speech is modeled by two components, namely the early speech (including the direct path and some early reflections) and the late reverberation. The minimum mean square error (MMSE) estimator of the early speech components of the various speakers is derived, which jointly suppresses the noise and the overall reverberation from all speakers. The overall time-varying level of the reverberation is estimated using two different estimators, an estimator based on a temporal model and an estimator based on a spatial model. The experimental study consists of measured acoustic transfer functions (ATFs) and directional noise with various signal-to-noise ratio levels. The separation, dereverberation and noise reduction performance is examined in terms of perceptual evaluation of speech quality (PESQ) and signal-to-interference plus noise ratio improvement.

Ofer Schwartz, Sebastian Braun, Sharon Gannot, Emanuël A. P. Habets
Toward Rank Disaggregation: An Approach Based on Linear Programming and Latent Variable Analysis

This work presents an unsupervised approach to the problem of rank disaggregation, which can be defined as the task of decomposing a set of rankings provided by different people (or entities). To accomplish this task, we first discuss the problem of rank aggregation and how it can be solved via linear programming. Then, we introduce a disaggregation method based on rank aggregation and inspired by decomposition methods such as principal component analysis (PCA). The results are preliminary but may pave the way for a better understanding of relevant features found in applications such as group decision.

Vincent Vigneron, Leonardo Tomazeli Duarte
A Proximal Approach for Nonnegative Tensor Decomposition

This communication deals with N-th order tensor decompositions. More precisely, we are interested in the (Canonical) Polyadic Decomposition. In our case, this problem is formulated under a variational approach where the criterion to be minimized is composed of several terms: one accounting for the fidelity to data and others that can represent not only regularization (such as a sparsity prior) but also hard constraints (such as nonnegativity). The resulting optimization problem is solved by using the Block-Coordinate Variable Metric Forward-Backward (BC-VMFB) algorithm. The robustness and efficiency of the suggested approach are illustrated on realistic synthetic data such as those encountered in the context of environmental data analysis and fluorescence spectroscopy. Our simulations are performed on 4-th order tensors.

Xuan Vu, Caroline Chaux, Nadège Thirion-Moreau, Sylvain Maire
Psychophysical Evaluation of Audio Source Separation Methods

Source separation evaluation is typically a top-down process, starting with perceptual measures which capture fitness-for-purpose and followed by attempts to find physical (objective) measures that are predictive of the perceptual measures. In this paper, we take a contrasting bottom-up approach. We begin with the physical measures provided by the Blind Source Separation Evaluation Toolkit (BSS Eval) and we then look for corresponding perceptual correlates. This approach is known as psychophysics and has the distinct advantage of leading to interpretable, psychophysical models. We obtained perceptual similarity judgments from listeners in two experiments featuring vocal sources within musical mixtures. In the first experiment, listeners compared the overall quality of vocal signals estimated from musical mixtures using a range of competing source separation methods. In a loudness experiment, listeners compared the loudness balance of the competing musical accompaniment and vocal. Our preliminary results provide provisional validation of the psychophysical approach.

Andrew J. R. Simpson, Gerard Roma, Emad M. Grais, Russell D. Mason, Christopher Hummersone, Mark D. Plumbley

Audio Signal Processing

Frontmatter
On the Use of Latent Mixing Filters in Audio Source Separation

In this paper, we consider the underdetermined convolutive audio source separation (UCASS) problem. In the STFT domain, we consider both source signals and mixing filters as latent random variables, and we propose to estimate each source image, i.e. each individual source-filter product, by its posterior mean. Although this is a quite straightforward application of Bayesian estimation theory, to our knowledge there exists no similar study in the UCASS context. In this paper, we discuss the interest of this estimator in this context and compare it with the conventional Wiener filter in a semi-oracle configuration.

Laurent Girin, Roland Badeau
Discriminative Enhancement for Single Channel Audio Source Separation Using Deep Neural Networks

The sources separated by most single channel audio source separation techniques are usually distorted and each separated source contains residual signals from the other sources. To tackle this problem, we propose to enhance the separated sources to decrease the distortion and interference between the separated sources using deep neural networks (DNNs). Two different DNNs are used in this work. The first DNN is used to separate the sources from the mixed signal. The second DNN is used to enhance the separated signals. To consider the interactions between the separated sources, we propose to use a single DNN to enhance all the separated sources together. To reduce the residual signals of one source from the other separated sources (interference), we train the DNN for enhancement discriminatively to maximize the dissimilarity between the predicted sources. The experimental results show that using discriminative enhancement decreases the distortion and interference between the separated sources.

Emad M. Grais, Gerard Roma, Andrew J. R. Simpson, Mark D. Plumbley
Audiovisual Speech Separation Based on Independent Vector Analysis Using a Visual Voice Activity Detector

In this paper, we present a way of improving the Independent Vector Analysis in the context of blind separation of convolutive mixtures of speech signals. The periods of activity and inactivity of one or more speech signals are first detected using a binary visual voice activity detector based on lip movements and then fed into a modified Independent Vector Analysis algorithm to achieve the separation. Presented results show that this approach improves separation and identification of sources in a determined case with a higher convergence rate, and is also able to enhance a specific source in an underdetermined mixture.

Pierre Narvor, Bertrand Rivet, Christian Jutten
Monoaural Audio Source Separation Using Deep Convolutional Neural Networks

In this paper we introduce a low-latency monaural source separation framework using a Convolutional Neural Network (CNN). We use a CNN to estimate time-frequency soft masks which are applied for source separation. We evaluate the performance of the neural network on a database comprising musical mixtures of three instruments: voice, drums, bass as well as other instruments which vary from song to song. The proposed architecture is compared to a Multilayer Perceptron (MLP), achieving on-par results and a significant improvement in processing time. The algorithm was submitted to source separation evaluation campaigns to test efficiency, and achieved competitive results.

Pritish Chandna, Marius Miron, Jordi Janer, Emilia Gómez
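
To make the notion of a time-frequency soft mask concrete, here is a minimal sketch. It assumes that some model (a CNN in the paper; plain numbers here) has produced a non-negative magnitude estimate per source and per time-frequency bin. Function names and values are illustrative, and the exact mask definition used by the authors may differ.

```python
def soft_masks(source_magnitudes, eps=1e-12):
    # source_magnitudes[k][b]: estimated magnitude of source k in bin b.
    # Each mask is that source's share of the summed magnitude in the bin.
    n_bins = len(source_magnitudes[0])
    totals = [sum(src[b] for src in source_magnitudes) + eps
              for b in range(n_bins)]
    return [[src[b] / totals[b] for b in range(n_bins)]
            for src in source_magnitudes]

def apply_masks(masks, mixture):
    # Multiply each mask with the mixture, bin by bin.
    return [[m * x for m, x in zip(mask, mixture)] for mask in masks]

# Two hypothetical sources, three time-frequency bins.
estimated = [[3.0, 0.5, 0.0],
             [1.0, 0.5, 2.0]]
mixture = [4.2, 1.1, 1.8]  # mixture values in the same bins

masks = soft_masks(estimated)
separated = apply_masks(masks, mixture)
```

Because the masks sum to one in every bin, the separated signals add back up to the mixture, so errors in the magnitude estimates redistribute energy between sources rather than creating or removing it.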

Theoretical Developments

Frontmatter
On the Behaviour of the Estimated Fourth-Order Cumulants Matrix of a High-Dimensional Gaussian White Noise

This paper is devoted to the study of the traditional estimator of the fourth-order cumulants matrix of a high-dimensional multivariate Gaussian white noise. If M represents the dimension of the noise and N the number of available observations, it is first established that this $$M^{2} \times M^{2}$$ matrix converges towards 0 in the spectral norm sense provided $$\frac{M^{2}\,\log N}{N} \rightarrow 0$$. The behaviour of the estimated fourth-order cumulants matrix is then evaluated in the asymptotic regime where M and N converge towards $$+\infty $$ in such a way that $$\frac{M^{2}}{N}$$ converges towards a constant. In this context, it is proved that the matrix does not converge towards 0 in the spectral norm sense, and that its empirical eigenvalue distribution converges towards a shifted Marcenko-Pastur distribution. It is finally claimed that the largest and the smallest eigenvalues of the cumulant matrix converge almost surely towards the right and left end points of the support of the Marcenko-Pastur distribution.

Pierre Gouédard, Philippe Loubaton
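
The starting point of the abstract above is that fourth-order cumulants of Gaussian variables are zero, so any non-zero estimate is pure estimation noise. The sketch below checks this in the simplest scalar, real-valued case (M = 1), using the plug-in estimator m4 - 3·m2². This is a simplification: the paper studies the full $$M^{2} \times M^{2}$$ matrix, and the sample size and seed here are arbitrary assumptions.

```python
import random

def sample_fourth_cumulant(x):
    # Plug-in estimate of the fourth-order cumulant of a real scalar sample:
    # kappa_4 = m4 - 3 * m2**2, with central sample moments m2 and m4.
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    m2 = sum(v ** 2 for v in centered) / n
    m4 = sum(v ** 4 for v in centered) / n
    return m4 - 3.0 * m2 ** 2

rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(50000)]
uniform = [rng.uniform(-1.0, 1.0) for _ in range(50000)]

k_gauss = sample_fourth_cumulant(gaussian)  # close to 0 for Gaussian data
k_unif = sample_fourth_cumulant(uniform)    # close to -2/15 for uniform on [-1, 1]
```

The uniform sample is included only as a contrast: its theoretical fourth-order cumulant is 1/5 - 3·(1/3)² = -2/15, which is why such statistics are useful for detecting non-Gaussian sources.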
Caveats with Stochastic Gradient and Maximum Likelihood Based ICA for EEG

Stochastic gradient (SG) is the most commonly used optimization technique for maximum likelihood based approaches to independent component analysis (ICA). It is in particular the default solver in public implementations of Infomax and variants. Motivated by experimental findings on electroencephalography (EEG) data, we report some caveats which can impact the results and interpretation of neuroscience findings. We investigate issues raised by controlling the step size in gradient updates combined with early stopping conditions, as well as initialization choices which can artificially generate biologically plausible brain sources, so-called dipolar sources. We provide experimental evidence that pushing the convergence of Infomax using non-stochastic solvers can reduce the number of highly dipolar components and provide a mathematical explanation of this fact. Results are presented on public EEG data.

Jair Montoya-Martínez, Jean-François Cardoso, Alexandre Gramfort
Approximate Joint Diagonalization According to the Natural Riemannian Distance

In this paper, we propose for the first time an approximate joint diagonalization (AJD) method based on the natural Riemannian distance of Hermitian positive definite matrices. We turn the AJD problem into an optimization problem with a Riemannian criterion and we develop a framework to optimize it. The originality of this criterion arises from the diagonal form it targets. We compare the performance of our Riemannian criterion to the classical ones based on the Frobenius norm and the log-det divergence, on both simulated data and real electroencephalographic (EEG) signals. Simulated data show that the Riemannian criterion is more accurate and allows faster convergence in terms of iterations. It also performs well on real data, suggesting that this new approach may be useful in other practical applications.

Florent Bouchard, Jérôme Malick, Marco Congedo
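
For readers unfamiliar with the natural Riemannian distance, the sketch below evaluates it in the smallest non-trivial case, real symmetric positive definite 2×2 matrices, using the standard formula: the square root of the sum of squared logarithms of the eigenvalues of A⁻¹B. It illustrates the distance only, not the AJD criterion or the optimization framework of the paper, and the helper names are hypothetical.

```python
import math

def inv2(A):
    # Inverse of a 2x2 matrix.
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def riemannian_distance(A, B):
    # Eigenvalues of inv(A) @ B are real and positive when A and B are SPD;
    # for a 2x2 matrix they follow from its trace and determinant.
    C = matmul2(inv2(A), B)
    tr = C[0][0] + C[1][1]
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    lam1, lam2 = (tr + disc) / 2.0, (tr - disc) / 2.0
    return math.sqrt(math.log(lam1) ** 2 + math.log(lam2) ** 2)

identity = [[1.0, 0.0], [0.0, 1.0]]
D = [[math.e, 0.0], [0.0, math.e ** 2]]
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[3.0, 0.0], [0.0, 1.0]]

d_ID = riemannian_distance(identity, D)  # sqrt(1**2 + 2**2)
d_AB = riemannian_distance(A, B)
```

Unlike the Frobenius norm, this distance is unchanged when both matrices are transformed as G·A·Gᵀ and G·B·Gᵀ with any invertible G, a property that makes it a natural fit for problems defined up to an unknown mixing.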
Gaussian Processes for Source Separation in Overdetermined Bilinear Mixtures

In this work, we consider the nonlinear Blind Source Separation (BSS) problem in the context of overdetermined Bilinear Mixtures, in which a linear structure can be employed for performing separation. Based on the Gaussian Process (GP) framework, two approaches are proposed: the predictive distribution and the maximization of the marginal likelihood. In both cases, separation can be achieved by assuming that the sources are Gaussian and temporally correlated. The results with synthetic data are favorable to the proposal.

Denis G. Fantinato, Leonardo T. Duarte, Bertrand Rivet, Bahram Ehsandoust, Romis Attux, Christian Jutten
Model-Independent Method of Nonlinear Blind Source Separation

Consider a time series of signal measurements x(t), where x has two components. This paper shows how to process the local distributions of measurement velocities in order to construct a two-component mapping, u(x). If the measurements are linear or nonlinear combinations of statistically independent variables, u(x) must be an unmixing function. In other words, the measurement data are separable if and only if $$u_{1}[x(t)]$$ and $$u_{2}[x(t)]$$ are statistically independent of one another. The method is analytic, constructive, and model-independent. It is illustrated by blindly recovering the separate utterances of two speakers from nonlinear combinations of their waveforms.

David N. Levin

Physics and Bio Signal Processing

Frontmatter
The 2016 Signal Separation Evaluation Campaign

In this paper, we report the results of the 2016 community-based Signal Separation Evaluation Campaign (SiSEC 2016). This edition comprises four tasks. Three focus on the separation of speech and music audio recordings, while one concerns biomedical signals. We summarize these tasks and the performance of the submitted systems, as well as provide a small discussion concerning future trends of SiSEC.

Antoine Liutkus, Fabian-Robert Stöter, Zafar Rafii, Daichi Kitamura, Bertrand Rivet, Nobutaka Ito, Nobutaka Ono, Julie Fontecave
Multimodality for Rainfall Measurement

The need for accurate monitoring of rainfall, essential for many fields such as hydrology, transportation and agriculture, calls for optimal use of all available resources. However, as the existing monitoring equipment is diverse, and different tools provide measurements of different nature, fusing these measurements is a challenging task. At one extreme, rain gauges provide local, direct measurements of the accumulated rainfall, and at the other end, satellite observations provide remote images of clouds, from which rainfall is estimated. In between, weather radar measures reflectivity which is non-linearly related to rainfall. In light of the new opportunities introduced by the use of physical measurements from cellular communication networks for rainfall monitoring, I first review the approaches for fusion of different direct and indirect rainfall measurements, distinguishing it from data assimilation, widely used in meteorology. I will then suggest a unified approach to the problem, combining parametric and non-parametric tools, and will present preliminary results.

Hagit Messer
Particle Flow SMC-PHD Filter for Audio-Visual Multi-speaker Tracking

Sequential Monte Carlo probability hypothesis density (SMC-PHD) filtering has recently been exploited for audio-visual (AV) based tracking of multiple speakers, where audio data are used to inform the particle distribution and propagation in the visual SMC-PHD filter. However, the performance of the AV-SMC-PHD filter can be affected by the mismatch between the proposal and the posterior distribution. In this paper, we present a new method to improve the particle distribution where audio information (i.e. DOA angles derived from microphone array measurements) is used to detect newborn particles and visual information (i.e. histograms) is used to modify the particles with particle flow (PF). Using particle flow has the benefit of migrating particles smoothly from the prior to the posterior distribution. We compare the proposed algorithm with the baseline AV-SMC-PHD algorithm using experiments on the AV16.3 dataset with multi-speaker sequences.

Yang Liu, Wenwu Wang, Jonathon Chambers, Volkan Kilic, Adrian Hilton

Latent Variable Analysis in Observation Sciences

Frontmatter
Estimation of the Intrinsic Dimensionality in Hyperspectral Imagery via the Hubness Phenomenon

As hyperspectral images are high-dimensional data sets containing a lot of redundancy, a first important step in many applications such as spectral unmixing or dimensionality reduction is estimation of the intrinsic dimensionality of the data set. We present a new method for estimation of the intrinsic dimensionality in hyperspectral images based upon the hubness phenomenon, which is the observation that indegree distributions in a K-nearest neighbor graph will become skewed as the intrinsic dimensionality of the data set rises. The proposed technique is based upon comparing the indegree distributions of artificially generated data sets with the one from the target data set, and identifying the best match with some histogram metric. We show that this method obtains superior results compared to many alternatives, and does not suffer from the effects of interband and spectral correlations.

Rob Heylen, Mario Parente, Paul Scheunders
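
The hubness effect described in the abstract above can be illustrated with a minimal sketch. This is not the authors' estimator: it only computes the indegree (k-occurrence) distribution of a K-nearest neighbor graph and its skewness for synthetic Gaussian clouds. The function names, the sample size, K = 5, and the use of skewness as the summary statistic are all assumptions made for illustration.

```python
import random


def knn_indegrees(points, k):
    """Indegree of each point in the directed k-nearest-neighbor graph."""
    n = len(points)
    indeg = [0] * n
    for i in range(n):
        # Squared Euclidean distances from point i to every other point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n) if j != i
        )
        # Each of the k closest points receives one incoming edge.
        for _, j in dists[:k]:
            indeg[j] += 1
    return indeg


def skewness(values):
    """Sample skewness (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def gaussian_cloud(n, dim, seed):
    """Hypothetical test data: n i.i.d. standard Gaussian points."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


# Indegree distributions for a low- and a high-dimensional cloud.
low = knn_indegrees(gaussian_cloud(200, 2, seed=0), k=5)
high = knn_indegrees(gaussian_cloud(200, 50, seed=0), k=5)
print("skewness, dim=2: ", skewness(low))
print("skewness, dim=50:", skewness(high))
```

Comparing such indegree histograms for artificial data of known dimensionality against the histogram of the target data is the matching idea the abstract refers to; the histogram metric itself is not specified there and is left out here.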
A Blind Identification and Source Separation Method Based on Subspace Intersections for Hyperspectral Astrophysical Data

This paper presents a geometric method for solving the Blind Source Separation problem. The method is based on a weak sparsity assumption: for each source, there should exist at least one pair of zones that share only this source. The process consists first in finding the pairs of zones sharing a unique source with an original geometric approach. Each pair of zones, having a one-dimensional intersection, yields an estimate of a column of the mixing matrix up to a scale factor. All intersections are identified by Singular Value Decomposition. The intersections corresponding to the same column of the mixing matrix are then grouped by a clustering algorithm so as to derive a single estimate of each column. The sources are finally reconstructed from the observed vectors and mixing parameters with a least-squares algorithm. Various tests on synthetic and real hyperspectral astrophysical data illustrate the efficiency of this approach.

Axel Boulais, Yannick Deville, Olivier Berné
Estimating the Number of Endmembers to Use in Spectral Unmixing of Hyperspectral Data with Collaborative Sparsity

Spectral Unmixing (SU) in hyperspectral remote sensing aims at recovering the signatures of the pure materials in the scene (endmembers) and their abundances in each pixel of the image. The usual SU chain does not take spectral variability (SV) into account, and relies on the estimation of the Intrinsic Dimensionality (ID) of the data, related to the number of endmembers to use. However, the ID can be significantly overestimated in difficult scenarios, and sometimes does not correspond to the desired scale- and application-dependent number of endmembers. Spurious endmembers are then frequently included in the model. We propose an algorithm for SU incorporating SV, using collaborative sparsity to discard the least explicative endmembers in the whole image. We compute an algorithmic regularization path for this problem to select the optimal set of endmembers using a statistical criterion. Results on simulated and real data show the interest of the approach.

Lucas Drumetz, Guillaume Tochon, Jocelyn Chanussot, Christian Jutten
Sharpening Hyperspectral Images Using Plug-and-Play Priors

This paper addresses the problem of fusing hyperspectral (HS) images of low spatial resolution and multispectral (MS) images of high spatial resolution into images of high spatial and spectral resolution. By assuming that the target image lives in a low dimensional subspace, the problem is formulated with respect to the latent representation coefficients. Our major contributions are: (i) using patch-based spatial priors, learned from the MS image, for the latent images of coefficients; (ii) exploiting the so-called plug-and-play approach, wherein a state-of-the-art denoiser is plugged into the iterations of a variable splitting algorithm.

Afonso Teodoro, José Bioucas-Dias, Mário Figueiredo
On Extracting the Cosmic Microwave Background from Multi-channel Measurements

Extracting a sky map of the Cosmic Microwave Background (CMB) from multi-channel measurements can be seen as a component separation problem in a special context: only one component is of interest (the CMB) and its column in the mixing matrix and its probability distribution are known with high accuracy. The purpose of this paper is not to present a new algorithm but rather to discuss, on a purely theoretical basis, the impact of the statistical modeling of the components in a simple case. To do so, we analyze a model of noise-free CMB observations contaminated by coherent components. We show that the maximum likelihood estimate of the CMB in this model does not depend on the model of the contamination.

Jean-François Cardoso

ICA Theory and Applications

Frontmatter
Kernel-Based NPLS for Continuous Trajectory Decoding from ECoG Data for BCI Applications

In this paper, nonlinearity is introduced to linear neural activity decoders to improve continuous hand trajectory prediction for Brain-Computer Interface systems. For decoding the high-dimensional data-tensor, a kernel regression was coupled with multilinear PLS (NPLS). Two ways to introduce nonlinearity were studied: a generalized linear model with kernel link function and kernel regression in the NPLS latent variables space (inside or outside the NPLS iterations). The efficiency of these approaches was tested on the publicly available database of the simultaneous recordings of three-dimensional hand trajectories and epidural electrocorticogram (ECoG) signals of a Japanese macaque. Compared to linear methods, nonlinearity did not significantly improve the prediction accuracy but did significantly improve the smoothness of the prediction.

Sarah Engel, Tetiana Aksenova, Andrey Eliseyev
On the Optimal Non-linearities for Gaussian Mixtures in FastICA

In independent component analysis we assume that the observed vector is a linear transformation of a latent vector of independent components, our objective being the estimation of the latter. Deflation-based FastICA estimates the components one-by-one by repeatedly maximizing the expected value of some function measuring non-Gaussianity, the derivative of which is called the non-linearity. Under some weak assumptions, the asymptotically optimal non-linearity for extracting sources with a specific density is given by the location score function of the density. In this paper we look into the consequences of this result from the viewpoint of estimating Gaussian location and scale mixtures. As one of our results we justify the common use of hyperbolic tangent, tanh, as a non-linearity in blind clustering by showing that it is optimal for estimating certain Gaussian mixtures. Finally, simulations are used to show that the asymptotic optimality results hold in various settings also for finite samples.

Joni Virta, Klaus Nordhausen
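
As a rough illustration of the role the non-linearity plays in the abstract above, the following sketch runs the standard one-unit FastICA fixed-point update with g = tanh on a toy two-dimensional mixture. It is a generic textbook form of the update, not the authors' code or their optimality analysis. The toy data (two unit-variance uniform sources mixed by a rotation, so that the mixture is approximately white), the function name, and the iteration count are assumptions.

```python
import math
import random


def fastica_one_unit(samples, w, n_iter=50):
    """One-unit FastICA fixed-point iterations with g = tanh.

    `samples` is a list of whitened 2-D observations; `w` is the initial
    unit-norm direction. Each step computes
        w_new = E[x g(w.x)] - E[g'(w.x)] w
    and then renormalizes w_new to unit length.
    """
    n = len(samples)
    for _ in range(n_iter):
        acc = [0.0, 0.0]
        mean_gprime = 0.0
        for x in samples:
            y = w[0] * x[0] + w[1] * x[1]
            g = math.tanh(y)
            mean_gprime += 1.0 - g * g  # derivative of tanh
            acc[0] += x[0] * g
            acc[1] += x[1] * g
        mean_gprime /= n
        new_w = [acc[i] / n - mean_gprime * w[i] for i in range(2)]
        norm = math.hypot(new_w[0], new_w[1])
        w = [c / norm for c in new_w]
    return w


# Hypothetical toy data: independent uniform sources with unit variance,
# mixed by a rotation so that the observations stay (approximately) white.
rng = random.Random(0)
half_width = math.sqrt(3.0)
theta = 0.5
c, s = math.cos(theta), math.sin(theta)
mixed = []
for _ in range(2000):
    s1 = rng.uniform(-half_width, half_width)
    s2 = rng.uniform(-half_width, half_width)
    mixed.append((c * s1 - s * s2, s * s1 + c * s2))

w = fastica_one_unit(mixed, [1.0, 0.0])
print("estimated unmixing direction:", w)
```

In a deflation-based scheme, further components would be extracted by repeating the update while keeping each new direction orthogonal to those already found; that step is omitted here.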
Fast Disentanglement-Based Blind Quantum Source Separation and Process Tomography: A Closed-Form Solution Using a Feedback Classical Adapting Structure

We here extend Blind (i.e. unsupervised) Quantum Source Separation and Process Tomography methods. Considering disentanglement-based approaches, we introduce associated optimization algorithms which are much faster than the previous ones, since they reduce the number of source quantum state preparations required for adaptation by a factor of $$10^3$$ typically. This is achieved by unveiling the parametric forms of the optimized cost functions, which allows us to derive a closed-form solution for their optimum.

Yannick Deville, Alain Deville
Blind Separation of Cyclostationary Sources with Common Cyclic Frequencies

We propose a new method for blind source separation of cyclostationary sources, whose cyclic frequencies are unknown and may share one or more common cyclic frequencies. The suggested method exploits the second-order cyclostationarity statistics of observation signals to build a set of matrices which has a particular algebraic structure. We also introduce an automatic point selection procedure for the determination of these matrices to be jointly diagonalized in order to identify the mixing matrix and recover the source signals as a result. The non-unitary joint diagonalization is ensured by the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method, which is the most commonly used update strategy for implementing a quasi-Newton technique. Numerical simulations are provided to demonstrate the usefulness of the proposed method in the context of digital communications and to compare it with another method based upon a unitary joint diagonalization algorithm.

Amine Brahmi, Hicham Ghennioui, Christophe Corbier, M’hammed Lahbabi, François Guillet
Adaptation of a Gaussian Mixture Regressor to a New Input Distribution: Extending the C-GMR Framework

This paper addresses the problem of the adaptation of a Gaussian Mixture Regression (GMR) to a new input distribution, using a limited amount of input-only examples. We propose a new model for GMR adaptation, called Joint GMR (J-GMR), that extends the previously published framework of Cascaded GMR (C-GMR). We provide an exact EM training algorithm for the J-GMR. We discuss the merits of the J-GMR with respect to the C-GMR and illustrate its performance with experiments on speech acoustic-to-articulatory inversion.

Laurent Girin, Thomas Hueber, Xavier Alameda-Pineda
Efficient Optimization of the Adaptive ICA Function with Estimating the Number of Non-Gaussian Sources

We propose a new method for efficiently estimating the number of non-Gaussian sources in independent component analysis (ICA). While PCA can find only a few principal components incrementally in the order of significance, ICA has to estimate all the sources after the number of sources has been given in advance. Then, the appropriate number of sources is determined after the estimation if necessary. Here, we use the adaptive ICA function (AIF), which has been derived by using a simple probabilistic model. It has previously been proved that the optimization of AIF with the Gram-Schmidt orthonormalization can find all the sources in descending order of the degree of non-Gaussianity. In this paper, we propose an efficient method for optimizing AIF in the deflation approach by combining fast ICA with stochastic optimization. In addition, we propose a threshold for determining whether an estimated source is Gaussian or not, which is derived by utilizing the Fisher information of the probabilistic model of AIF. By terminating the optimization when the currently estimated source is Gaussian, the number of sources is estimated efficiently. The experimental results on blind image separation problems verify the usefulness of the proposed method.

Yoshitatsu Matsuda, Kazunori Yamaguchi
Feasibility of WiFi Site-Surveying Using Crowdsourced Data

Pedestrian dead reckoning (PDR) trajectories suffer from a significant amount of drift over time, especially when relying on low-cost commercial sensors. For indoor positioning, high-level fusion algorithms refine trajectories using some kind of map information, e.g. WiFi fingerprinting and blueprints. Map availability is then of great concern for efficient use of positioning algorithms in practical situations, and could rely on crowdsourced data, i.e. large quantities of data shared by users. In this paper, crowdsourced data include uncertain estimated positions and noisy RSSI (Received Signal Strength Indicator) measurements in order to estimate the spatial distribution of RSSI levels. Using a simple model for a PDR trajectory, we study how a WiFi map can be derived. Simulation results on a corridor use-case illustrate the approach.

Sylvain Leirens, Christophe Villien, Bruno Flament
On Minimum Entropy Deconvolution of Bi-level Images

Minimum Entropy Deconvolution (MED) is a sparse blind deconvolution method that searches for a deconvolution filter that leads to the sparsest output, assuming that the desired signal is originally sparse. The present work establishes sufficient conditions for the blind deconvolution of sparse images. Then, based on a measure of sparsity given by the ratio of $$L_p$$-norms, we derive a gradient-based algorithm for the blind deconvolution of bi-level images, more specifically, for the blind deconvolution of blurred QR Codes. Finally, simulation results on both synthetic and real data are presented and show that very good results can be achieved with a very simple algorithm.

K. Nose-Filho, A. K. Takahata, R. Suyama, R. Lopes, J. M. T. Romano
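
The sparsity measure mentioned in the abstract above, a ratio of $$L_p$$-norms, can be sketched in a few lines. The abstract does not state which pair of norms is used, so the choice of the $$L_1/L_2$$ ratio below is an assumption, as are the function names and the toy one-dimensional "image row". The sketch only shows that blurring raises the ratio, which is why minimizing it can favor a deblurred output; it does not implement the gradient-based deconvolution itself.

```python
def lp_norm(x, p):
    """L_p norm of a flat sequence of numbers."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def norm_ratio(x, p=1.0, q=2.0):
    """Ratio ||x||_p / ||x||_q; for p < q, smaller values mean sparser x."""
    return lp_norm(x, p) / lp_norm(x, q)


# Hypothetical toy data: one row of a bi-level image and a blurred copy
# obtained by spreading the single bright pixel over its neighbors.
sharp = [0.0, 0.0, 1.0, 0.0, 0.0]
blurred = [0.0, 0.25, 0.5, 0.25, 0.0]

print("sharp:  ", norm_ratio(sharp))
print("blurred:", norm_ratio(blurred))
```

For a vector with a single nonzero entry the ratio equals 1, and for a constant vector of length n it equals the square root of n, which are the two extremes of this particular measure.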
A Joint Second-Order Statistics and Density Matching-Based Approach for Separation of Post-Nonlinear Mixtures

In the context of Post-Nonlinear (PNL) mixtures, source separation can be performed in a two-stage approach, which encompasses a nonlinear and a linear compensation part. In the former part, knowledge of all the source distributions is usually required. In this work, we propose a less restrictive approach, where only one source distribution needs to be known (here, chosen to be a colored Gaussian). The other sources are only required to present a time structure. The method jointly combines the use of second-order statistics (SOS) and the matching of distributions, which proves to be less costly than the classical method of computing the marginal entropy for all sources. The simulation results are favorable to the proposal.

Denis G. Fantinato, Leonardo T. Duarte, Paolo Zanini, Bertrand Rivet, Romis Attux, Christian Jutten
Optimal Measurement Times for Observing a Brownian Motion over a Finite Period Using a Kalman Filter

This article deals with the optimization of the measurement schedule for observing a random process in time using a Kalman filter, when the length of the process is finite and fixed, and a fixed number of measurements are available. The measurement timetable plays a critical role in the accuracy of this estimator. Two different optimality criteria for a timetable (not necessarily regular) are considered: the maximal and the mean variance of the estimator. Both experimental and theoretical methods are used for the problem of minimizing the mean variance. The theoretical methods are based on studying the cost function as a rational function. An analytical formula for the optimal measurement instant is obtained in the case of one measurement. Its properties are studied. An experimental solution is given for a particular case with $$n>1$$ measurements.

Alexandre Aksenov, Pierre-Olivier Amblard, Olivier Michel, Christian Jutten
On Disjoint Component Analysis

Disjoint Component Analysis (DCA) is a recent blind source separation approach which is based on the assumption that the original sources have disjoint supports. In DCA, the recovery process is carried out by maximizing the disjoint support of the estimated sources. In the present work, we provide sufficient conditions for the separation of both disjoint and quasi-disjoint signals. In addition, we propose an effective DCA criterion to evaluate the level of superposition of the recovered sources. The minimization of such criterion is implemented by an algorithm based on Givens rotations. Finally, simulation results are presented in order to assess the performance of the proposed method.

K. Nose-Filho, L. T. Duarte, J. M. T. Romano

Sparsity-Aware Signal Processing

Frontmatter
Accelerated Dictionary Learning for Sparse Signal Representation

Learning sparsifying dictionaries from a set of training signals has been shown to have much better performance than pre-designed dictionaries in many signal processing tasks, including image enhancement. To this aim, numerous practical dictionary learning (DL) algorithms have been proposed over the last decade. This paper introduces an accelerated DL algorithm based on iterative proximal methods. The new algorithm efficiently utilizes the iterative nature of the DL process, and uses accelerated schemes for updating the dictionary and the coefficient matrix. Our numerical experiments on dictionary recovery show that, compared with some well-known DL algorithms, our proposed one has a better convergence rate. It is also able to successfully recover underlying dictionaries for different sparsity and noise levels.

Fateme Ghayem, Mostafa Sadeghi, Massoud Babaie-Zadeh, Christian Jutten
BSS with Corrupted Data in Transformed Domains

Most techniques of Blind Source Separation (BSS) are highly sensitive to the presence of gross errors, although such errors are ubiquitous in many real-world applications. This mandates the development of robust BSS methods, especially to handle the determined case, for which there is currently no strategy able to separate the outliers from the source contributions. We propose a new method which exploits the difference of structural contents that is naturally exhibited by the sources and the outliers in many applications to accurately separate the two contributions. More precisely, we exploit the sparse representations of the signals in two adapted and different dictionaries to estimate jointly the mixing matrix, the sources and the outliers. Preliminary results show the good accuracy of the proposed algorithm in various settings.

Cécile Chenot, Jérôme Bobin
Singing Voice Separation Using RPCA with Weighted $$l_{1}$$-norm

In this paper, we present an extension of robust principal component analysis (RPCA) with weighted $$l_{1}$$-norm minimization for singing voice separation. While the conventional RPCA applies a uniform weight between the low-rank and sparse matrices, we use different weighting parameters for each frequency bin in a spectrogram by estimating the variance ratio between the singing voice and accompaniment. In addition, we incorporate the results of vocal activation detection into the formation of the weighting matrix, and use it in the final decomposition framework. From the experimental results using the DSD100 dataset, we found that the proposed algorithm yields a meaningful improvement in the separation performance compared to the conventional RPCA.

Il-Young Jeong, Kyogu Lee
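
To make the weighted $$l_{1}$$-norm idea in the abstract above concrete, the sketch below shows the elementwise soft-thresholding step that a weighted $$l_{1}$$ penalty typically induces on the sparse part in RPCA-style solvers, with one threshold per frequency bin. This is a simplification under stated assumptions: the function name and toy values are hypothetical, the weights vary only with frequency (the abstract's weighting matrix also uses vocal activity detection), and the low-rank update and the way the authors estimate the weights are not shown.

```python
import math


def weighted_soft_threshold(spectrogram, bin_weights):
    """Shrink each entry toward zero by the weight of its frequency bin.

    `spectrogram` is a list of rows (one row per frequency bin, one column
    per time frame); `bin_weights` holds one non-negative threshold per row.
    This is the proximal operator of the weighted l1 norm.
    """
    return [
        [math.copysign(max(abs(v) - w, 0.0), v) for v in row]
        for row, w in zip(spectrogram, bin_weights)
    ]


# Hypothetical toy residual: two frequency bins, three time frames.
residual = [
    [3.0, -1.0, 0.5],
    [3.0, -1.0, 0.5],
]
# A larger weight on the first bin keeps less of it in the sparse part.
sparse_part = weighted_soft_threshold(residual, [1.0, 0.25])
print(sparse_part)
```

With a uniform weight this reduces to the shrinkage step of conventional RPCA; using a smaller weight for a bin lets more of that bin's energy pass into the sparse (vocal) component.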
Multimodal Approach to Remove Ocular Artifacts from EEG Signals Using Multiple Measurement Vectors

This paper deals with the extraction of eye-movement artifacts from EEG data using a multimodal approach. The gaze signals, recorded by an eye-tracker, share a similar temporal structure with the artifacts induced in EEG recordings by ocular movements. The proposed approach consists in estimating this specific common structure using Multiple Measurement Vectors which is then used to denoise the EEG data. This method can be used on single trial data and can be extended to multitrial data subject to some additional preprocessing. Finally, the proposed method is applied to gaze and EEG experimental data and is compared with some popular algorithms for eye movement artifact correction from the literature.

Victor Maurandi, Bertrand Rivet, Ronald Phlypo, Anne Guérin-Dugué, Christian Jutten
Backmatter
Metadata
Title
Latent Variable Analysis and Signal Separation
Edited by
Petr Tichavský
Massoud Babaie-Zadeh
Olivier J.J. Michel
Nadège Thirion-Moreau
Copyright year
2017
Electronic ISBN
978-3-319-53547-0
Print ISBN
978-3-319-53546-3
DOI
https://doi.org/10.1007/978-3-319-53547-0