
2013 | Book

Matrix Information Geometry

Edited by: Frank Nielsen, Rajendra Bhatia

Publisher: Springer Berlin Heidelberg

About this book

This book presents advances in matrix and tensor data processing in the domain of signal, image and information processing. The theoretical mathematical approaches are discussed in the context of potential applications in sensor and cognitive systems engineering.

The topics and applications include Information Geometry, Differential Geometry of Structured Matrices, Positive Definite Matrices, Covariance Matrices, Sensors (Electromagnetic Fields, Acoustic Sensors) and Applications in Cognitive Systems, in particular Data Mining.

Table of Contents

Frontmatter

State-of-the-art surveys & original matrix theory work:

Chapter 1. Supremum/Infimum and Nonlinear Averaging of Positive Definite Symmetric Matrices
Abstract
Mathematical morphology is a nonlinear image processing methodology based on the computation of the supremum (dilation operator) and infimum (erosion operator) in local neighborhoods called structuring elements. This chapter deals with the definition of supremum and infimum operators for positive definite symmetric (PDS) matrices, which are the basic ingredients for the extension of mathematical morphology to PDS matrix-valued images. The problem is tackled under three different paradigms. Firstly, total orderings using lexicographic cascades of eigenvalues as well as kernelized distances to matrix references are studied. Secondly, by decoupling the shape and the orientation of the ellipsoid associated with each PDS matrix, the supremum and infimum can be obtained by using a marginal supremum/infimum for the eigenvalues and a geometric matrix mean for the orthogonal basis. Thirdly, estimates of the supremum and infimum associated with the Löwner ellipsoids are computed as the asymptotic cases of nonlinear averaging using the original notion of a counter-harmonic mean for PDS matrices. Properties of the three introduced approaches are explored in detail, together with some numerical examples.
Jesús Angulo
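As a point of reference for the third approach, the sketch below implements a counter-harmonic matrix mean, assuming the straightforward lift of the scalar formula (sum of (p+1)-th powers times the inverse of the sum of p-th powers); the chapter's construction for PDS matrices may differ, and the function names are illustrative.
```python
# Hedged sketch: counter-harmonic mean of PDS matrices, assuming the scalar
# definition lifted to matrices via fractional matrix powers.  Large positive
# (resp. negative) p behaves like a pseudo-supremum (resp. pseudo-infimum).
import numpy as np
from scipy.linalg import fractional_matrix_power

def counter_harmonic_mean(mats, p):
    """(sum A_i^{p+1}) (sum A_i^p)^{-1} over a list of PDS matrices."""
    num = sum(np.real(fractional_matrix_power(A, p + 1)) for A in mats)
    den = sum(np.real(fractional_matrix_power(A, p)) for A in mats)
    return num @ np.linalg.inv(den)

rng = np.random.default_rng(0)
def random_spd(d):
    M = rng.standard_normal((d, d))
    return M @ M.T + d * np.eye(d)

A, B = random_spd(3), random_spd(3)
print(counter_harmonic_mean([A, B], p=10))    # pseudo-supremum
print(counter_harmonic_mean([A, B], p=-10))   # pseudo-infimum
```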
Chapter 2. The Riemannian Mean of Positive Matrices
Abstract
The geometric mean of two positive (definite) matrices has long been studied and found useful in problems of operator theory, quantum mechanics and electrical engineering. For more than two positive matrices, the problem of finding an acceptable definition was resolved only recently. Among these definitions, the object variously called the Riemannian mean, the Karcher mean, the least squares mean, and the barycentre mean is particularly attractive because of its intrinsic connections with Riemannian geometry. This mean has also been adopted for applications in image processing, elasticity and other areas. This article is a broad survey of this topic and also reports on some very recent work that is yet to appear in print.
Rajendra Bhatia
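The Riemannian (Karcher) mean surveyed here is commonly computed by a fixed-point/gradient iteration under the affine-invariant metric on SPD matrices. The sketch below is that textbook iteration, not code from the chapter; function names are illustrative.
```python
# Hedged sketch: Karcher (Riemannian) mean of SPD matrices via the usual
# fixed-point iteration in the affine-invariant metric.
import numpy as np
from scipy.linalg import sqrtm, logm, expm, inv

def karcher_mean(mats, iters=50, tol=1e-10):
    X = sum(mats) / len(mats)                  # arithmetic mean as initial guess
    for _ in range(iters):
        Xh = np.real(sqrtm(X))
        Xih = inv(Xh)
        # average of the data pulled back to the tangent space at X
        S = sum(np.real(logm(Xih @ A @ Xih)) for A in mats) / len(mats)
        X = Xh @ expm(S) @ Xh                  # exponential map back to the manifold
        if np.linalg.norm(S) < tol:            # Riemannian gradient vanishes at the mean
            break
    return X
```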
Chapter 3. The Geometry of Low-Rank Kalman Filters
Abstract
An important property of the Kalman filter is that the underlying Riccati flow is a contraction for the natural metric of the cone of symmetric positive definite matrices. The present paper studies the geometry of a low-rank version of the Kalman filter. The underlying Riccati flow evolves on the manifold of fixed-rank symmetric positive semidefinite matrices. Contraction properties of the low-rank flow are studied by means of a suitable metric recently introduced by the authors.
Silvère Bonnabel, Rodolphe Sepulchre
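For context, the sketch below is the standard discrete-time Riccati recursion on covariance matrices, i.e. the full-rank flow whose contraction property the abstract mentions; the chapter's low-rank variant on fixed-rank PSD matrices is not reproduced here, and the matrices chosen are arbitrary examples.
```python
# Hedged sketch: the textbook discrete-time Riccati recursion of the Kalman
# filter, iterated to its fixed point (the steady-state covariance).
import numpy as np

def riccati_step(P, A, C, Q, R):
    """One covariance update: predict, then correct with the Kalman gain."""
    P_pred = A @ P @ A.T + Q
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)
    return P_pred - K @ C @ P_pred

A = np.array([[1.0, 0.1], [0.0, 1.0]])     # toy constant-velocity model
C = np.array([[1.0, 0.0]])                 # position measurement only
Q, R = 0.01 * np.eye(2), np.array([[0.1]])
P = np.eye(2)
for _ in range(200):
    P = riccati_step(P, A, C, Q, R)
print(P)
```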
Chapter 4. KV Cohomology in Information Geometry
Abstract
Statistical structures and some information geometry invariants are discussed from the cohomology point of view. Some comparison criteria for statistical models are studied. The KV anomaly of an algebra structure as well as the Maurer-Cartan polynomial functions of KV complexes are used to discuss linear convexity problems for various kinds of linear connections. Deformation of statistical structures is discussed as well. Throughout the paper, the differential geometry of Hessian manifolds is considered through its KV cohomology.
Michel Nguiffo Boyom, Paul Mirabeau Byande
Chapter 5. Derivatives of Multilinear Functions of Matrices
Abstract
Perturbation or error bounds of functions have been of great interest for a long time. If the functions are differentiable, then the mean value theorem and Taylor's theorem come in handy for this purpose.
Priyanka Grover
Chapter 6. Jensen Divergence-Based Means of SPD Matrices
Abstract
Computing matrix means is becoming more and more important in modern signal processing involving processing of matrix-valued images. In this communication, we define the mean for a set of symmetric positive definite (SPD) matrices with respect to information-theoretic divergences as the unique minimizer of the average divergence, and compare it with the means computed using the Riemannian and Log-Euclidean metrics respectively. For the class of divergences induced by the convexity gap of a matrix functional, we present a fast iterative concave-convex optimization scheme with guaranteed convergence to efficiently approximate those divergence-based means.
Frank Nielsen, Meizhu Liu, Baba C. Vemuri
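One concrete instance of a divergence induced by the convexity gap of a matrix functional is the Jensen (symmetrized Bregman) divergence built from the log-det functional. The sketch below computes that divergence and a simple fixed-point iteration for the induced mean; this is only an illustration under that assumed divergence, and the chapter's divergences and its concave-convex optimization scheme may differ.
```python
# Hedged sketch: Jensen-Bregman LogDet divergence on SPD matrices and a simple
# fixed-point scheme for the associated divergence-based mean.
import numpy as np

def jensen_logdet(X, Y):
    """Convexity gap of -logdet: logdet((X+Y)/2) - (logdet X + logdet Y)/2."""
    _, ld_mid = np.linalg.slogdet((X + Y) / 2)
    _, ld_x = np.linalg.slogdet(X)
    _, ld_y = np.linalg.slogdet(Y)
    return ld_mid - 0.5 * (ld_x + ld_y)

def jensen_logdet_mean(mats, iters=100, tol=1e-10):
    X = sum(mats) / len(mats)
    for _ in range(iters):
        # stationarity condition: X^{-1} = average of ((X + A_i)/2)^{-1}
        X_new = np.linalg.inv(
            sum(np.linalg.inv((X + A) / 2) for A in mats) / len(mats))
        if np.linalg.norm(X_new - X) < tol:
            return X_new
        X = X_new
    return X
```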
Chapter 7. Exponential Barycenters of the Canonical Cartan Connection and Invariant Means on Lie Groups
Abstract
When performing statistics on elements of sets that possess a particular geometric structure, it is desirable to respect this structure. For instance, in a Lie group it would be judicious to have a notion of a mean which is stable under the group operations (composition and inversion). Such a property is ensured for the Riemannian center of mass in Lie groups endowed with a bi-invariant Riemannian metric, like compact Lie groups (e.g. rotations). However, bi-invariant Riemannian metrics do not exist for most non-compact and non-commutative Lie groups. This is the case in particular for rigid-body transformations in any dimension greater than one, which form the simplest Lie group involved in biomedical image registration. In this chapter, we propose to replace the Riemannian metric by an affine connection structure on the group. We show that the canonical Cartan connections of a connected Lie group provide group geodesics which are completely consistent with composition and inversion. With such a non-metric structure, the mean cannot be defined by minimizing the variance as in Riemannian manifolds. However, the characterization of the mean as an exponential barycenter gives us an implicit definition of the mean using a general barycentric equation. Thanks to the properties of the canonical Cartan connection, this mean is naturally bi-invariant. We show the local existence and uniqueness of the invariant mean when the dispersion of the data is small enough. We also propose an iterative fixed point algorithm and demonstrate that the convergence to the invariant mean is at least linear. In the case of rigid-body transformations, we give a simple criterion for the global existence and uniqueness of the bi-invariant mean, which happens to be the same as for rotations. We also give closed forms for the bi-invariant mean in a number of simple but instructive cases, including 2D rigid transformations. For general linear transformations, we show that the bi-invariant mean is a generalization of the (scalar) geometric mean, since the determinant of the bi-invariant mean is the geometric mean of the determinants of the data. Finally, we extend the theory to higher order moments, in particular the covariance, which can be used to define a local bi-invariant Mahalanobis distance.
Xavier Pennec, Vincent Arsigny
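A minimal sketch of the barycentric fixed-point iteration described in the abstract, instantiated for 2D rigid transformations written as 3x3 homogeneous matrices. Function names and parameters are illustrative, and the iteration is only expected to converge for well-concentrated data, as the abstract notes.
```python
# Hedged sketch: fixed-point iteration for the bi-invariant (exponential
# barycenter) mean on a matrix Lie group, here 2D rigid motions in SE(2).
import numpy as np
from scipy.linalg import expm, logm, inv

def se2(theta, tx, ty):
    """Homogeneous 3x3 matrix of a 2D rotation + translation."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]])

def bi_invariant_mean(gs, iters=50, tol=1e-12):
    m = gs[0]
    for _ in range(iters):
        # average of the data in the Lie algebra, as seen from the current mean
        v = sum(logm(inv(m) @ g) for g in gs) / len(gs)
        m = m @ expm(v)
        if np.linalg.norm(v) < tol:     # barycentric equation satisfied
            break
    return np.real(m)

gs = [se2(0.1, 1.0, 0.0), se2(0.3, 0.5, 0.2), se2(-0.1, 0.8, -0.3)]
print(bi_invariant_mean(gs))
```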

Advanced matrix theory for radar processing:

Chapter 8. Medians and Means in Riemannian Geometry: Existence, Uniqueness and Computation
Abstract
This paper is a short summary of our recent work on the medians and means of probability measures in Riemannian manifolds. Firstly, the existence and uniqueness results of local medians are given. In order to compute medians in practical cases, we propose a subgradient algorithm and prove its convergence. After that, Fréchet medians are considered. We prove their statistical consistency and give some quantitative estimations of their robustness with the aid of upper curvature bounds. We also show that, in compact Riemannian manifolds, the Fréchet medians of generic data points are always unique. Stochastic and deterministic algorithms are proposed for computing Riemannian p-means. The rate of convergence and error estimates of these algorithms are also obtained. Finally, we apply the medians and the Riemannian geometry of Toeplitz covariance matrices to radar target detection.
Marc Arnaudon, Frédéric Barbaresco, Le Yang
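The sketch below is a Weiszfeld-like subgradient iteration for the Fréchet median of SPD matrices under the affine-invariant metric, for illustration only; the chapter's algorithms, step-size rules and convergence analysis are more general, and the function names are illustrative.
```python
# Hedged sketch: subgradient (Weiszfeld-type) iteration for the Riemannian
# median of SPD matrices in the affine-invariant geometry.
import numpy as np
from scipy.linalg import sqrtm, logm, expm, inv

def riemannian_median(mats, iters=100, step=1.0, eps=1e-9):
    X = sum(mats) / len(mats)                     # start at the arithmetic mean
    for _ in range(iters):
        Xh = np.real(sqrtm(X))
        Xih = inv(Xh)
        num = np.zeros_like(X)
        denom = 0.0
        for A in mats:
            T = np.real(logm(Xih @ A @ Xih))      # whitened log of A
            d = np.linalg.norm(T)                 # geodesic distance d(X, A)
            if d > eps:                           # skip data points sitting at X
                num += (Xh @ T @ Xh) / d          # Log_X(A) / d
                denom += 1.0 / d
        V = step * num / denom                    # Weiszfeld direction in T_X
        X = Xh @ expm(Xih @ V @ Xih) @ Xh         # Exp_X(V)
    return np.real(X)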
Chapter 9. Information Geometry of Covariance Matrix: Cartan-Siegel Homogeneous Bounded Domains, Mostow/Berger Fibration and Fréchet Median
Abstract
Information Geometry was introduced by Rao, and axiomatized by Chentsov, to define a distance between statistical distributions that is invariant to non-singular parameterization transformations. For Doppler/Array/STAP radar processing, the Information Geometry approach gives a key role to the geometry of homogeneous symmetric bounded domains. For radar, we will observe that the Information Geometry metric can be related to a Kähler metric, given by the Hessian of a Kähler potential (the entropy of the radar signal, given by \(-\log[\det(R)]\)). To take into account the Toeplitz structure of the time/space covariance matrix or the Toeplitz-Block-Toeplitz structure of the space-time covariance matrix, a parameterization known as the partial Iwasawa decomposition can be applied through a complex autoregressive model or a multi-channel autoregressive model. Then, the hyperbolic geometry of the Poincaré unit disk or the symplectic geometry of the Siegel unit disk will be used as the natural space to compute the “p-mean” (\(p=2\) for the “mean”, \(p=1\) for the “median”) of covariance matrices via the Karcher flow derived from an extension of the Weiszfeld algorithm on Cartan-Hadamard manifolds. This new mathematical framework will allow the development of an Ordered Statistic (OS) concept for Hermitian positive definite space/time Toeplitz covariance matrices or for space-time Toeplitz-Block-Toeplitz matrices. We will define an Ordered Statistic High Doppler Resolution CFAR (OS-HDR-CFAR) and an Ordered Statistic Space-Time Adaptive Processing (OS-STAP).
Frédéric Barbaresco
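Two small ingredients named in the abstract, sketched under their standard definitions: the Kähler potential \(-\log\det(R)\) of a covariance matrix and the hyperbolic distance between points of the Poincaré unit disk (e.g. between reflection coefficients). This is illustrative only and omits the Iwasawa/autoregressive parameterization and the Karcher flow.
```python
# Hedged sketch of two ingredients named in the abstract: the entropy-like
# Kähler potential -log det(R) and the Poincaré unit-disk distance (for the
# metric ds^2 = 4|dz|^2 / (1 - |z|^2)^2).
import numpy as np

def kahler_potential(R):
    """-log det(R) for a Hermitian positive definite covariance matrix R."""
    sign, logdet = np.linalg.slogdet(R)
    return -logdet

def poincare_distance(z1, z2):
    """Hyperbolic distance between two points of the open unit disk."""
    return 2.0 * np.arctanh(abs((z1 - z2) / (1 - np.conj(z2) * z1)))

print(kahler_potential(np.array([[2.0, 0.5], [0.5, 1.0]])))
print(poincare_distance(0.1 + 0.2j, -0.3 + 0.1j))
```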
Chapter 10. On the Use of Matrix Information Geometry for Polarimetric SAR Image Classification
Abstract
Polarimetric SAR images have a large number of applications. To extract a physical interpretation of such images, a classification based on their polarimetric properties can be a real advantage. However, most classification techniques are developed under a Gaussian assumption on the signal and compute cluster centers using the standard arithmetical mean. This paper will present classification results on simulated and real images using a non-Gaussian signal model, better adapted to high-resolution images, and a geometrical definition of the mean for the computation of the class centers. We will show notable improvements in the classification results with the geometrical mean over the arithmetical mean and present a physical interpretation of these improvements, using the Cloude-Pottier decomposition.
Pierre Formont, Jean-Philippe Ovarlez, Frédéric Pascal
Chapter 11. Doppler Information Geometry for Wake Turbulence Monitoring
Abstract
Here the concept of Doppler information geometry is summarized and introduced to evaluate the richness of the Doppler velocity components of a radar signal, and applied to wake turbulence monitoring. With the methods of information geometry, we consider all the Toeplitz Hermitian positive definite covariance matrices of order n as a manifold \(\Omega_{n}\). Geometries of covariance matrices based on two kinds of radar data models are presented. Finally, a radar detector based on Doppler entropy assessment is analyzed and applied to wake turbulence monitoring. This advanced Doppler processing chain is also implemented with CUDA code for GPU parallel computation.
Zhongxun Liu, Frédéric Barbaresco

Matrix-based signal processing applications:

Chapter 12. Review of the Application of Matrix Information Theory in Video Surveillance
Abstract
Video surveillance is an active research topic in computer vision. Its applications include authentication in security-sensitive areas, biometric-based identification of specific persons, overcrowding statistics and congestion control, strange-situation detection and alarming, interactive surveillance using multiple cameras and so on. Video surveillance mainly involves modeling of the background, detection of motion, classification of moving objects and object tracking. Background modeling is often used to detect moving objects in video acquired by a fixed camera. A subspace learning method, namely Principal Component Analysis (PCA), has been used to model the background, representing online data content while reducing dimensionality significantly. Detection and classification of the object of interest in the image captured by the camera is a vital step for automatic activity monitoring. Linear Discriminant Analysis (LDA) is a well-known classical statistical technique for dimension reduction and feature extraction for classification. This chapter gives an overview of the applications of Matrix Information Theory in video surveillance, viz. background modeling by PCA, face recognition/object classification by LDA and finally object tracking using a covariance-based object description. The algorithms discussed in this chapter are related to the broad area of Matrix Information Theory.
M K Bhuyan, Malathi T
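A minimal sketch of PCA ("eigenbackground") modeling for a fixed camera, with synthetic data standing in for real video; the frame size, rank and threshold are arbitrary assumptions, not the chapter's pipeline.
```python
# Hedged sketch: PCA background modeling.  Flattened frames are fitted with a
# low-dimensional subspace by SVD; foreground pixels are large reconstruction
# residuals.  Toy random data stands in for real video.
import numpy as np

def fit_background(frames, k=5):
    """frames: (n_frames, height*width) array of flattened grayscale frames."""
    mean = frames.mean(axis=0)
    U, S, Vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, Vt[:k]                        # top-k principal directions

def foreground_mask(frame, mean, basis, thresh=30.0):
    coeffs = basis @ (frame - mean)            # project onto the subspace
    recon = mean + basis.T @ coeffs            # reconstructed background
    return np.abs(frame - recon) > thresh      # large residual = moving object

rng = np.random.default_rng(0)
frames = rng.normal(128, 5, size=(100, 64 * 64))   # fake static scene
mean, basis = fit_background(frames)
mask = foreground_mask(frames[0], mean, basis)
print(mask.sum(), "pixels flagged as foreground")
```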
Chapter 13. Comparative Evaluation of Symmetric SVD Algorithms for Real-Time Face and Eye Tracking
Abstract
Computation of the singular value decomposition (SVD) has been a topic of concern for many numerical linear algebra researchers. Fast SVD has been a very effective tool in computer vision in a number of areas, such as face recognition and eye tracking. At the present state of the art, a fast, fixed-point, power-efficient SVD algorithm needs to be developed for real-time embedded computing. The work in this paper is the genesis of an attempt to build an on-board real-time face and eye tracking system for human drivers to detect loss of attention due to drowsiness or fatigue. A major function of this on-board system is quick customization, which is carried out when a new driver comes in. The face and eye images are recorded while instructing the driver to make specific poses. The eigenfaces and eigen-eyes are generated at several resolution levels and stored in the on-board computer. The discriminating eigenspaces of the face and eyes are determined and stored in the on-board flash memory for detection and tracking of the face and eyes, as well as classification of the eyes (open or closed). Therefore, a fast SVD of the image covariance matrix at various levels of resolution needs to be carried out to generate the eigen-database. As a preliminary step, we review the existing symmetric SVD algorithms and evaluate their feasibility for such an application. In this article, we compare the performance of (1) Jacobi's, (2) Hestenes', (3) Golub-Kahan, (4) Tridiagonalization and Symmetric QR Iteration, and (5) Tridiagonalization and Divide-and-Conquer algorithms. A case study is demonstrated as an example.
Tapan Pradhan, Aurobinda Routray, Bibek Kabi
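A textbook sketch of the first method in the comparison, the cyclic Jacobi iteration for a symmetric matrix; for a symmetric positive semidefinite covariance matrix the resulting eigendecomposition coincides with its SVD up to signs. This is not the authors' fixed-point embedded implementation, and the names are illustrative.
```python
# Hedged sketch: classical cyclic Jacobi iteration for a symmetric matrix,
# returning eigenvalues and eigenvectors (A = V diag(w) V^T).
import numpy as np

def jacobi_symmetric(A, sweeps=20, tol=1e-12):
    A = A.copy().astype(float)
    n = A.shape[0]
    V = np.eye(n)
    for _ in range(sweeps):
        off = np.sqrt(np.sum(A**2) - np.sum(np.diag(A)**2))
        if off < tol:                      # off-diagonal mass annihilated
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p, q]) < tol:
                    continue
                # rotation angle that zeroes A[p, q]
                theta = 0.5 * np.arctan2(2 * A[p, q], A[q, q] - A[p, p])
                c, s = np.cos(theta), np.sin(theta)
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q] = s
                J[q, p] = -s
                A = J.T @ A @ J
                V = V @ J
    return np.diag(A), V

M = np.array([[4.0, 1.0, 2.0], [1.0, 3.0, 0.5], [2.0, 0.5, 5.0]])
w, V = jacobi_symmetric(M)
print(np.allclose(V @ np.diag(w) @ V.T, M))
```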
Chapter 14. Real-Time Detection of Overlapping Sound Events with Non-Negative Matrix Factorization
Abstract
In this paper, we investigate the problem of real-time detection of overlapping sound events by employing non-negative matrix factorization techniques. We consider a setup where audio streams arrive in real-time to the system and are decomposed onto a dictionary of event templates learned off-line prior to the decomposition. An important drawback of existing approaches in this context is the lack of control over the decomposition. We propose and compare two provably convergent algorithms that address this issue, by controlling respectively the sparsity of the decomposition and the trade-off of the decomposition between the different frequency components. Sparsity regularization is considered in the framework of convex quadratic programming, while the frequency compromise is introduced by employing the beta-divergence as a cost function. The two algorithms are evaluated on the multi-source detection tasks of polyphonic music transcription, drum transcription and environmental sound recognition. The obtained results show how the proposed approaches can improve detection in such applications, while maintaining low computational costs that are suitable for real-time processing.
Arnaud Dessein, Arshia Cont, Guillaume Lemaitre
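A minimal sketch of the basic setting: an incoming nonnegative spectrum is decomposed onto a fixed dictionary of event templates with multiplicative beta-divergence updates. The chapter's two provably convergent algorithms (sparse convex QP and beta-divergence trade-off control) are more involved; dictionary sizes and names here are illustrative assumptions.
```python
# Hedged sketch: decompose an incoming nonnegative spectrum v onto a fixed
# dictionary W with standard multiplicative beta-divergence updates
# (beta=1 -> Kullback-Leibler, beta=2 -> Euclidean).
import numpy as np

def decompose_frame(v, W, beta=1.0, iters=200, eps=1e-12):
    """Return activations h >= 0 approximately minimizing D_beta(v, W h)."""
    F, K = W.shape
    h = np.full(K, 1.0 / K)
    for _ in range(iters):
        wh = W @ h + eps
        h *= (W.T @ (wh ** (beta - 2) * v)) / (W.T @ (wh ** (beta - 1)) + eps)
    return h

rng = np.random.default_rng(0)
W = rng.random((513, 8))                               # fake off-line templates
v = W @ np.array([0, 0.7, 0, 0, 0.3, 0, 0, 0])         # frame mixing templates 1 and 4
print(np.round(decompose_frame(v, W), 3))
```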
Chapter 15. Mining Matrix Data with Bregman Matrix Divergences for Portfolio Selection
Abstract
In the early fifties, Markowitz contributed a theory that considerably simplified portfolio choices. He narrowed down the traditional expected utility model and assumed that investors only care about mean and variance. The mean-variance portfolio theory was born. As its name suggests, mean-variance theory is predicated on simple assumptions that are unfortunately seldom met in real life. Indeed, it is now a well-established fact that, for a host of reasons, financial returns do not obey Gaussian distributions. This paper first draws on ideas from econometrics, finance and statistics to derive a rigorous generalization of Markowitz' mean-variance model to a mean-divergence model, lifted to matrix entries and grounded on exponential families of distributions, which we argue is both more realistic and better suited to further developments in learning. The generalized model turns out to rely heavily on Bregman divergences. There has recently been a burst of attention in on-line learning to learning portfolios having limited risk in Markowitz' setting. In an on-line framework, we then tackle the problem of finding adaptive portfolio strategies based on our generalized model. We devise a learning algorithm based on new matrix generalizations of p-norms to track non-stationary target portfolios with limited risk. Theoretical bounds and preliminary experiments over nearly twelve years of S&P 500 data confirm the validity of the generalized model, the capacity it brings to spot important events that would otherwise be dampened in the mean-variance model, and the potential of the algorithm. Finally, we make an in-depth analysis of the matrix divergences and risk premia derived in our model, which sheds some theoretical light on the ways the risk premium may be blown up as the investor's portfolio shifts away from a so-called natural market allocation, which defines the best (unknown) allocation at the market's scale.
Richard Nock, Brice Magdalou, Eric Briys, Frank Nielsen
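A small sketch of a matrix Bregman divergence of the kind the mean-divergence model relies on; the von Neumann divergence shown here is one standard instance, not necessarily the divergence derived in the chapter, and the helper names are illustrative.
```python
# Hedged sketch: the von Neumann matrix Bregman divergence, generated by
# F(X) = tr(X log X - X) on symmetric positive definite matrices.
import numpy as np
from scipy.linalg import logm

def von_neumann_divergence(X, Y):
    """D(X || Y) = tr(X log X - X log Y - X + Y)."""
    return np.trace(X @ logm(X) - X @ logm(Y) - X + Y).real

rng = np.random.default_rng(0)
def random_spd(d):
    M = rng.standard_normal((d, d))
    return M @ M.T + d * np.eye(d)

X, Y = random_spd(3), random_spd(3)
print(von_neumann_divergence(X, Y), von_neumann_divergence(X, X))  # second is ~0
```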
Chapter 16. Learning Mixtures by Simplifying Kernel Density Estimators
Abstract
Gaussian mixture models are a widespread tool for modeling various and complex probability density functions. They can be estimated by various means, often using Expectation–Maximization or Kernel Density Estimation. In addition to these well-known algorithms, new and promising stochastic modeling methods include Dirichlet Process mixtures and k-Maximum Likelihood Estimators. Most of the methods, including Expectation–Maximization, lead to compact models but may be expensive to compute. On the other hand, Kernel Density Estimation yields large models which are computationally cheap to build. In this chapter we present new methods to get high-quality models that are both compact and fast to compute. This is accomplished by the simplification of a Kernel Density Estimator. The simplification is a clustering method based on k-means-like algorithms. Like all k-means algorithms, our method relies on divergences and centroid computations, and we use two different divergences (and their associated centroids), Bregman and . Along with the description of the algorithms, we describe the pyMEF library, a Python library designed for the manipulation of mixtures of exponential families. Unlike most other existing tools, this library allows the use of any exponential family instead of being limited to a particular distribution. The generic library allows one to rapidly explore the different available exponential families in order to choose the one best suited for a particular application. We evaluate the proposed algorithms by building mixture models on examples from a bio-informatics application. The quality of the resulting models is measured in terms of log-likelihood and of Kullback–Leibler divergence.
Olivier Schwander, Frank Nielsen
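A toy sketch of the core idea: the kernels of a 1D Gaussian KDE are clustered with a k-means-like loop using the closed-form KL divergence between Gaussians and moment-matching centroids. This stands in for the Bregman clustering and the pyMEF machinery of the chapter; all function names, parameters and data are illustrative assumptions.
```python
# Hedged sketch: simplify a 1D Gaussian kernel density estimator by a
# k-means-like loop with KL divergence assignments and moment-matching centroids.
import numpy as np

def kl_gauss(m0, v0, m1, v1):
    """KL( N(m0, v0) || N(m1, v1) ) for 1D Gaussians (v = variance)."""
    return 0.5 * (v0 / v1 + (m1 - m0) ** 2 / v1 - 1.0 + np.log(v1 / v0))

def simplify_kde(samples, bandwidth, k=3, iters=20, seed=0):
    mus = np.asarray(samples, dtype=float)       # one kernel per sample
    vs = np.full_like(mus, bandwidth ** 2)
    rng = np.random.default_rng(seed)
    centers = [(m, bandwidth ** 2) for m in rng.choice(mus, k, replace=False)]
    for _ in range(iters):
        labels = np.array([np.argmin([kl_gauss(m, v, cm, cv) for cm, cv in centers])
                           for m, v in zip(mus, vs)])
        for j in range(k):                       # moment-matching centroid update
            sel = labels == j
            if sel.any():
                mj = mus[sel].mean()
                vj = (vs[sel] + (mus[sel] - mj) ** 2).mean()
                centers[j] = (mj, vj)
    return centers

data = np.concatenate([np.random.default_rng(1).normal(0, 1, 200),
                       np.random.default_rng(2).normal(5, 0.5, 200)])
print(simplify_kde(data, bandwidth=0.3, k=2))
```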
Chapter 17. Particle Filtering on Riemannian Manifolds. Application to Covariance Matrices Tracking
Abstract
Recently, an extension of the particle filter has been proposed in [13]. This extension deals with tracking hidden states moving on a Riemannian manifold. In fact, in addition to the nonlinear dynamics, the system state is constrained to lie on a Riemannian manifold \({\mathcal M}\), whose dimension is much lower than that of the whole embedding space. The Riemannian manifold formulation of the state-space model avoids the curse of dimensionality from which most particle filter methods suffer. Furthermore, this formulation is the only natural tool when the embedding Euclidean space cannot be defined (the state space is defined in an abstract geometric way) or when the constraints are not easily handled (the space of positive definite matrices).
Hichem Snoussi
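A schematic sketch of a particle filter whose particles live on the SPD manifold: propagation through the Riemannian exponential map with tangent-space noise, weighting by a distance-based pseudo-likelihood, and multinomial resampling. This is a generic illustration, not the model of [13]; all names, noise levels and the likelihood are assumptions.
```python
# Hedged sketch: one step of a particle filter on the manifold of SPD matrices.
import numpy as np
from scipy.linalg import sqrtm, logm, expm, inv

def exp_map(X, V):
    Xh = np.real(sqrtm(X)); Xih = inv(Xh)
    return np.real(Xh @ expm(Xih @ V @ Xih) @ Xh)

def geo_dist(X, Y):
    Xih = inv(np.real(sqrtm(X)))
    return np.linalg.norm(logm(Xih @ Y @ Xih))

def spd_pf_step(particles, weights, obs, sigma_q=0.05, sigma_r=0.3, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    n, d = len(particles), particles[0].shape[0]
    new_particles, new_weights = [], []
    for X, w in zip(particles, weights):
        W = rng.standard_normal((d, d)) * sigma_q
        Xp = exp_map(X, (W + W.T) / 2)              # propagate with tangent noise
        lik = np.exp(-geo_dist(Xp, obs) ** 2 / (2 * sigma_r ** 2))
        new_particles.append(Xp)
        new_weights.append(w * lik)
    new_weights = np.array(new_weights)
    new_weights /= new_weights.sum()
    idx = rng.choice(n, n, p=new_weights)           # multinomial resampling
    return [new_particles[i] for i in idx], np.full(n, 1.0 / n)

rng = np.random.default_rng(0)
parts = [(1.0 + 0.1 * i) * np.eye(2) for i in range(5)]
parts, w = spd_pf_step(parts, np.full(5, 0.2), 1.3 * np.eye(2), rng=rng)
```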
Backmatter
Metadata
Title
Matrix Information Geometry
Edited by
Frank Nielsen
Rajendra Bhatia
Copyright Year
2013
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-30232-9
Print ISBN
978-3-642-30231-2
DOI
https://doi.org/10.1007/978-3-642-30232-9
