01-08-2019 | Research Article

# Singular value decomposition of noisy data: mode corruption

Authors: Brenden P. Epps, Eric M. Krivitzky

Published in: Experiments in Fluids | Issue 8/2019

### Abstract

Although the singular value decomposition (SVD) and proper orthogonal decomposition have been widely used in fluid mechanics, Venturi (J Fluid Mech 559:215–254, 2006) and Epps and Techet (Exp Fluids 48:355–367, 2010) were among the first to consider how noise in the data affects the results of these decompositions. Herein, we extend those studies using perturbation theory to derive formulae for the 95% confidence intervals of the singular values and vectors, as well as formulae for the root mean square error (rmse) of each noisy SVD mode. Moreover, we show that the rmse is well approximated by $$\epsilon /\tilde{s}_k$$ (where $$\epsilon$$ is the rms noise and $$\tilde{s}_k$$ is the singular value), which provides a useful estimate of the overall uncertainty in each mode.
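The approximation $$\epsilon /\tilde{s}_k$$ can be checked numerically on synthetic data. The following is a minimal sketch, not the authors' procedure: it assumes an exactly low-rank clean matrix with well-separated singular values, i.i.d. Gaussian noise, a data matrix with T rows (time steps) and D columns (data sites), and it measures the rmse of the length-D singular vectors only. All sizes, values, and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: T time steps, D data sites, r clean modes.
T, D, r = 40, 400, 3
eps = 0.01                               # rms noise level (assumed i.i.d. Gaussian)
s_clean = np.array([50.0, 30.0, 15.0])   # assumed clean singular values

# Build an exactly rank-r clean matrix A = U S V^T with orthonormal factors.
U0, _ = np.linalg.qr(rng.standard_normal((T, r)))
V0, _ = np.linalg.qr(rng.standard_normal((D, r)))
A = U0 @ np.diag(s_clean) @ V0.T

# Add noise and decompose the noisy matrix.
A_noisy = A + eps * rng.standard_normal((T, D))
_, s_noisy, Vt_noisy = np.linalg.svd(A_noisy, full_matrices=False)

rmse = np.empty(r)
for k in range(r):
    v_clean = V0[:, k]
    v_noisy = Vt_noisy[k]
    # Singular vectors are defined only up to sign, so align before comparing.
    if v_noisy @ v_clean < 0:
        v_noisy = -v_noisy
    rmse[k] = np.sqrt(np.mean((v_noisy - v_clean) ** 2))

# Estimate from the abstract: rms noise divided by the noisy singular value.
predicted = eps / s_noisy[:r]
ratios = rmse / predicted

for k in range(r):
    print(f"mode {k + 1}: rmse = {rmse[k]:.3e}, eps/s = {predicted[k]:.3e}")
```

In this idealized setting the measured and estimated values should agree closely; real data with closely spaced singular values or correlated noise may behave differently.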

### Footnotes
1
The SVD is related to the biorthogonal decomposition (Aubry 1991) and the method of empirical orthogonal functions (Lorenz 1956). The POD (Berkooz et al. 1993; Holmes et al. 1996, 1997) is related to the Karhunen–Loève transform (Karhunen 1946; Loève 1978), principal components analysis (Pearson 1901), the method of empirical eigenfunctions, and the method of snapshots (Sirovich 1987).

2
The number of data sites D is the number of individual pieces of data at each time step. For example, consider sampling two-dimensional velocity data on an $$I \times J$$ grid of field points; then $$D = 2IJ$$ is the total number of data sites.
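As an illustration of this count, the sketch below stacks two velocity components sampled on an $$I \times J$$ grid into a single data matrix. The array names, the grid sizes, and the choice of one row per time step are assumptions made for the example, not details taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical grid and record length.
I, J, T = 4, 5, 3

# Placeholder velocity components, one I x J field per time step.
u = rng.standard_normal((T, I, J))
v = rng.standard_normal((T, I, J))

# Flatten each field and place the two components side by side,
# so each row holds every data site for one time step.
A = np.concatenate([u.reshape(T, -1), v.reshape(T, -1)], axis=1)

D = A.shape[1]   # number of data sites per time step
print(D, 2 * I * J)
```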

3
Ideally, $$\mathbf{E}$$ contains i.i.d. noise drawn from a Gaussian distribution, but herein we also consider $$\mathbf{E}$$ containing spatially correlated noise, as occurs in PIV data.

4
In terms of the POD eigenvalues $${\tilde{\lambda }}_k = \tilde{s}_k^2$$, the threshold criterion (3) requires $${\tilde{\lambda }}_k > \epsilon ^2 TD$$.
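Since $${\tilde{\lambda }}_k = \tilde{s}_k^2$$, the eigenvalue form above is equivalent to $$\tilde{s}_k > \epsilon \sqrt{TD}$$. A minimal sketch of applying that test to a list of singular values follows; the function name and the sample numbers are hypothetical.

```python
import math


def modes_above_threshold(singular_values, eps, T, D):
    """Flag the modes whose singular value exceeds eps * sqrt(T * D)."""
    threshold = eps * math.sqrt(T * D)
    return [s > threshold for s in singular_values]


# Hypothetical values: the threshold is 0.1 * sqrt(1000), roughly 3.16.
s_noisy = [10.0, 5.0, 1.0]
keep = modes_above_threshold(s_noisy, eps=0.1, T=10, D=100)

# The same test written in terms of the eigenvalues lambda_k = s_k**2.
keep_eig = [s * s > 0.1 ** 2 * 10 * 100 for s in s_noisy]
print(keep, keep_eig)
```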

5
Note that the reconstructed singular values $$\bar{s}_k$$ could be used in place of the noisy ones $$\tilde{s}_k$$, but we find this makes little difference in the predicted rmse.

6
Proof: since $$\mathbf{U}$$ is orthogonal ($$\mathbf{U}\mathbf{U}^\intercal = \mathbf{I}$$), we can write (41) as $$\mathbf{H}= \mathbf{U}{\varvec{{\Lambda }}} \mathbf{U}^\intercal$$. At the same time, $$\mathbf{H}= \mathbf{A}\mathbf{A}^\intercal = \mathbf{U}\mathbf{S}\mathbf{V}^\intercal \mathbf{V}\mathbf{S} \mathbf{U}^\intercal = \mathbf{U}\mathbf{S}^2 \mathbf{U}^\intercal$$, where the last step uses $$\mathbf{V}^\intercal \mathbf{V} = \mathbf{I}$$. Comparing the two expressions gives $${\varvec{{\Lambda }}} = \mathbf{S}^2$$.
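The identity in this proof is easy to confirm numerically. The sketch below uses an arbitrary random matrix (the size and seed are assumptions) and compares the eigenvalues of $$\mathbf{A}\mathbf{A}^\intercal$$ with the squared singular values of $$\mathbf{A}$$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary test matrix with fewer rows than columns.
A = rng.standard_normal((4, 6))

# Singular values of A, returned in descending order.
s = np.linalg.svd(A, compute_uv=False)

# Eigenvalues of the symmetric matrix H = A A^T; eigvalsh returns them
# in ascending order, so reverse to match the ordering of s.
H = A @ A.T
lam = np.linalg.eigvalsh(H)[::-1]

max_diff = np.max(np.abs(lam - s ** 2))
print(max_diff)
```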

7
Kato uses the notation: $$\chi$$, $$\mathbf{T}$$, and $$\mathbf{T}(\chi )$$ for $$\epsilon$$, $$\mathbf{H}$$, and $${\tilde{\mathbf{H}}}$$, respectively. Kato and Venturi use $$\mathbf{S}$$ for $$\mathbf{Q}$$.

8
If $$\mathbf{H}$$ has repeated eigenvalues, then Eq. (84) represents the weighted mean of those eigenvalues, and the present theory needs to be modified (via Kato’s reduction theory). However, these modifications complicate the analysis and prevent the results from being reduced to forms as simple as, for example, Eq. (87).

9
Although matrix $$\mathbf{W}^{(1)}$$ refers to mode k, we have omitted the subscript k to facilitate referring to its $$im\text {th}$$ element as $$W^{(1)}_{im}$$. The $$i\text {th}$$ element of vector $${\tilde{\mathbf{u}}}_k$$ is $$\tilde{U}_{ik}$$.

10
Note that all odd “powers” of $$\mathbf{E}$$ average to zero, so $$\langle W^{(1)}_{im}U_{mk} \rangle = \langle W^{(3)}_{im}U_{mk} \rangle = \big \langle ( W^{(1)}_{im}U_{mk} ) \, ( W^{(2)}_{in}U_{nk} ) \big \rangle = \dots = 0$$.

11
Note that again all odd “powers” of $$\mathbf{E}$$ average to zero, so $$\langle N^{(1)}_{im}V_{mk} \rangle = \langle N^{(3)}_{im}V_{mk} \rangle = \dots = 0$$.

12
We prefer to interpolate using a piecewise cubic Hermite interpolating polynomial, pchip, because it provides continuity of the function and its first derivative without being susceptible to the overshoots that a cubic spline can produce. In Matlab, $$g' = \texttt {pchip}(x,g, x')$$ returns g(x) evaluated at $$x'$$.
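For readers working outside Matlab, a comparable interpolant is available in SciPy. The sketch below is an illustration only, with made-up step-like data: it uses `scipy.interpolate.PchipInterpolator` in place of Matlab's `pchip` and contrasts it with `scipy.interpolate.CubicSpline` to show the overshoot behavior described above.

```python
import numpy as np
from scipy.interpolate import CubicSpline, PchipInterpolator

# Hypothetical step-like samples, where spline overshoot is easy to see.
x = np.arange(6.0)
g = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Points at which to evaluate the interpolants.
x_new = np.linspace(0.0, 5.0, 101)

# Shape-preserving piecewise cubic Hermite interpolation.
g_pchip = PchipInterpolator(x, g)(x_new)

# Ordinary cubic spline (SciPy's default not-a-knot end conditions).
g_spline = CubicSpline(x, g)(x_new)

print("pchip range: ", g_pchip.min(), g_pchip.max())
print("spline range:", g_spline.min(), g_spline.max())
```

The pchip values stay within the range of the data, whereas the spline leaves it on both sides of the step.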

