We consider a typical single-cell massive MIMO communication system, where a BS equipped with a uniform linear array (ULA) of
M antennas communicates with multiple users through a multipath channel.
Following the common assumption that the pilot sequences of different users are orthogonal to each other in the time–frequency domain, and since channel covariance estimation relies only on the UL pilots, we focus on a generic user without loss of generality. Figure
1 visualizes the propagation model based on multipath clusters, which is physically motivated and widely adopted in standard channel simulation tools such as QuaDriGa [
18]. During UL transmission, on each time–frequency resource block (RB)
s, the BS receives an UL user pilot carrying a measurement for the channel vector
\({\textbf {h}}[s]\). We assume that the window of
N samples collected for covariance estimation is designed such that the samples are sufficiently spaced in the time–frequency domain to yield statistically independent channel snapshots
\(\{{\textbf {h}}[s]: s \in [N]\}\). Meanwhile, the whole window spans a time significantly shorter than the geometry coherence time, so that the WSS assumption holds (see discussion in Sect.
1) and the channel snapshots are identically distributed. The channel vectors are given by [
38]
$$\begin{aligned} {\textbf {h}}[s] = \int _{-1}^1 \rho (\xi ;s) {\textbf {a}}(\xi ) d \xi ,~s\in [N], \end{aligned}$$
(1)
where
\(\rho (\xi ;s)\) is the channel complex coefficient at the normalized AoA
\(\xi = \frac{\sin (\theta )}{\sin (\theta _{\text {max}})} \in [-1,1)\), where
\(\theta _{\text {max}} \in [0,\frac{\pi }{2}]\) is the maximum array angular aperture;
\({\textbf {a}}(\xi ) \in {\mathbb C}^M\) denotes the array response vector as a function of
\(\xi\), with
m-th element given by
\([{\textbf {a}}(\xi )]_m=e^{j\frac{2\pi d}{\lambda _0}m \xi \sin (\theta _{\text {max}})}\), where
d denotes the antenna spacing and
\(\lambda _0\) denotes the carrier wavelength. For convenience, we assume the antenna spacing to be
\(d = \frac{\lambda _0}{2\sin (\theta _{\text {max}})}\). Thus, the array response vector is given as
$$\begin{aligned} {\textbf {a}}(\xi ) = \left[ 1, e^{j\pi \xi },\dots , e^{j\pi (M-1)\xi }\right] ^{\textsf {T}}. \end{aligned}$$
(2)
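As a quick sanity check of the simplification leading to (2), the following minimal sketch evaluates the array response both in its general form and in the simplified form, and compares the two. The function names and all numerical values (array size, aperture, wavelength, angle) are illustrative assumptions, and antennas are indexed from 0 to M-1 so as to match (2).

```python
import cmath
import math


def array_response(xi, M):
    """Simplified array response of Eq. (2): entries exp(j*pi*m*xi), m = 0..M-1."""
    return [cmath.exp(1j * math.pi * m * xi) for m in range(M)]


def array_response_general(xi, M, d, wavelength, theta_max):
    """General form: entries exp(j * 2*pi*d/lambda0 * m * xi * sin(theta_max))."""
    k = 2 * math.pi * d / wavelength * math.sin(theta_max)
    return [cmath.exp(1j * k * m * xi) for m in range(M)]


# Illustrative values (assumptions, not taken from the text).
M = 8
theta_max = math.radians(60)
wavelength = 0.1
d = wavelength / (2 * math.sin(theta_max))  # spacing assumed in the text

theta = math.radians(20)                     # physical AoA
xi = math.sin(theta) / math.sin(theta_max)   # normalized AoA

a_simple = array_response(xi, M)
a_general = array_response_general(xi, M, d, wavelength, theta_max)

# Largest entry-wise gap between the two forms (floating-point noise only).
max_gap = max(abs(u - v) for u, v in zip(a_simple, a_general))
print(max_gap)
```

With the spacing chosen as above, the factor multiplying m and the normalized AoA reduces to π, which is why the two forms coincide.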
The channel coefficient
\(\rho (\xi ;s)\) represents the small-scale multipath fading component at a given AoA, and it is modeled as a complex circularly symmetric Gaussian process with respect to
\(\xi\). Due to the WSS property, the channel second-order statistics are invariant with respect to the index
\(s \in [N]\). In particular,
\(\rho (\xi ;s)\) has mean zero and variance
\({\mathbb {E}}\left[ \rho \left( \xi ;s\right) \rho ^*\left( \xi ;s\right) \right] = \gamma \left( \xi \right)\). The function
\(\gamma : [-1,1]\rightarrow \mathbb {R}_+\) is a real nonnegative measure that describes how the channel energy is distributed across the angle domain, and it is referred to as the channel ASF. From (
1) and the ASF definition, it follows that the channel spatial covariance matrix, describing the correlation of the channel coefficients at the different antenna elements, is given by
$$\begin{aligned} {\varvec{\Sigma }}_{{\textbf {h}}}={\mathbb {E}}\left[ {\textbf {h}}[s]{\textbf {h}}[s]^{\textsf {H}}\right] =\int _{-1}^1 \gamma (\xi ) {\textbf {a}}(\xi ) {\textbf {a}}(\xi )^{\textsf {H}} d \xi . \end{aligned}$$
(3)
Notice that
\({\varvec{\Sigma }}_{{\textbf {h}}}\) is Toeplitz. This holds when all the scattering clusters (see Fig.
1) are in the far field of the BS array.
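To make the Toeplitz structure explicit, one can write out a generic entry of (3) using the array response (2), with antennas indexed from 0 to M-1 (an indexing convention assumed here for consistency with (2)):
$$\begin{aligned} \left[ {\varvec{\Sigma }}_{{\textbf {h}}}\right] _{m,n} = \int _{-1}^1 \gamma (\xi ) e^{j\pi m \xi } e^{-j\pi n \xi } d \xi = \int _{-1}^1 \gamma (\xi ) e^{j\pi (m-n) \xi } d \xi ,~m,n \in \{0,\dots ,M-1\}. \end{aligned}$$
The entry depends on the pair (m, n) only through the difference m - n, so the matrix is constant along each diagonal. Since the ASF is real, the matrix is also Hermitian.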
At RB
s, the received pilot signal at the BS is given as
$$\begin{aligned} {\textbf {y}}[s] = {\textbf {h}}[s] x[s] + {\textbf {z}}[s],~s\in [N], \end{aligned}$$
(4)
where
x[
s] is the pilot symbol and
\({\textbf {z}}[s] \sim {{{\mathcal {C}}}{{\mathcal {N}}}}(\textbf{0},N_0 \textbf{I}_M)\) is the additive white Gaussian noise (AWGN). Without loss of generality, we assume that the pilot symbols are normalized as
\(x[s]=1, \; \forall \, s \in [N]\). The goal of this work is to estimate the channel covariance matrix
\({\varvec{\Sigma }}_{{\textbf {h}}}\) with the given set of
N noisy channel observations
\(\{ {\textbf {y}}[s]: s\in [N]\}\).
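The observation model (4) can be sketched numerically as follows. For simplicity, this sketch assumes that the integral in (1) reduces to a finite sum of discrete paths with independent circularly symmetric Gaussian coefficients; this is one special case of the model, not the general one. The helper names, the path list, and all numerical values are illustrative assumptions.

```python
import cmath
import math
import random


def array_response(xi, M):
    """Array response of Eq. (2): entries exp(j*pi*m*xi), m = 0..M-1."""
    return [cmath.exp(1j * math.pi * m * xi) for m in range(M)]


def complex_gaussian(variance, rng):
    """One draw from CN(0, variance): real and imaginary parts each have variance/2."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def simulate_pilots(N, M, paths, N0, rng):
    """Return N snapshots y[s] = h[s] + z[s], i.e., Eq. (4) with x[s] = 1.

    `paths` is a list of (phi_i, c_i) pairs: normalized AoA and path power.
    """
    steering = [(array_response(phi, M), c) for phi, c in paths]
    snapshots = []
    for _ in range(N):
        y = [complex_gaussian(N0, rng) for _ in range(M)]  # AWGN z[s]
        for a, c in steering:
            rho = complex_gaussian(c, rng)  # independent fading per snapshot
            for m in range(M):
                y[m] += rho * a[m]
        snapshots.append(y)
    return snapshots


# Illustrative setup (assumed values).
rng = random.Random(0)
paths = [(-0.3, 1.0), (0.45, 0.5)]
Y = simulate_pilots(N=2000, M=4, paths=paths, N0=0.1, rng=rng)

# Average received power per antenna; its expected value is c_1 + c_2 + N0.
avg_power = sum(abs(v) ** 2 for y in Y for v in y) / (len(Y) * 4)
print(avg_power)
```

Drawing a fresh coefficient for every snapshot mirrors the assumption that the N samples are statistically independent and identically distributed.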
3.1 Sample covariance matrix
We start by reviewing the sample covariance estimator. For known noise power
\(N_0\) at the BS, the sample covariance matrix is given by
$$\begin{aligned} \widehat{{\varvec{\Sigma }}}_{{\textbf {h}}} = \frac{1}{N} \sum _{s=1}^{N} {\textbf {y}}[s] {\textbf {y}}[s]^{\textsf {H}} - N_0 \textbf{I}_M. \end{aligned}$$
(5)
This is a consistent estimator, in the sense that it converges to the true covariance matrix as
\(N \rightarrow \infty\) [
40, Section 1.2.2]. The mean square (Frobenius norm) error incurred by the sample covariance estimator is given as [
40]
\({\mathbb {E}}\left[ \left\| \widehat{{\varvec{\Sigma }}}_{{\textbf {h}}}-{\varvec{\Sigma }}_{{\textbf {h}}}\right\| ^2_{\textsf {F}}\right] = \frac{{\hbox {tr}}\left( {\varvec{\Sigma }}_{{\textbf {h}}}\right) ^2}{N}\). By applying the Cauchy–Schwarz inequality to the singular values of
\({\varvec{\Sigma }}_{{\textbf {h}}}\), it is seen that
\({\hbox {tr}}({\varvec{\Sigma }}_{{\textbf {h}}}) \le \Vert {\varvec{\Sigma }}_{{\textbf {h}}}\Vert _{\textsf {F}}\sqrt{\text {rank}({\varvec{\Sigma }}_{{\textbf {h}}})}\), which together with the estimation error expression yields the upper bound to the normalized mean squared error
\({\mathbb {E}}\left[ \frac{\Vert \widehat{{\varvec{\Sigma }}}_{{\textbf {h}}}-{\varvec{\Sigma }}_{{\textbf {h}}}\Vert ^2_{\textsf {F}}}{\Vert {\varvec{\Sigma }}_{{\textbf {h}}}\Vert ^2_{\textsf {F}}}\right] \le \frac{\text {rank}({\varvec{\Sigma }}_{{\textbf {h}}})}{N}\). As already discussed, a relevant and interesting regime for massive MIMO is when
N and
M are of the same order. From the above analysis, it is clear that the sample covariance estimator yields a small error if
\(\text {rank}({\varvec{\Sigma }}_{{\textbf {h}}}) \ll N\). For example, if the scattering contains only a finite number of discrete components (e.g., as assumed in [
33–
35]),
\(\text {rank}({\varvec{\Sigma }}_{{\textbf {h}}})\) is small even if
M is very large. In contrast, if
\(\gamma (\xi )\) contains a diffuse scattering component, i.e., if its cumulative distribution function
\(\Gamma (\xi ) = \int _{-1}^{\xi } \gamma (\nu ) d\nu\) is piecewise continuous with strictly monotonically increasing segments, then
\(\text {rank}({\varvec{\Sigma }}_{{\textbf {h}}})\) increases linearly with
M (see [
10]) and the error incurred by the sample covariance estimator may be large. On the other hand, the presence of discrete scattering components implies that
\(\gamma (\xi )\) contains Dirac delta functions (spikes) and is therefore not square-integrable. This poses significant problems for estimation methods that assume
\(\gamma (\xi )\) to be an element in a Hilbert space of functions (e.g., the method proposed in [
21]). The challenge tackled in this work is to devise an estimator that can handle both the small-sample regime
\(N/M \le 1\) and the presence of discrete and diffuse scattering.
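As a rough numerical illustration of the sample covariance estimator (5) in the regime where N and M are of the same order, the following sketch estimates the normalized squared error by Monte Carlo for a channel with two discrete paths and prints it next to the ratio rank/N. The two-path channel, the noise level, the number of trials, and all helper names are illustrative assumptions. Because the simulated observations include a small amount of noise, the empirical error can deviate slightly from the expression quoted above.

```python
import cmath
import math
import random


def array_response(xi, M):
    """Array response of Eq. (2)."""
    return [cmath.exp(1j * math.pi * m * xi) for m in range(M)]


def complex_gaussian(variance, rng):
    """One draw from CN(0, variance)."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def sample_covariance(Y, N0):
    """Eq. (5): (1/N) * sum_s y[s] y[s]^H - N0 * I."""
    N, M = len(Y), len(Y[0])
    S = [[0j] * M for _ in range(M)]
    for y in Y:
        for m in range(M):
            for n in range(M):
                S[m][n] += y[m] * y[n].conjugate()
    return [[S[m][n] / N - (N0 if m == n else 0.0) for n in range(M)]
            for m in range(M)]


def frob_sq(A):
    """Squared Frobenius norm."""
    return sum(abs(v) ** 2 for row in A for v in row)


# Illustrative setup (assumed values): N = M, two discrete paths, low noise.
M, N, N0 = 8, 8, 0.01
paths = [(-0.3, 1.0), (0.45, 0.5)]  # (normalized AoA, power)
rng = random.Random(1)
steering = [array_response(phi, M) for phi, _ in paths]

# True covariance of the discrete-path channel: sum_i c_i a(phi_i) a(phi_i)^H.
true_cov = [[sum(c * a[m] * a[n].conjugate() for a, (_, c) in zip(steering, paths))
             for n in range(M)] for m in range(M)]


def one_trial():
    """Draw N snapshots, apply Eq. (5), return the normalized squared error."""
    Y = []
    for _ in range(N):
        y = [complex_gaussian(N0, rng) for _ in range(M)]
        for a, (_, c) in zip(steering, paths):
            rho = complex_gaussian(c, rng)
            y = [ym + rho * am for ym, am in zip(y, a)]
        Y.append(y)
    est = sample_covariance(Y, N0)
    err = [[est[m][n] - true_cov[m][n] for n in range(M)] for m in range(M)]
    return frob_sq(err) / frob_sq(true_cov)


trials = 200
nmse = sum(one_trial() for _ in range(trials)) / trials
print(nmse, len(paths) / N)  # empirical NMSE next to rank / N
```

Increasing the number of paths while keeping N fixed makes the normalized error grow, in line with the rank-dependent bound discussed above.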
3.2 Structure of the channel covariance matrix
As noted above, the ASF describes how the received signal power is distributed over the AoA domain. The signal from the UE to the BS array propagates through a given scattering environment. The line-of-sight (LoS) path (if present), specular reflections, and wedge diffraction occupy extremely narrow angular intervals. Many works model these contributions as a superposition of discrete, separable angular components arriving at normalized AoAs
\(\{\phi _i\}\). In particular, it is assumed that the general form (
1) reduces to the discrete sum of
r paths
\({\textbf {h}}[s] = \sum _{i=1}^r \rho _i[s] {\textbf {a}}(\phi _i)\), with corresponding ASF
\(\gamma (\xi ) = \sum _{i=1}^r c_i \delta (\xi - \phi _i)\) and covariance matrix
\({\varvec{\Sigma }}_{{\textbf {h}}} = \sum _{i=1}^r c_i {\textbf {a}}(\phi _i) {\textbf {a}}(\phi _i)^{\textsf {H}}\), where
\(c_i = {\mathbb {E}}[|\rho _i[s]|^2]\). However, it is well known from channel sounding observations (e.g., see [
41]) and widely treated theoretically (e.g., see [
38]) that
diffuse scattering is typically also present and may carry a significant part of the received signal power, especially at frequencies below 6 GHz. In this case, scattering clusters span continuous intervals over the AoA domain. To retain full generality, we model the ASF
\(\gamma (\xi )\) as a
mixed-type distribution [
42, Section 5.3] including discrete and diffuse scattering components:
$$\begin{aligned} \gamma (\xi ) = \gamma _d (\xi ) + \gamma _c (\xi ) = \sum _{i=1}^{r} c_i \delta (\xi - \phi _i) \, +\, \gamma _c (\xi ), \end{aligned}$$
(6)
where
\(\gamma _d(\xi )\) models the power received from
\(r \ll M\) discrete paths and
\(\gamma _c(\xi )\) models the power coming from diffuse scattering clusters. Since the ASF can be seen as a (generalized) density function, we borrow the language of discrete and continuous random variables and refer to
\(\gamma _d(\xi )\) and to
\(\gamma _c(\xi )\) as the
discrete and the
continuous parts of the ASF, respectively.
This corresponds to a so-called spiked model in the language of asymptotic random matrix theory (e.g., see [
28]), where the spikes are the discrete scattering components.
Plugging (
6) into (
3), we obtain a corresponding decomposition of the channel covariance matrix as
$$\begin{aligned} \begin{aligned} {{\varvec{\Sigma }}}_{{\textbf {h}}}= {{\varvec{\Sigma }}}_{{\textbf {h}}}^d + {{\varvec{\Sigma }}}_{{\textbf {h}}}^c&= \sum _{i=1}^{r} c_i {\textbf {a}}(\phi _i) {\textbf {a}}(\phi _i)^\textsf {H}+ \int _{-1}^1 \gamma _c (\xi ) {\textbf {a}}(\xi ) {\textbf {a}}(\xi )^\textsf {H}d \xi . \end{aligned} \end{aligned}$$
(7)
Notice that, in a typical massive MIMO scenario,
\(\text {rank}({{\varvec{\Sigma }}}_{{\textbf {h}}}^d) = r\) is much smaller than
M.
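The decomposition (7) can be illustrated with a small sketch that builds both parts of the covariance matrix for a toy mixed-type ASF. The continuous part is assumed here to be a single rectangular (uniform) cluster on an interval, chosen only because its integral has a simple closed form; the function names and all numerical values are likewise illustrative assumptions.

```python
import cmath
import math


def discrete_part(M, paths):
    """Sigma^d of Eq. (7): entry (m, n) equals sum_i c_i * exp(j*pi*(m-n)*phi_i)."""
    return [[sum(c * cmath.exp(1j * math.pi * (m - n) * phi) for phi, c in paths)
             for n in range(M)] for m in range(M)]


def uniform_lag(k, lo, hi, level):
    """Closed form of level * integral over [lo, hi] of exp(j*pi*k*xi) d(xi)."""
    if k == 0:
        return complex(level * (hi - lo))
    num = cmath.exp(1j * math.pi * k * hi) - cmath.exp(1j * math.pi * k * lo)
    return level * num / (1j * math.pi * k)


def continuous_part(M, lo, hi, level):
    """Sigma^c of Eq. (7) for an assumed rectangular gamma_c equal to `level` on [lo, hi]."""
    return [[uniform_lag(m - n, lo, hi, level) for n in range(M)] for m in range(M)]


# Illustrative mixed-type ASF (assumed values): two spikes plus one diffuse cluster.
M = 6
paths = [(-0.5, 1.0), (0.2, 0.4)]  # (phi_i, c_i)
lo, hi, level = 0.3, 0.7, 2.0      # support and height of gamma_c

Sd = discrete_part(M, paths)
Sc = continuous_part(M, lo, hi, level)
Sigma = [[Sd[m][n] + Sc[m][n] for n in range(M)] for m in range(M)]

# Each diagonal entry equals the total power: sum of c_i plus the area under gamma_c.
total_power = sum(c for _, c in paths) + level * (hi - lo)
print(Sigma[0][0].real, total_power)
```

Both parts depend on the antenna indices only through their difference, so each of them, and hence their sum, is Toeplitz and Hermitian.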