Published in: EURASIP Journal on Wireless Communications and Networking 1/2020

Open Access 01-12-2020 | Research

Fast matrix inversion methods based on Chebyshev and Newton iterations for zero forcing precoding in massive MIMO systems

Authors: Sherief Hashima, Osamu Muta



Abstract

In massive MIMO (mMIMO) systems, large matrix inversion is a challenging problem due to the large numbers of users and antennas. Neumann series (NS) and successive over relaxation (SOR) are two typical methods that solve such a problem in linear precoding. NS expands the inverse of a matrix into a series of matrix-vector multiplications, while SOR treats the same problem as a system of linear equations and solves it iteratively. However, the required complexities of both methods are still high. In this paper, four new joint methods are presented to achieve faster convergence and lower complexity in matrix inversion when determining linear precoding weights for mMIMO systems, where both Chebyshev iteration (ChebI) and Newton iteration (NI) are investigated separately to speed up the convergence of NS and SOR. Firstly, a joint Chebyshev and NS method (ChebI-NS) is proposed not only to accelerate the convergence of NS but also to achieve more accurate inversion. Secondly, a new SOR-based approximate matrix inversion (SOR-AMI) is proposed to achieve a direct, simplified matrix inversion with convergence characteristics similar to the conventional SOR. Finally, two improved SOR-AMI methods, NI-SOR-AMI and ChebI-SOR-AMI, are investigated for further convergence acceleration, where the NI and ChebI approaches are combined with SOR-AMI, respectively. These four proposed inversion methods provide near-optimal bit error rate (BER) performance, close to the exact zero forcing (ZF) case, under uncorrelated and correlated mMIMO channel conditions. Simulation results verify that the proposed ChebI-NS has the highest convergence rate compared to the conventional NS with similar complexity. Similarly, ChebI-SOR-AMI and NI-SOR-AMI achieve faster convergence than the conventional SOR method. In order of decreasing convergence speed, the proposed methods rank as ChebI-SOR-AMI, NI-SOR-AMI, SOR-AMI, and then ChebI-NS. ChebI-NS converges most slowly because NS converges more slowly than SOR. Although ChebI-SOR-AMI has the fastest convergence rate, NI-SOR-AMI is preferable to ChebI-SOR-AMI due to its lower complexity and close inversion result.
Notes
\(^\dag\)Equal contributor

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Abbreviations
5G
5th generation
BER
Bit error rate
BS
Base station
CE
Constant envelope
CG
Conjugate gradient
ChebI-NS
Joint Chebyshev and NS method
ChebI-SOR-AMI
Joint Chebyshev and SOR-based approximate matrix inversion
ChebI
Chebyshev iteration
CSI
Channel state information
DL
Downlink
DPC
Dirty paper coding
GS
Gauss-Seidel
i.i.d.
Independent and identically distributed
MF
Matched filter
mMIMO
Massive MIMO
MMSE
Minimum mean square error
MSE
Mean square error
NI-NS
Joint Newton iteration and NS method
NI-SOR-AMI
Joint Newton iteration and SOR-based approximate matrix inversion
NI
Newton iteration
NS
Neumann series
PZF
Phased zero forcing
QAM
Quadrature amplitude modulation
RV
Random variable
RZF
Regularized zero forcing
SNR
Signal to noise ratio
SOR-AMI
SOR-based approximate matrix inversion
SOR
Successive over relaxation
SSOR
Symmetric successive over relaxation
TDD
Time division duplex
THP
Tomlinson-Harashima precoding
VP
Vector perturbation
ZF
Zero forcing

1 Introduction

Massive MIMO (mMIMO) is one of the most promising technologies for 5th generation (5G) communication systems [1]. Recent mMIMO applications include machine-type communications, drone communications, control circuits in nuclear reactors, and nuclear physics applications. Its channel hardening property mitigates the effect of noise and interference as the number of antennas increases [2]. Hence, linear precoding methods can approximately achieve optimal performance in mMIMO systems [3]. However, practical implementation of mMIMO systems faces challenging problems such as the large matrix inversion resulting from the large number of users and antennas.
Large matrix inversion is an important practical issue that affects precoder design and performance. A good precoder depends on matrix inversion approximation characteristics such as low complexity and good approximation accuracy. Generally, precoding methods are divided into linear and non-linear ones. Non-linear precoding methods such as constant envelope (CE), dirty paper coding (DPC) [4], vector perturbation (VP), lattice-aided precoding, and Tomlinson-Harashima precoding (THP) are unfriendly to hardware implementation due to their high complexity [5]. Hence, linear precoders such as matched filter (MF), zero forcing (ZF), regularized zero forcing (RZF), phased ZF (PZF), and minimum mean square error (MMSE) are favorable, although they require the inversion of a channel matrix containing all users [6]. Direct, iterative, and expansion methods are the three main categories for calculating the large matrix inverse in linear precoding. Direct methods suffer from high complexity as they depend mainly on factorizing the matrix to be inverted into a product of simpler matrices, as in QR and Cholesky decompositions [7]. Iterative methods belong to the family of linear-equation solvers such as the Richardson method [8], conjugate gradient (CG) method [9], successive over relaxation (SOR) [10], symmetric successive over relaxation (SSOR) [11], and Gauss-Seidel (GS) method [12]. They have acceptable performance in mMIMO systems. However, these approaches provide an indirect matrix inversion approximation as they calculate a product containing the matrix inverse and quadrature amplitude modulation (QAM) symbol vectors. In addition, the matrix inverse is required separately for specific calculations such as sum-rate computations and rapid matrix modifications [13, 21]; the matrix inverse can be directly updated (column added and column deleted) to save matrix inversion time and complexity. Hence, these methods require more complexity for such calculations as the symbol vector is divided. Chebyshev iteration (ChebI) and Newton iteration (NI) provide fast convergence characteristics, while their complexity depends on the number of iterations involved [14, 15]. However, both iterative methods require complex calculation of the initial input to ensure convergence. The third category, expansion methods, transfers the inverse of a matrix into a series of matrix-vector products, as in the Neumann series (NS) [16]. Although NS has a slow convergence rate, it can not only approximate the matrix inverse separately but also lends itself to simple hardware implementation [17, 18]. In [19], the authors utilize NI to achieve faster convergence than ordinary NS. This inspired us to replace the quadratic-order NI with the cubic-order ChebI to achieve not only more accurate inversion results but also faster convergence. NS and SOR are two recent research directions for reducing the complexity of matrix inversion in linear precoding; however, their convergence speed should be improved. This motivates us to speed up their convergence using the cubic-order ChebI.
Although a large precoding gain can be obtained by making use of a large number of antennas, interference becomes the dominant factor rather than additive noise. Hence, under these circumstances, the ZF precoder is a reasonable choice compared to other precoding methods, so our focus in this paper is on reducing the complexity of the ZF precoding technique for mMIMO systems. In this paper, four new joint methods are proposed to achieve faster convergence with reasonable complexity in matrix inversion to determine linear precoding weights for mMIMO systems, where the first iteration result of the Chebyshev iteration (ChebI) and Newton iteration (NI) approaches is employed to reconstruct both the NS and SOR methods. A high probability of convergence is achieved, which offers useful guidelines for practical mMIMO systems. The main contributions of this paper are fivefold.
– Firstly, we propose a new joint Chebyshev iteration and Neumann series (ChebI-NS) method that not only achieves faster convergence but also provides a more accurate matrix inversion approximation than previous NS methods.
– Secondly, we propose a new SOR-based approximate matrix inversion (SOR-AMI) method that directly approximates the matrix inverse by separating the QAM symbol vector from the whole iteration process. The new method, which is very useful for further calculations, achieves the same convergence rate as the SOR method with lower complexity.
– Thirdly, to further improve the convergence characteristics of SOR-AMI, we propose the joint NI and SOR-AMI method (NI-SOR-AMI), where we adopt one NI iteration to obtain an efficient searching direction for the following SOR-AMI iterations and thus achieve a fast convergence rate.
– Fourthly, another way to accelerate the convergence of SOR-AMI is to make use of the cubic-order ChebI instead of the quadratic-order NI. Hence, the joint ChebI and SOR-AMI method (ChebI-SOR-AMI) is the fourth proposed technique, which achieves a faster convergence rate.
– Finally, the above four proposed methods are compared with existing methods in order to demonstrate their faster convergence with near-ideal ZF1 performance in a downlink (DL) mMIMO system. Based on these results, we discuss the effectiveness of the proposed approaches.
The rest of this paper is organized as follows. Section 2 discusses the system model, mMIMO channel model, and preliminaries about NS expansion, SOR, proposed SOR-AMI method, NI, and Chebyshev iterations. Section 3 describes the four new proposed joint matrix inversion methods. Section 4 presents computational complexity analysis. Simulation results are introduced in Section 5. Finally, Section 6 concludes the work.
Notations: Upper-case and lower-case boldface letters denote matrices and vectors, respectively. (·)T, (·)H, (·)−1, (·)(n), and (·)† denote the transpose, conjugate transpose (Hermitian), inverse, nth iterate, and pseudo-inverse, respectively. CN(μ,σ2IK) denotes the circularly symmetric complex Gaussian distribution with mean μ and covariance matrix σ2IK, where IK is the identity matrix of size K. ||·||1 and ||·||2 denote the 1-norm and 2-norm, respectively.

2 System model and preliminaries

In this section, we describe our system model, followed by the mMIMO channel model. Related matrix inversion approaches, namely NS, SOR, NI, and ChebI, are also briefly introduced.

2.1 System model

Figure 1 shows a DL centralized mMIMO system with N antennas equipped at the base station (BS), serving K ≪ N single-antenna users [1]. If the DL transmitted signal vector after precoding is x ∈ ℂN×1, the received signal vector y ∈ ℂK×1 for the K users can be expressed as:
$$ \mathbf{y}=\sqrt{\omega}~ \mathbf{H} \mathbf{x} +\mathbf{n}, $$
(1)
where n ∈ ℂK×1 is the additive white Gaussian noise vector with zero mean and unit variance. ω is a normalization factor that determines the signal-to-noise power ratio (SNR), i.e., the SNR is given as \(SNR= \frac{\omega}{\sigma^{2}} = \omega\) since σ2=1, where σ2 denotes the additive noise variance. H ∈ ℂK×N is the DL channel matrix2. Furthermore, H=[h1,h2,...,hK]T, where hk ∈ ℂN is the channel vector between the BS and the kth user, modeled as an independent and identically distributed (i.i.d.) random vector. x is precoded using the ZF precoder and defined as:
$$ \mathbf{x}=\mathbf{P s}=\beta \mathbf{H}^{\dagger} \mathbf{s}= \beta \mathbf{H}^{H} \left(\mathbf{H}\mathbf{H}^{H}\right)^{-1} \mathbf{s}= \beta \mathbf{H}^{H} \mathbf{W}^{-1} \mathbf{s}, $$
(2)
where s ∈ ℂK×1 is the symbol vector of 64-QAM symbols from the K users for transmission [1], P ∈ ℂN×K is the ZF precoding matrix, W ∈ ℂK×K is the Gram matrix defined as HHH, and β is a normalization parameter defined as \(\sqrt {\frac {K}{tr(\mathbf {W}^{-1})}}\) [22], where tr(W−1) denotes the trace of W−1. In this paper, we assume perfect channel state information (CSI) at the BS by utilizing time-domain training pilots [23]. In time division duplex (TDD) mMIMO systems, the BS uses the user pilots to estimate the uplink channel. Hence, the DL CSI is obtained using the channel reciprocity property of TDD systems.
It is obvious from (2) that the main complexity for ZF precoding is the inversion of K×K matrix W. The Gram matrix W is Hermitian positive definite as in Eq. (3).
$$ \mathbf{u}^{H} \mathbf{W} \mathbf{u}= \mathbf{u}^{H} \left(\mathbf{H}\mathbf{H}^{H}\right)\mathbf{u}= \mathbf{u}^{H} \mathbf{H}\left(\mathbf{u}^{H}\mathbf{H}\right)^{H}, $$
(3)
where u is an arbitrary K×1 non-zero vector.
The user channel vectors (the rows of H) are asymptotically orthogonal and thus H is a full-rank matrix [1]. uHH equals the zero vector only when u is the zero vector. Hence, we have uHH(uHH)H>0 for all non-zero vectors, indicating that W is a positive definite matrix.
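As a minimal numerical sketch (not the authors' code), the exact-inversion ZF baseline of Eq. (2) can be reproduced with NumPy as below; the dimensions, the random seed, and the simple QPSK placeholder symbols are illustrative assumptions (the paper uses 64-QAM).

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 128, 16                       # BS antennas, single-antenna users (illustrative)

# i.i.d. Rayleigh downlink channel H (K x N) with CN(0, 1) entries
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)

W = H @ H.conj().T                   # Gram matrix, K x K, Hermitian positive definite
W_inv = np.linalg.inv(W)             # exact inverse: the costly O(K^3) step

s = rng.choice([-1.0, 1.0], K) + 1j * rng.choice([-1.0, 1.0], K)   # placeholder symbols
beta = np.sqrt(K / np.trace(W_inv).real)                           # power normalization
x = beta * H.conj().T @ (W_inv @ s)                                # precoded vector, Eq. (2)
```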

2.2 mMIMO channel model

This paper considers not only the uncorrelated Rayleigh channel but also spatially correlated ones. The elements of the uncorrelated channel, Hun ∈ ℂK×N, are independent and identically distributed (i.i.d.) complex Gaussian random variables (RVs) with zero mean and unit variance. For the spatially correlated MIMO channel Hco ∈ ℂK×N [24], the Kronecker channel model [25] is used, in which Hco can be modeled as:
$$ \mathbf{H}_{co}=\mathbf{R}_{rx}^{\frac{1}{2}} \mathbf{H}_{un} \mathbf{R}_{tx}^{\frac{1}{2}}, $$
(4)
where Rrx ∈ ℂK×K and Rtx ∈ ℂN×N are the correlation matrices of the receive and transmit antennas, respectively. Since we assume single-antenna users, Rrx=I [26]. Note that if Rtx also equals the identity matrix, the left-hand side of Eq. (4) reduces to the uncorrelated channel Hun. The (p, q) element of the exponentially correlated transmit correlation matrix Rtx is given as [25]:
$$ R_{tx}(p,q)= \left(\zeta~ e^{j ~\Psi}\right)^{q-p}, $$
(5)
where 0≤ζ≤1 denotes the correlation magnitude between adjacent transmit antennas and Ψ is the phase.
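For illustration only, a possible NumPy realization of the Kronecker model in Eqs. (4) and (5) is sketched below; the Hermitian square root is taken via an eigendecomposition, and the dimensions are assumed values.

```python
import numpy as np

def exp_corr_matrix(n, zeta, psi):
    """Exponential correlation matrix of Eq. (5): R(p, q) = (zeta*e^{j*psi})^(q-p) for q >= p."""
    idx = np.arange(n)
    expo = idx[None, :] - idx[:, None]                 # q - p
    r = (zeta * np.exp(1j * psi)) ** np.abs(expo)
    return np.where(expo >= 0, r, r.conj())            # Hermitian completion for q < p

rng = np.random.default_rng(1)
K, N, zeta, psi = 16, 128, 0.1, np.deg2rad(60)         # values from the simulation section
H_un = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)

R_tx = exp_corr_matrix(N, zeta, psi)
evals, evecs = np.linalg.eigh(R_tx)                    # Hermitian square root R_tx^(1/2)
R_tx_sqrt = evecs @ np.diag(np.sqrt(np.maximum(evals, 0))) @ evecs.conj().T

H_co = H_un @ R_tx_sqrt                                # Eq. (4) with R_rx = I (single-antenna users)
```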

2.3 Neumann series (NS) method

According to the Neumann series expansion [17], the inverse of the Gram matrix W ∈ ℂK×K is expanded as a sum of matrix polynomials:
$$ \mathbf{W}^{-1}=\sum_{n=0}^{+\infty}{(\mathbf{I}_{K}- \boldsymbol{\phi} \mathbf{W})^{n} \boldsymbol{\phi}}, $$
(6)
where ϕ ∈ ℂK×K is a preconditioning matrix and IK is the K×K identity matrix. Assumptions on ϕ and the proposed approach to determine it are given in Section 3.1.
The main condition of Eq. (6) to be fulfilled is
$$ {\lim}_{n\to\infty} (\mathbf{I}_{K}-\boldsymbol{\phi} \mathbf{W})^{n} =\mathbf{0}_{K}, $$
(7)
where 0K is a zero matrix of size K×K. For practical use, the inverse W−1 is approximated according to the value of L, which is the maximum number of iterations3:
$$ \mathbf{W}^{-1}\approx \hat{\mathbf{W}}^{-1}=\sum_{n=0}^{L}{(\mathbf{I}_{K}-\boldsymbol{\phi} \mathbf{W})^{n} \boldsymbol{\phi}}, $$
(8)
where L is the iteration number and \(\hat {\mathbf {W}}^{-1}\) is the approximated inverse.
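As an illustrative sketch (assumptions: NumPy, the diagonal preconditioner of Eq. (15), and L = 4 terms), the truncated Neumann series of Eq. (8) can be written as:

```python
import numpy as np

def neumann_inverse(W, L=4, phi=None):
    """Truncated Neumann series of Eq. (8); phi defaults to D^{-1} from Eq. (15)."""
    K = W.shape[0]
    if phi is None:
        phi = np.diag(1.0 / np.diag(W))        # D^{-1}
    T = np.eye(K) - phi @ W                     # (I_K - phi W)
    W_hat = phi.copy()                          # n = 0 term
    term = phi.copy()
    for _ in range(L):                          # add the n = 1 ... L terms
        term = T @ term                         # (I_K - phi W)^n phi
        W_hat = W_hat + term
    return W_hat
```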

2.4 SOR method

The SOR method aims to iteratively solve the Gram matrix inversion problem as a linear equation Wg=s, where g is an unknown solution vector of size K×1. The matrix W is decomposed into
$$ \mathbf{W}= \mathbf{D}+\mathbf{L}+\mathbf{U}, $$
(9)
where D, L, and U=LH are the diagonal, lower triangular, and upper triangular components of the Hermitian positive definite matrix W, respectively.
If Wg=s, i.e., g=W−1s, the nth estimation of W−1s is obtained by substituting Eq. (9) into the SOR method equation as follows [10]:
$$ \mathbf{g}^{({n})}=(\mathbf{D}+\alpha \mathbf{L})^{-1} \left(\alpha \mathbf{s}+((1-\alpha)\mathbf{D} - \alpha \mathbf{U}) \mathbf{g}^{({n}-1)}\right), $$
(10)
where n defines the number of iterations, g(n) is the nth iterate of g, which also equals the nth SOR estimate of W−1s, and α is the relaxation parameter. The optimal relaxation parameter used in this paper, according to [10], equals
$$ \alpha_{opt}= 0.404 ~e^{\left(-0.323 \frac{N}{K}\right)}+1.035. $$
(11)
Note that SOR method computes a product that contains the matrix inverse, i.e., g(n) is the nth estimation of W−1s.
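A rough NumPy sketch of the SOR recursion in Eq. (10), using the relaxation parameter of Eq. (11), is given below; the initial guess g(0) = D−1s and the iteration count are assumptions for illustration, not choices stated in the paper.

```python
import numpy as np

def sor_solve(W, s, N, K, n_iter=4):
    """Classical SOR of Eq. (10): returns g^(n), an estimate of W^{-1} s (not W^{-1} itself)."""
    alpha = 0.404 * np.exp(-0.323 * N / K) + 1.035     # Eq. (11)
    D = np.diag(np.diag(W))
    Lo = np.tril(W, -1)                                # strictly lower triangular part
    U = np.triu(W, 1)                                  # strictly upper triangular part (= Lo^H)
    M = np.linalg.inv(D + alpha * Lo)
    g = np.linalg.solve(D, s)                          # assumed initial guess D^{-1} s
    for _ in range(n_iter):
        g = M @ (alpha * s + ((1 - alpha) * D - alpha * U) @ g)
    return g
```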

2.5 Newton iteration (NI)

The Newton iteration method can be employed to calculate W−1 in an iterative way [14]. Assume that Z(0) is an initial estimate of W−1 satisfying
$$ ||\mathbf{I}_{K}-\mathbf{W} \mathbf{Z}^{(0)} || < 1. $$
(12)
Hence, the (n+1)th iteration estimate of W−1 using NI is obtained by substituting f(Z)=Z−1−W into the NI function \(\mathbf {Z}^{({n}+1)}= \mathbf {Z}^{({n})}- \frac {f(\mathbf {Z}^{(n)})}{f^{\prime }(\mathbf {Z}^{({n})})}\), where f′(·) is the first derivative of the matrix-argument function f(·), as defined in [14, 15]. The final NI formula that calculates the (n+1)th estimate of W−1 is expressed as [14, 15]:
$$ \mathbf{Z}^{({n}+1)}= \mathbf{Z}^{({n})} \left(2\mathbf{I}-\mathbf{W}\mathbf{Z}^{({n})}\right), $$
(13)
where n denotes the number of iterations. If n is large, Eq. (13) converges to the Gram matrix inverse, i.e., W−1.
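As a brief sketch (the initializer Z(0) = D−1 matches the choice assumed later in Section 3; the iteration count is illustrative), Eq. (13) maps to:

```python
import numpy as np

def newton_inverse(W, n_iter=4):
    """Newton iteration of Eq. (13), started from Z^(0) = D^{-1}."""
    Z = np.diag(1.0 / np.diag(W))
    I = np.eye(W.shape[0])
    for _ in range(n_iter):
        Z = Z @ (2 * I - W @ Z)                # quadratic-order update
    return Z
```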

2.6 Chebyshev iteration (ChebI)

Chebyshev iteration is a third-order convergence algorithm [15]. Similar to NI, substituting the function f(Z)=Z−1−W into the Chebyshev three-term function \(\mathbf {Z}^{({n}+1)}=\mathbf {Z}^{({n})}-\frac {f(\mathbf {Z}^{(n)})}{f^{\prime }(\mathbf {Z}^{({n})})}- \frac {f^{\prime \prime }(\mathbf {Z}^{({n})})}{2 f^{\prime }(\mathbf {Z}^{({n})})} \left (\frac {f(\mathbf {Z}^{({n})})}{f^{\prime}(\mathbf {Z}^{({n})})}\right)^{2}\) yields the matrix inverse using ChebI, where f′′(·) is the second derivative. Note that the third term of the Chebyshev update Z(n+1) helps ChebI provide more accurate results than NI. The (n+1)th Chebyshev iteration estimate of W−1 is expressed as [15]:
$$ \mathbf{Z}^{({n}+1)}= \mathbf{Z}^{({n})} (3\mathbf{I}-\mathbf{W}\mathbf{Z}^{({n})}(3\mathbf{I}-\mathbf{W} \mathbf{Z}^{({n})})). $$
(14)
If the number of iterations is sufficient, Eq. (14) converges to the matrix inverse, i.e., W−1.
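Analogously, a minimal sketch of the Chebyshev update in Eq. (14), again initialized with D−1 as an assumption consistent with Section 3:

```python
import numpy as np

def chebyshev_inverse(W, n_iter=3):
    """Chebyshev iteration of Eq. (14), started from Z^(0) = D^{-1}."""
    Z = np.diag(1.0 / np.diag(W))
    I = np.eye(W.shape[0])
    for _ in range(n_iter):
        Z = Z @ (3 * I - W @ Z @ (3 * I - W @ Z))   # cubic-order update
    return Z
```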

3 Proposed methods

In this section, we discuss four proposed methods to speed up the calculation of the large matrix inverse. We start with the first proposal (i.e., ChebI-NS) and then move to the second proposal (i.e., SOR-AMI). To achieve further improvement, we also propose improved SOR-AMI methods as the third and fourth proposals (i.e., ChebI-SOR-AMI and NI-SOR-AMI).

3.1 Joint Chebyshev iteration and Neumann series method (ChebI-NS)

The initial NS value, ϕ in Eq. (6), greatly affects the convergence, and the method of selecting ϕ plays an important role in NS acceleration. There are three assumptions for obtaining ϕ: two of them depend on special properties of the matrices, while the third focuses on obtaining the initial value from other iterations such as NI and ChebI. The most popular choice of ϕ is the inverse of the K×K diagonal matrix D whose entries are the main diagonal elements of the Gram matrix W, i.e., D−1 [17]. The matrix D−1 can be calculated as follows [18]:
$$ \mathbf{D}^{-1}= diag~\left(\frac{1}{w_{1,1}},....,\frac{1}{w_{{k},{k}}},.....,\frac{1}{w_{{K},{K}}}\right), $$
(15)
where wk,k is the kth diagonal element of Gram matrix W.
The second assumption for ϕ is \(\left (\frac {\mathbf {I}_{K}}{{N}+{K}}\right)\), which also represents a diagonal matrix [1]. This is because the largest and smallest eigenvalues of the Gram matrix W depend on N and K; as N and K grow, the eigenvalues of the Gram matrix converge to a fixed distribution [17]. The third assumption is to utilize the first iteration output of NI, Z(1) in Eq. (13) with n=0, to initialize NS [19]. The advantages of ChebI over NI, namely faster convergence and a more accurate approximation, motivated us to initialize NS with the output of the first ChebI iteration instead of NI, which not only provides an accurate inversion approximation but also speeds up NS convergence.
In this paper, ChebI is applied first to provide a suitable ϕ that speeds up the convergence of NS. The main steps of the joint ChebI-NS approach to estimate W−1 are:
Step 1 Obtain the inverse of the diagonal matrix of the Gram matrix W, i.e., D−1 as in Eq. (15).
Step 2 Apply one Chebyshev iteration (i.e., n=0 in Eq. (14)) with initial input Z(0)=D−1 as follows:
$$ \begin{aligned} \mathbf{Z}^{(1)}&= \mathbf{Z}^{(0)} \left(3\mathbf{I}-\mathbf{W} \mathbf{Z}^{(0)}\left(3I-\mathbf{W} \mathbf{Z}^{(0)}\right)\right)\\&=\mathbf{D}^{-1} \left(3\mathbf{I}-\mathbf{W} \mathbf{D}^{-1}\left(3\mathbf{I}-\mathbf{W} \mathbf{D}^{-1}\right)\right). \end{aligned} $$
(16)
Step 3 Apply the obtained first ChebI output, Z(1), as the initial value of the Neumann series as follows:
$$ \mathbf{W}^{-1}\approx \hat{\mathbf{W}}^{-1}= \sum_{n=0}^{L}{\left(\mathbf{I}_{K}-\mathbf{Z}^{(1)}~\mathbf{W}\right)^{n} \mathbf{Z}^{(1)}}. $$
(17)
An approximated solution, \(\hat {\mathbf {W}}^{-1}\), is obtained for finite number of iterations.
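The three steps above can be sketched compactly in NumPy as follows (a non-authoritative illustration; L = 4 follows footnote 3):

```python
import numpy as np

def chebi_ns_inverse(W, L=4):
    """Proposed ChebI-NS: one Chebyshev step from D^{-1} (Eq. (16)) seeds the Neumann series (Eq. (17))."""
    K = W.shape[0]
    I = np.eye(K)
    D_inv = np.diag(1.0 / np.diag(W))                          # Step 1, Eq. (15)
    Z1 = D_inv @ (3 * I - W @ D_inv @ (3 * I - W @ D_inv))     # Step 2, Eq. (16)
    T = I - Z1 @ W
    W_hat = Z1.copy()                                          # Step 3, Eq. (17), n = 0 term
    term = Z1.copy()
    for _ in range(L):
        term = T @ term
        W_hat = W_hat + term
    return W_hat
```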
Lemma 1
For DL mMIMO systems, the Neumann series with initial value from the Chebyshev iteration, ϕ=Z(1), has a high probability of convergence when [16]
$$ \eta \approx \frac{N}{K} > 5.83. $$
(18)
Proof
See Appendix A. □
Equation (18) has practical applications in mMIMO systems as it relates the suitable number of BS antennas to the number of single-antenna users. For example, η=8 and η=16 produce two typical downlink mMIMO configurations, N×K=256×32 and 256×16 [12]. According to [16, 19], these values ensure a high convergence probability of 0.999 due to the large values of η.

3.2 SOR-based approximate matrix inversion method (SOR-AMI)

Our main idea is provided in the following lemma.
Lemma 2
W−1 can be approximated by R(n) as n→∞ using the iterative SOR method as follows:
$$ \mathbf{R}^{({n})}=(\mathbf{D}+\alpha \mathbf{L})^{-1} \left(\alpha \mathbf{I}_{K}+((1-\alpha)\mathbf{D} - \alpha \mathbf{U}) \mathbf{R}^{({n}-1)}\right), $$
(19)
where R(0) is the initial input, chosen to be the inverse of the diagonal component, i.e., D−1, and R(n) is the nth direct estimate of W−1. An approximated solution, \(\hat {\mathbf {W}}^{-1}\), is obtained for a finite number of iterations.
Proof
See Appendix B. □
The main steps of SOR-AMI to directly estimate W−1 are as follows:
Step 1 Calculate the initial input R(0)=D−1 from Eq. (15).
Step 2 Apply the obtained R(0) to the SOR-AMI recursion in Eq. (19).
Since W is Hermitian positive definite, the SOR method is convergent [10]. Hence, just as W−1s is approximated by g(n) as n→∞, W−1 can be approximated by R(n) as n→∞. The SOR-AMI method based on Eq. (19) can thus be utilized to directly calculate W−1. According to Eq. (26), it has the same convergence as the iterative SOR method.
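A minimal sketch of Eq. (19) (illustrative NumPy, with R(0) = D−1 and an assumed iteration count):

```python
import numpy as np

def sor_ami_inverse(W, N, K, n_iter=4, R0=None):
    """Proposed SOR-AMI of Eq. (19): the symbol vector is replaced by I_K, so R^(n) estimates W^{-1} directly."""
    alpha = 0.404 * np.exp(-0.323 * N / K) + 1.035             # Eq. (11)
    I = np.eye(K)
    D = np.diag(np.diag(W))
    Lo, U = np.tril(W, -1), np.triu(W, 1)
    M = np.linalg.inv(D + alpha * Lo)
    R = np.diag(1.0 / np.diag(W)) if R0 is None else R0        # R^(0) = D^{-1} by default
    for _ in range(n_iter):
        R = M @ (alpha * I + ((1 - alpha) * D - alpha * U) @ R)
    return R
```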

3.3 Improved SOR-AMI methods

The convergence of SOR-AMI is accelerated by exploiting the fast convergence of NI and ChebI, which is most pronounced at the beginning of the iteration, to initialize the SOR-AMI method. Hence, ChebI-SOR-AMI and NI-SOR-AMI are discussed below, respectively.

3.3.1 Joint Chebyshev iteration and SOR-AMI method (ChebI-SOR-AMI)

The main steps of the joint algorithm to directly estimate W−1 using ChebI-SOR-AMI are as follows:
Step 1 Apply one Chebyshev iteration with initial input Z(0)=D−1 as in Eq. (16).
Step 2 Apply the obtained first ChebI output, Z(1), to the SOR-AMI method as follows:
$$ \mathbf{Z}^{({n})}= (\mathbf{D}+\alpha \mathbf{L})^{-1} \left(\alpha \mathbf{I}_{K}+((1-\alpha)\mathbf{D} - \alpha \mathbf{U}) \mathbf{Z}^{({n}-1)}\right), \quad {n}\geq 2. $$
(20)
Since W is Hermitian positive definite, the ChebI-SOR-AMI method is convergent because SOR-AMI has the same convergence as the traditional SOR method. Equation (20) calculates W−1 after initializing the SOR-AMI method with one ChebI iteration, i.e., Z(n)→W−1. As the number of iterations approaches infinity (n→∞), Eq. (20) converges to the exact matrix inverse W−1. An approximated solution, \(\hat {\mathbf {W}}^{-1}\), is obtained with a finite number of iterations.

3.3.2 Joint Newton iteration and SOR-AMI (NI-SOR-AMI)

Similar to the ChebI-SOR-AMI method, NI-SOR-AMI depends on applying one NI as the initial input to SOR-AMI. The main steps to estimate W−1 directly using NI-SOR-AMI are as follows:
Step 1 Apply one Newton iteration with initial input Z(0)=D−1 as follows:
$$ \mathbf{Z}^{(1)}= \mathbf{Z}^{(0)} \left(2\mathbf{I}-\mathbf{W}\mathbf{Z}^{(0)}\right)= \mathbf{D}^{-1} \left(2\mathbf{I}-\mathbf{W}\mathbf{D}^{-1}\right) $$
(21)
Step 2 Apply the first NI output, Z(1), obtained from Step 1 to the SOR-AMI method as in Eq. (20).
Similar to ChebI-SOR-AMI, NI-SOR-AMI is convergent, and W−1 can be approximated by Z(n) as n→∞ resulting from Step 2. As before, an approximated solution, \(\hat {\mathbf {W}}^{-1}\), is obtained for a finite number of iterations. The main advantage of this method is its reduced complexity compared with the ChebI-SOR-AMI method. The next section discusses the complexity analysis of the proposed methods.
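Both improved variants reduce to changing the initializer of the SOR-AMI recursion; the following illustration (assuming the sor_ami_inverse helper from the Section 3.2 sketch) builds the starting matrix from one ChebI step (Eq. (16)) or one NI step (Eq. (21)):

```python
import numpy as np

def chebi_sor_ami_inverse(W, N, K, n_iter=3):
    """ChebI-SOR-AMI: one Chebyshev step, then the SOR-AMI recursion of Eq. (20)."""
    I = np.eye(K)
    D_inv = np.diag(1.0 / np.diag(W))
    Z1 = D_inv @ (3 * I - W @ D_inv @ (3 * I - W @ D_inv))     # Eq. (16)
    return sor_ami_inverse(W, N, K, n_iter=n_iter, R0=Z1)

def ni_sor_ami_inverse(W, N, K, n_iter=3):
    """NI-SOR-AMI: one Newton step, then the SOR-AMI recursion of Eq. (20)."""
    I = np.eye(K)
    D_inv = np.diag(1.0 / np.diag(W))
    Z1 = D_inv @ (2 * I - W @ D_inv)                           # Eq. (21)
    return sor_ami_inverse(W, N, K, n_iter=n_iter, R0=Z1)
```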

4 Complexity analysis

In this paper, we evaluate the computational complexity of the proposed methods in terms of the required number of complex multiplications, which is the most common metric and the dominant cost. The channel coherence interval Tc, defined as the product of the coherence time and coherence bandwidth, is taken into account for a fair complexity comparison. There are two types of approaches that can solve (2). The first type, which includes our four proposed methods, directly computes W−1 once per channel coherence interval. Thus, W−1 can be calculated regardless of Tc, while the remaining per-symbol processing increases the total complexity as Tc increases. The other type calculates the precoding weights recursively as a product of the form W−1s, as in the SSOR method; its overall complexity therefore increases with Tc (i.e., the number of symbols per Tc).
The complexity of ZF precoding within Tc is O(K3+TcNK). The NS complexity for different initial ϕ values and more than two iterations (i.e., n>2) is O(K3), comparable to the exact matrix inversion. NS implements matrix multiplications and matrix additions, which are favorable in hardware as no divisions are required [1, 18]. From Eq. (14), one ChebI requires two matrix additions and three matrix multiplications, whereas one NI requires one matrix addition and one matrix multiplication fewer, according to Eq. (13). When ϕ=D−1 or \(\boldsymbol {\phi }=\frac {\mathbf {I}_{K}}{{N}+{K}}\), the complexity of Eq. (8) is O(K2) for the first iteration (i.e., n=1) and O(K3) for further iterations (i.e., n≥2). Note that for ChebI-NS, the complexity increases at i=2 due to the multiplications added by applying one ChebI.
For the SOR-AMI method, the complexity is O(K2+TcNK) as only two matrix multiplications are required. This means that SOR-AMI converges faster than NS and also has lower complexity, especially at large iteration numbers. The computational complexity of NI-SOR-AMI is slightly lower than that of ChebI-SOR-AMI, by one matrix multiplication and one matrix addition.
The overall complexities of the proposed methods, in addition to NS, NI-NS [19], and SSOR [11], are shown in Table 1 considering the channel coherence interval Tc. Small Tc values do not greatly affect the complexity of our proposed methods, since the Tc-dependent term TcNK is only part of their total cost, whereas a small Tc greatly reduces the SSOR complexity, which scales entirely with Tc. For large Tc values, if the product TcNK is larger than K3, the complexity increases with Tc; otherwise, Tc has a small effect compared to K. The SOR-AMI complexity is lower than that of SSOR while, at the same time, it computes W−1 directly. The ChebI-NS complexity O(K3) is close to that of traditional NS methods. Note that the SSOR method estimates the matrix inverse using two SOR iterations in the forward and reverse order. Also, SSOR calculates the matrix inverse indirectly, i.e., as W−1s, while the other methods compute it directly, which is highly desirable for fast matrix inverse updates [21]. The proposed SOR-AMI method has the lowest complexity compared to the other methods. Initializing SOR-AMI with either NI or ChebI slightly increases its complexity but provides faster convergence, achieving close inversion results at low iteration numbers.
Table 1
Complexity analysis (N: number of BS antennas, K: number of users, i: iteration number, Tc: channel coherence interval)

Method         | i = 2              | i = 3              | i = 4
NS             | 3K² − K + TcNK     | K³ + K² + TcNK     | 2K³ + TcNK
ChebI-NS       | 3K³ + 2K² + TcNK   | 4K³ + K² + TcNK    | 5K³ + TcNK
NI-NS [19]     | 2K³ + 2K² + TcNK   | 3K³ + K² + TcNK    | 4K³ + TcNK
SOR-AMI        | 2K² + TcNK         | 3K² + TcNK         | 4K² + TcNK
ChebI-SOR-AMI  | 3K³ + 2K² + TcNK   | 3K³ + 3K² + TcNK   | 3K³ + 4K² + TcNK
NI-SOR-AMI     | 2K³ + K² + TcNK    | 2K³ + 2K² + TcNK   | 2K³ + 4K² + TcNK
SSOR [11]      | Tc(6K² + 3K + NK)  | Tc(8K² + 3K + NK)  | Tc(10K² + 3K + NK)
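As a worked illustration of Table 1 (not an analysis from the paper), the expressions can be evaluated for one assumed configuration to compare the multiplication counts at i = 3:

```python
# Illustrative evaluation of the Table 1 entries at iteration i = 3.
# N and K follow one simulated configuration; Tc = 100 symbols is an assumed value.
N, K, Tc = 128, 16, 100

counts = {
    "NS":            K**3 + K**2 + Tc * N * K,
    "ChebI-NS":      4 * K**3 + K**2 + Tc * N * K,
    "NI-NS":         3 * K**3 + K**2 + Tc * N * K,
    "SOR-AMI":       3 * K**2 + Tc * N * K,
    "ChebI-SOR-AMI": 3 * K**3 + 3 * K**2 + Tc * N * K,
    "NI-SOR-AMI":    2 * K**3 + 2 * K**2 + Tc * N * K,
    "SSOR":          Tc * (8 * K**2 + 3 * K + N * K),
}
for method, c in sorted(counts.items(), key=lambda kv: kv[1]):
    print(f"{method:>14s}: {c:,} complex multiplications")
```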

5 Results and discussion

To evaluate the effect of the proposed methods, we conducted computer simulations. The system model is the same as in Section 2.1. In the three proposed methods that use them (i.e., ChebI-NS, ChebI-SOR-AMI, and NI-SOR-AMI), the Newton and Chebyshev iterations are initialized with the inverse of the diagonal component of W, i.e., D−1. The Frobenius norm error and the bit error rate (BER) are used as performance metrics. An uncoded system is assumed in the simulation, and the average BER over all users is calculated. The inversion error is defined as follows.
$$ F_{error}= \| \mathbf{W}^{-1}- \hat{\mathbf{W}}^{-1} \|_{F}, $$
(22)
where W−1 and \(\hat {\mathbf {W}}^{-1}\) are the ideal inverse of the Gram matrix and the approximated solution obtained by the abovementioned proposed methods, respectively. ZF precoding with exact matrix inversion of W is included in our results as the benchmark. Two configurations, N×K=256×32 and N×K=128×16, are considered. The utilized modulation scheme is 64-QAM. The parameters of the correlated channel model are set to ζ=0.1 and a phase shift of Ψ=60°.
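For readers who wish to reproduce a comparison of this kind, the sketch below runs a small Monte Carlo estimate of Eq. (22); it assumes the chebi_ns_inverse and sor_ami_inverse helpers sketched in Section 3, and the trial count and dimensions are illustrative rather than the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, n_trials = 128, 16, 100          # illustrative; the paper averages over 10,000 trials

err = {"ChebI-NS": 0.0, "SOR-AMI": 0.0}
for _ in range(n_trials):
    H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
    W = H @ H.conj().T
    W_inv = np.linalg.inv(W)
    err["ChebI-NS"] += np.linalg.norm(W_inv - chebi_ns_inverse(W, L=4), "fro")
    err["SOR-AMI"] += np.linalg.norm(W_inv - sor_ami_inverse(W, N, K, n_iter=4), "fro")

for method, e in err.items():
    print(method, e / n_trials)
```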
Figure 2 shows the Monte Carlo simulation results for the Frobenius norm error between the exact Gram matrix inverse and its approximation against the number of BS antennas, N, for the NI-NS, ChebI-NS, SOR-AMI, NI-SOR-AMI, and ChebI-SOR-AMI methods under uncorrelated channel conditions, after 10,000 MC trials, for the second, third, and fourth iterations, respectively. The error is plotted against N to measure the inversion error of each proposed scheme. Our error calculations neglect the modulation effect as our main focus is on the error resulting from the precoding matrix inversion approximation. At the second iteration, the error of the SOR-AMI method is the largest, followed by the NI-NS method, which has the largest error at the following two iterations. According to Lemma 1, when N is known, the number of users K can easily be determined such that there is a high convergence probability for the inversion. At small N values, the error decreases because of the small matrix inversion dimensions. The figure illustrates the merits of initializing NS and SOR-AMI with ChebI. The three-term Chebyshev iteration is more accurate than NI in spite of its computational complexity being higher by one matrix addition and one matrix multiplication. Also, because SOR-AMI converges faster than NS, the errors of the three SOR-AMI-based methods are lower than those of the ChebI-NS and NI-NS methods.
For ease of illustration and discussion, the next three figures divide the results into three parts. Figure 3 compares the proposed ChebI-NS with other NS-based methods. Figure 4 does the same for SOR-AMI, ChebI-SOR-AMI, and NI-SOR-AMI against SOR-based methods. Finally, Fig. 5 compares all four proposed methods with each other. Figure 3a shows the BER against SNR of NI-NS, diagonal-based NS, the new ChebI-NS, and NS with initial \(\frac{I_{K}}{{N+K}}\) under uncorrelated channel conditions with N×K=128×16 at the second iteration. Figure 3b and c are for the third and fourth iterations, respectively. From the three sub-figures, the new ChebI-NS algorithm has superior performance close to ZF, followed by the NI-NS method, at the third and fourth iterations, and has performance close to NI-NS at the second iteration. Figure 3d, e, and f show the same analysis performed under correlated channel conditions with ζ=0.1 and a 60° phase shift for the second, third, and fourth iterations, respectively. It is worth noting that at the second iteration, i.e., Fig. 3a, NI-NS [19] converges slightly faster than ChebI-NS, but the reverse occurs under correlated channel conditions, i.e., Fig. 3d. Their performance is still not close to the optimal ZF; hence, their matrix inversion results lack accuracy due to the low number of iterations used. At the third and fourth iterations, i.e., Fig. 3b, c, e, and f, ChebI-NS is the method nearest to optimal performance. Because NS has a slow convergence rate, it requires more than two iterations for a more accurate matrix inversion. Therefore, under these conditions, the new ChebI-NS method gains superior performance over the other NS approaches, confirming its fastest convergence.
Figure 4 shows the second and fourth iterations of the BER against SNR of the newly proposed SOR-AMI, NI-SOR-AMI, ChebI-SOR-AMI, and SSOR with N×K=256×32 under uncorrelated channels, Fig. 4a and b, and correlated channels, Fig. 4c and d, respectively. From the figure, the new ChebI-SOR-AMI algorithm has superior performance close to ZF, followed by the NI-SOR-AMI method. The results at the second iteration, i.e., Fig. 4a and c, indicate that both ChebI-SOR-AMI and NI-SOR-AMI converge quickly toward the optimal performance; at the fourth iteration, all methods have performance close to ZF.
Figure 5 presents a performance comparison among the newly proposed methods under uncorrelated channels, Fig. 5a and b, and correlated channels, Fig. 5c and d, for the second and fourth iterations, with N=128 and K=16, respectively. In the correlated channel results, the BER error floor of the proposed methods shows a good convergence trend similar to that in uncorrelated channels. ChebI-SOR-AMI has much better performance than the other methods. The SOR-AMI convergence speed is faster than that of NS; hence, SOR-AMI-based methods provide accurate results at lower iteration numbers. Also, at low iteration numbers, we recommend utilizing NI-SOR-AMI rather than ChebI-SOR-AMI as it gives close results with reduced complexity. ChebI-NS has lower performance due to the slower convergence of NS compared to the SOR method. However, ChebI-NS is preferable for designers due to the ease of NS hardware implementation. Also, our proposed methods are more robust to channel correlation than the existing NS and SOR methods.

6 Conclusion

In this paper, we have investigated the slow convergence of both the NS and SOR methods in linear precoding. For this purpose, we have proposed four joint methods to calculate ZF linear precoding weights for mMIMO systems, i.e., ChebI-NS, SOR-AMI, NI-SOR-AMI, and ChebI-SOR-AMI. ChebI-NS has been proposed to speed up the convergence of NS and also to give a more accurate approximation. Unlike the traditional SOR method, the SOR-AMI method directly calculates the matrix inverse without multiplying by the symbol vector. The joint ChebI-SOR-AMI and NI-SOR-AMI methods are based on applying a fast-converging Chebyshev/Newton iteration as the initial step of the SOR-AMI method. Simulation results illustrate that the proposed methods give not only accurate results but also fast convergence under both uncorrelated and correlated channel conditions. The ChebI-NS method is the fastest among NS-based methods. NI-SOR-AMI is preferable to ChebI-SOR-AMI as it achieves close performance to ChebI-SOR-AMI with lower complexity. Although NS converges more slowly than SOR-AMI, it is preferable for hardware implementation; hence, ChebI-NS is important to accelerate the convergence of such a prominent method. Further investigations of the proposed methods with other linear precoders such as MMSE and RZF are under consideration as future work.

Appendix A: Proof of Lemma 1

First, we prove the convergence of ChebI-NS, i.e., of \(\sum _{{n}=0}^{\infty }{\left (I_{K}- \mathbf {Z}^{(1)} \mathbf {W}\right)^{n} \mathbf {Z}^{(1)}}\); the lemma then follows from this convergence result.
According to Eq. (6), the condition of convergence of ChebI-NS is \({\lim }_{n\to \infty } (\mathbf {I}_{K}-\mathbf {Z}^{(1)} \mathbf {W})^{n} =\mathbf {0}_{K} \).
Let the matrix A=IK−Z(1)W.
The series converges if and only if ρ(IK−Z(1)W)=ρ(A)<1, i.e., |λi(A)|<1 for all 1≤ i ≤ K, where λi(A) denotes the ith eigenvalue of A and ρ(A) is the spectral radius of A, i.e., λmax(A), the largest absolute value of the eigenvalues of A.
Substituting the value of Z(1) from Eq. (16) into the matrix A yields
$$ {\begin{aligned} \mathbf{A}&=\mathbf{I}_{K}- \mathbf{Z}^{(1)} \mathbf{W}= \mathbf{I}_{K}-\mathbf{D}^{-1} (3\mathbf{I}-\mathbf{W} \mathbf{D}^{-1}(3\mathbf{I}-\mathbf{W} \mathbf{D}^{-1}))\mathbf{W}\\&= (\mathbf{I}_{K}- \mathbf{D}^{-1} \mathbf{W})^{3}. \end{aligned}} $$
(23)
Let B=IK−D−1W; then \(\sum _{n=0}^{\infty }{(\mathbf {I}_{K}- \mathbf {D}^{-1} \mathbf {W})^{n} \mathbf {D}^{-1}}\) converges if and only if |λi(B)|<1, which holds with high probability as in Lemma 1 [16].
Since A=B3 and λi(A)=λi(B)3, the series \(\sum _{{n}=0}^{\infty }{(\mathbf {I}_{K}- \mathbf {Z}^{(1)} \mathbf {W})^{n} \mathbf {Z}^{(1)}}\) converges if and only if \(\sum _{n=0}^{\infty }{(\mathbf {I}_{K}- \mathbf {D}^{-1} \mathbf {W})^{n} \mathbf {D}^{-1}}\) converges.
This proves the convergence of ChebI-NS.
From equations (11)-(17) in [16], a high-probability convergence condition for \(\sum _{{n}=0}^{\infty }{(\mathbf {I}_{K}- \mathbf {D}^{-1} \mathbf {W})^{n} \mathbf {D}^{-1}}\) is \(\eta > \frac {1}{(\sqrt {2}-1)^{2}}\), i.e., η>5.83. Thus, since the convergence of ChebI-NS is equivalent to that of this series, the same high probability of convergence is achieved for ChebI-NS when η>5.83 [16]. Hence, the proof is complete.

Appendix B: Proof of Lemma 2

Substituting R(0)=D−1 into g(0)=D−1s yields the following:
$$ \mathbf{g}^{(0)}= \mathbf{R}^{(0)} \mathbf{s}. $$
(24)
Hence, assume that g(k)=R(k)s holds for the kth iteration. Substituting this result into the SOR method, i.e., Eq. (10), g(k+1) is obtained as
$$ {\begin{aligned} \mathbf{g}^{(k+1)}&= (\mathbf{D}+\alpha \mathbf{L})^{-1} \left(\alpha ~\mathbf{s} +((1-\alpha)~ \mathbf{D} - \alpha \mathbf{U}) \mathbf{g}^{(k)}\right)\\ &=(\mathbf{D}+\alpha \mathbf{L})^{-1} \left(\alpha \mathbf{s} +((1-\alpha) \mathbf{D} - \alpha \mathbf{U}) \mathbf{R}^{(k)} \mathbf{s}\right)= \mathbf{R}^{(k+1)} \mathbf{s}. \end{aligned}} $$
(25)
Hence, based on the mathematical induction, we can obtain the following:
$$ \mathbf{g}^{(n)}=\mathbf{R}^{(n)} \mathbf{s},~ (n\geq 0). $$
(26)
Equation (26) ends the proof.

Acknowledgements

Our sincere thanks to MOHE and Center for Japan-Egypt Cooperation in Science and Technology, Kyushu University, Japan, for their guidance, support, and encouragement.

Competing interests

The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Footnotes
1
In ideal ZF, the exact value of the Gram matrix inverse is used. Thus, the result corresponds to the case where the Gram matrix inverse is obtained with sufficient accuracy.
 
2
Two channel models are considered in the next subsection: (i) the uncorrelated Rayleigh channel, denoted Hun ∈ ℂK×N, with Gaussian distribution of zero mean and unit variance, and (ii) the spatially correlated channel, denoted Hco ∈ ℂK×N, as in [23].
 
3
Mainly, L is the number of expanded terms. However, in this paper, L equals the number of iterations (i) so as to compare NS convergence with other methods. The maximum value of L is 4, i.e., the 4th iteration, as it provides a good trade-off between complexity and performance [20].
 
Literature
3. E. Björnson, J. Hoydis, L. Sanguinetti, Massive MIMO networks: spectral, energy, and hardware efficiency. Found. Trends Signal Process. (2018). doi:10.1561/2000000093
18. A. Thanos, Algorithms and Hardware Architectures for Matrix Inversion in Massive MIMO Uplink Data Detection, M.Sc. Thesis (University of Patras, 2017)
19. L. Shao, Y. Zu, Joint Newton iteration and Neumann series method of convergence accelerating matrix inversion approximation in linear precoding for massive MIMO systems. Math. Probl. Eng. (2016). doi:10.1155/2016/1745808