Skip to main content

DFT codebook-based hybrid precoding for multiuser mmWave massive MIMO systems

Abstract

In millimeter wave (mmWave) massive MIMO (multiple-input multiple-output) systems, it is difficult to apply conventional digital precoding techniques due to hardware constraints. Fortunately, the hybrid precoding can be utilized to reduce power consumption and high costs. In this paper, a codebook-based hybrid precoding scheme for downlink multiuser mmWave massive MIMO systems is proposed. Our main idea is that the analog and digital precoders are designed separately to maximize the achievable sum rate. In the analog domain, we take the potential multiuser conflict and angular domain of channel matrix into consideration and propose an efficient conflicting-aware (CA) beam-column selection method to obtain a discrete Fourier transform (DFT) codebook-based analog precoder. According to the CA method, all users are classified into two groups, i.e., conflicting users (CUs) and non-conflicting users (NCUs). Different criteria of beam-column selection are applied for the two user groups. Then, zero-forcing (ZF) digital precoder is directly used in the digital domain. Simulation results illustrate that our proposed algorithm which has low complexity achieves satisfactory SR performance, which approaches that of the full digital precoding (the upper bound) and outperforms other existing hybrid algorithms.

1 Introduction

Millimeter-wave (mmWave) massive MIMO has been considered as a promising candidate for the fifth generation of mobile telecommunications (5G) [1], since it will enable gigabits per second transmitting data rate owing to the large bandwidth available in mmWave spectrum. In order to implement high-quality communication in mmWave massive MIMO systems, the deployment of a large number of antenna arrays are required at base stations (BSs). Each BS needs to serve multiple mobile stations (MSs) simultaneously for efficient system performance. The precoding technology that is applied to generate transmitting signals at BSs is usually processed in the baseband for conventional lower frequency systems. Nevertheless, for conventional digital precoding, energy-intensive radio frequency (RF) chains are required for each antenna. The energy consumption of each RF chain (about 250 mW per RF chain [2]) is a large portion of the total energy consumption. If traditional full digital precoding is used in mmWave massive MIMO systems, the relevant large number of RF chains will produce high energy consumption. The high hardware cost of numerous RF chains makes traditional fully digital precoders unfit for mmWave massive MIMO systems unfortunately [3]. So precoding in the above systems is a trend to be divided into analog domain and digital domain due to current research.

The authors in [3] investigated hybrid precoding which contains analog and digital precoding for single-user scenario. It was indicated that hybrid precoding technology can reduce the number of required RF chains and performs almost the same as fully digital precoding [3] was designed to obtain spatial multiplexing gain, and it can achieve good performance only when numerous antennas are deployed at both transmitter and receiver. The beam steering scheme for the multiuser scenario was proposed in [4], which firstly chose the best beam steering vectors as the analog precoding matrix for each user and then applied the zero-forcing (ZF) precoding in the digital domain. This algorithm performed well in single-path scenarios, but would deteriorate a lot in multi-path scenarios because beam steering could not make full use of multipath channel gains.

Another related set of works has considered the codebook-based hybrid precoding, where a precoder is selected from a set of a priori fixed codebook. The complexity of the codebook-based design is generally much lower than the non-codebook-based design. The codebooks in the IEEE 802.15.3c standard [5] of indoor systems and IEEE 802.11ad [6] standard of wireless local area networks (WLAN) were designed. A Q-bit resolution codebook using the optimal precoding weight vector was proposed in [7] to achieve a uniform maximum gain in all directions. However, the column vectors in codebooks of [5,6,7] are not orthogonal, which may lead to performance degradation for multiuser scenarios due to the multiuser interference. For mmWave systems, an efficient discrete Fourier transform (DFT) codebook-based precoding training scheme was proposed in [8]. The analog precoding matrix is produced by column vector selection from the DFT codebook whose columns are orthogonal [9]. proposed hybrid precoding design, where mutually unbiased bases combined (MUB) were applied for digital precoder and the DFT codebook was invoked for analog precoder. A single-user scenario was considered in [9], which had a good performance and low complexity. For multiuser scenario, the DFT codebook-based beam selection scheme was proposed in [10] to formulate the hybrid precoding design into an optimization problem, referred to exhausted searching which traverses all the possible beam combinations. The joint optimization of the DFT codebook-based hybrid precoding was investigated in [11] for multiuser mmWave systems under two performance measures: sum rate (SR) and energy efficiency (EE). The joint codeword selection and precoder design (JWSPD) scheme in [11] are flexible, which can achieve a satisfactory performance for both SR and EE.

In this paper, we propose a new codebook-based hybrid precoding to maximize the achievable SR for multiuser mmWave massive MIMO systems. Especially, the analog and digital precoders are designed separately. In the analog domain, the constant modulus limitation of analog precoder should be taken into account. So the analog precoder is derived from the DFT codebook whose column vectors are orthogonal and each entry of which is constant modulus. We propose an efficient conflicting-aware (CA) beam selection scheme for column selection from the DFT codebook of analog precoder, with the consideration of the potential multiuser conflict. Note that the same beam column in DFT codebook may be selected for different users, which will cause severe multiuser interference. Moreover, some RF chains are likely to be wasted because they do not contribute to SR. Hence, all users are firstly classified into two groups, i.e., non-conflicting users (NCUs) and the conflicting users (CUs). If the dominant beam index of one user differs from any other user’s, the user is defined as NCU. On the contrary, the user whose dominant beam index is the same as any other users is defined as CU. Then, different criteria of beam-column selection are applied for the two user groups. For NCUs, the beams with the largest signal to leakage plus noise ratio (SLNR) are picked out, while for CUs, the best beams are chosen by an incremental algorithm with low complexity in order to maximize the achievable SR. The proposed CA searching scheme performs similarly to the exhausted searching algorithm in [10]. In the digital domain, zero-forcing (ZF) technique [12] is adopted to obtain the digital precoder. An arbitrary number of RF chains of the selection is considered in this paper without the restriction that the number of RF chains equals the number of users. Finally, the numerical results validate that our proposed codebook-based hybrid precoding scheme achieves satisfactory SR performance, which approaches to that of the full digital precoding (the upper bound), and gets a better performance than other existing hybrid precoding algorithms, such as [4, 5, 7, 11]. The contributions of the paper are summarized as follows.

  • The hybrid precoding design is investigated in multiuser mmWave massive MIMO systems under SR performance optimization with non-convex constraint. Our scheme is based on the idea that analog and digital precoders are separately designed.

  • Considering the constant modulus limitation of analog precoder, we propose a DFT codebook-based analog precoder, whose orthogonal beam columns (i.e., analog precoding vectors) are specified by DFT codeword.

  • We propose a novel CA beam-column selection method for designing analog precoder, which can efficiently classify users into two groups, i.e., CUs and NCUs. Different criteria of beam-column selection are applied for the two user groups.

  • Our proposed codebook-based hybrid precoding scheme has low computational complexity. The channel matrix is mapped from the spatial domain to angular domain. Since the channel matrix in the spatial domain is sparse, the mapping can decrease the complexity while selecting beams. Then, due to the Sherman-Morrison-Woodbury identity [13], a recursive way can be used on the matrix inversion which can decrease the complexity.

The remaining of this paper is organized as follows. Section 2 introduces the system model and formulates the SR maximization problem. The proposed DFT codebook-based hybrid precoding design with CA beam selection method and complexity analysis are shown in Section 3. In Section 4, simulations are provided to evaluate the performance of the proposed scheme. Conclusions are presented in Section 5.

1.1 Notation

In this paper, boldface letters denote vectors and matrices; E()denotes the expectation; ()T, ()H, and ()−1 denote the transpose, conjugate transpose, inversion, respectively; |·| denotes the determinant of a matrix; ||·||F denotes the Frobenius norm of a matrix; diag(p) denotes the diagonal matrix. £M × M represents the spaces of M × M matrices with complex entries; \( \mathcal{CN}\left(m,R\right) \) denotes the complex Gaussian distribution with mean m and covariance R; IU is the U × U identity matrix. \( \mathcal{A}\backslash \mathcal{B} \) denotes the set where the elements in set \( \mathcal{B} \) are eliminated from set \( \mathcal{A} \). Finally, [] represents to round a decimal to its nearest lower integer.

2 System model and problem statement

2.1 System and signal model

In this paper, a fully connected downlink mmWave system is considered, which is shown in Fig. 1. Every RF chain is connected to all antennas of the BS. Now, a general scenario with one BS serving U users simultaneously is considered. The BS is equipped with NRF RF chains and NBS antennas. Each MS is equipped with NMS antennas. The BS communicates with each MS only through one stream. So the total number of transmitting streams is NS = U. In addition, the number of MS cannot exceed the number of BS RF chains [7], i.e., NS ≤ NRF. The spatial multiplexing gain of this hybrid system for multiuser is limited by NBS ≥ NRF.

Fig. 1
figure 1

A multiuser mmWave massive MIMO system model

On the downlink multiuser mmWave MIMO system as shown in Fig. 1, the BS applies an NRF × NS digital precoding matrix \( {\mathbf{F}}_{\mathbf{BB}}=\left[{\mathbf{f}}_{\mathbf{BB}}^1,{\mathbf{f}}_{\mathbf{BB}}^2,\dots, {\mathbf{f}}_{\mathbf{BB}}^U\right] \) where \( {\mathbf{f}}_{\mathbf{BB}}^u\left(u\in \left[1,2,...,U\right]\right) \) is an NRF × 1 vector, followed by an NBS × NRF RF precoding matrix. The transmitting signal is therefore

$$ \mathbf{x}={\mathbf{F}}_{\mathbf{RF}}{\mathbf{F}}_{\mathbf{BB}}\mathbf{s}, $$
(1)

where s = [s1, s2, …, sU]T is the message to be transmitted which is an NS × 1(U × 1) vector and normalized as E[ssH] = (1/NS)IU. Since FRF is the analog precoding matrix implemented as analog phase shifters, and the entries of FRF are constant modulus, which should be normalized to satisfy \( \mid {F}_{RF}\left[i,j\right]\mid =1/\sqrt{N_{\mathrm{BS}},}\kern0.5em \forall i=1,2\dots, {N}_{BS},j=1,2,\dots, {N}_{RF}, \) where FRF[i, j] is the (i, j)th element of analog precoding matrix FRF. The total power constraint of the hybrid precoding matrix is enforced by normalizing FBB as \( {\left\Vert {\mathbf{F}}_{\mathbf{RF}}{\mathbf{F}}_{\mathbf{BB}}\right\Vert}_{\mathrm{F}}^2={N}_{\mathrm{S}} \).

The received signal observed by the uth MS is

$$ {\mathbf{y}}_u=\sqrt{\rho }{\mathbf{H}}_u\sum \limits_{n=1}^U{\mathbf{F}}_{\mathbf{RF}}{\mathbf{f}}_{\mathbf{BB}}^n{\mathbf{s}}_n+{\mathbf{n}}_u, $$
(2)

where Hu is an NMS × NBS matrix which indicates the channel between the BS and the uth user, \( {\mathbf{n}}_u\in \mathcal{CN}\left(0,{\sigma}^2{\mathbf{I}}_{N_{\mathrm{r}}}\right) \) represents the Complex Gaussian noise vector, and ρ represents the average received power.

At the uth MS, the received signal is processed by the uth combiner wu. The processed signal is

$$ {\mathbf{y}}_u={\mathbf{w}}_u^{\mathrm{H}}\sqrt{\rho }{\mathbf{H}}_u\sum \limits_{n=1}^U{\mathbf{F}}_{\mathbf{RF}}{\mathbf{f}}_{\mathbf{BB}}^n{\mathbf{s}}_n+{\mathbf{w}}_u^{\mathrm{H}}{\mathbf{n}}_u, $$
(3)

where \( {\mathbf{w}}_u\in {\mathbb{C}}^{N_{\mathrm{MS}}\times 1} \) is the digital combining vector of the uth user.

2.2 Millimeter-wave channel model

An extended Saleh-Valenzuela model is used in this paper due to the high antenna correlation and the limited spatial selectivity, which is commonly applied in mmWave massive MIMO channel modeling [14,15,16]. We consider the scenario that each scatter reflects one path. An adoption of a geometric channel model with Lu scatters is used for the channel of the uth user. Under this model, the channel matrix of the uth user \( {\mathbf{H}}_u\in {\mathbb{C}}^{N_{\mathrm{MS}}\times {N}_{\mathrm{BS}}} \) can be expressed as

$$ {\mathbf{H}}_u=\sqrt{\frac{N_{\mathrm{MS}}{N}_{\mathrm{BS}}}{L_u}}\sum \limits_{l=1}^{L_u}{\alpha}_{u,l}{\mathbf{a}}_{u,\mathbf{MS}}\left({\theta}_{u,l}\right){\mathbf{a}}_{u,\mathbf{BS}}^{\mathrm{H}}\left({\phi}_{u,l}\right), $$
(4)

where NBS is the number of antennas at the BS, NMS is the number of antennas at the uth MS. αu, l is the complex gain of the lth path, which includes the path loss with \( \mathrm{E}\left[{\left|{\alpha}_{u,l}\right|}^2\right]=\overline{\alpha} \) and \( \overline{\alpha} \)is a normalization constant. The variable \( {\phi}_{u,l}\kern0.5em \in \left[0,2\pi \right] \) is the lth path’s angle of departure (AoD). The variable θu, l [0, 2π] is the lth path’s angle of arrival (AoA). At last, au,BS(ϕu, l) is the antenna array response vector of the BS. au,MS(θu, l) is the antenna array response vector of the uth MS. Lu is the number of propagation paths for the channel of the uth MS with l [1, 2, …, Lu]. We assume the BS and every MS get knowledge about the structure of their antenna arrays. A typical uniform linear array (ULA) is utilized in the simulations. If a ULA is equipped, au,BS(ϕu, l) is defined as

$$ {\mathbf{a}}_{u,\mathbf{BS}}\left({\phi}_{u,l}\right)=\frac{1}{\sqrt{N_{\mathrm{BS}}}}{\left[1,{e}^{j\frac{2\pi }{\lambda }d\sin \left({\phi}_{u,l}\right)},...,{e}^{j\left({N}_{\mathrm{BS}}-1\right)\frac{2\pi }{\lambda }d\sin \left({\phi}_{u,l}\right)}\right]}^{\mathrm{T}}, $$
(5)

where d is the distance between adjacent antenna elements and λ is the wavelength of mmWave. au,MS(θu, l) of the MS can be described in a similar pattern. If each MS has one antenna, no AoAs exist. Therefore, (4) can be simplified as

$$ {\mathbf{h}}_u=\sqrt{N_{\mathrm{BS}}/{L}_u}\sum \limits_{l=1}^{L_u}{\alpha}_{u,l}{\mathbf{a}}_{u,\mathbf{BS}}^{\mathrm{H}}\left({\phi}_{u,l}\right). $$

Now, H indicates the collective channel matrix from all users, which is denoted as

$$ \mathbf{H}={\left[{\mathbf{H}}_1^{\mathrm{T}},{\mathbf{H}}_2^{\mathrm{T}},\dots, {\mathbf{H}}_U^{\mathrm{T}}\right]}^{\mathrm{T}}. $$
(6)

The received signal vector y = [y1, y1, , yU]T of the total U users can be expressed as

$$ \mathbf{y}={\mathbf{w}}^{\mathrm{H}}\sqrt{\rho}\mathbf{H}{\mathbf{F}}_{\mathbf{RF}}{\mathbf{F}}_{\mathbf{BB}}\mathbf{s}+{\mathbf{w}}^{\mathrm{H}}\mathbf{n}, $$
(7)

where \( \mathbf{n}\in \mathcal{CN}\left(0,{\sigma}^2{\mathbf{I}}_{U\times {N}_{\mathrm{MS}}}\right) \) represents the Complex Gaussian noise vector and \( \mathbf{w}={\left[{\mathbf{w}}_1^{\mathrm{T}},{\mathbf{w}}_2^{\mathrm{T}},\dots, {\mathbf{w}}_U^{\mathrm{T}}\right]}^{\mathrm{T}} \) is the collective combiner of all users.

The channel model can be translated into the beam space or angular domain [17] via a beamforming matrix Ut, which is an NBS × NBS unitary matrix. The columns of Ut are the vectors in the DFT matrix with uniform spacing. The (k, l) entry of Ut is analytically defined as

$$ {\mathbf{U}}_t\left(k,l\right)=\frac{1}{\sqrt{N_{\mathrm{BS}}}}\exp \left(\frac{-j2\pi kl}{N_{\mathrm{BS}}}\right),k,l=0,\cdots, {N}_{\mathrm{BS}}-1. $$
(8)

The DFT codebook that divides the space into NBS orthogonal beams is introduced. An NBS-dimensional DFT matrix can be denoted as

$$ {\mathbf{U}}_t=\frac{1}{\sqrt{N_{\mathrm{BS}}}}\left[\begin{array}{cccc}1& 1& \dots & 1\\ {}1& {e}^{\frac{j2\pi }{N_{\mathrm{BS}}}}& \dots & {e}^{\frac{j2\pi \left({N}_{\mathrm{BS}}-1\right)}{N_{\mathrm{BS}}}}\\ {}\vdots & \vdots & & \vdots \\ {}1& {e}^{\frac{j2\pi \left({N}_{\mathrm{BS}}-1\right)}{N_{\mathrm{BS}}}}& \dots & {e}^{\frac{j2\pi {\left({N}_{\mathrm{BS}}-1\right)}^2}{N_{\mathrm{BS}}}}\end{array}\right]. $$
(9)

Hence, Ha is defined as an equivalent representation of the channel in angular domain [17], which can be denoted as

$$ {\mathbf{H}}^a=\mathbf{H}{\mathbf{U}}_t. $$
(10)

(7) can be expressed in the angular domain as follows.

$$ \mathbf{y}={\mathbf{w}}^{\mathrm{H}}\sqrt{\rho }{\mathbf{H}}^a{\mathbf{U}}_t^{\mathrm{H}}{\mathbf{F}}_{\mathbf{RF}}{\mathbf{F}}_{\mathbf{BB}}\mathbf{s}+{\mathbf{w}}^{\mathrm{H}}\mathbf{n}. $$
(11)

The relationships between the channel matrix of the angular domain Ha and channel matrix of the spatial domain H in (6) are denoted as

$$ {\mathbf{H}}^a{\mathbf{U}}_t^{\mathrm{H}}=\mathbf{H}, $$
(12)
$$ {\mathbf{H}}^a={\left[{\mathbf{U}}_t^{\mathrm{H}}{\mathbf{H}}_1^{\mathrm{T}},{\mathbf{U}}_t^{\mathrm{H}}{\mathbf{H}}_2^{\mathrm{T}},\dots, {\mathbf{U}}_t^{\mathrm{H}}{\mathbf{H}}_U^{\mathrm{T}}\right]}^{\mathrm{T}}. $$
(13)

The channel matrix Ha is a mapping of the signals for each MS in a new domain of orthogonal vectors. The channel matrix in the spatial domain H is the discrete Fourier transform of the channel matrix in the angular domain Ha. The Ut defined in (8) is unitary, i.e., \( {\mathbf{U}}_t{\mathbf{U}}_t^{\mathrm{H}}={\mathbf{U}}_t^{\mathrm{H}}{\mathbf{U}}_t=\mathbf{I} \). Moreover, the representation of signals in the angular domain which can be applied to other antenna array structures is a more general concept. Since the number of dominant scatters is very limited in mmWave systems, the non-zero elements of every vector in angular channel is much smaller than NBS, namely, the angular domain channel matrix Ha is sparse [18]. This sparse structure can be utilized to design FRF with dimension reduced by column vector selection from the DFT codebook Ut whose columns are orthogonal. The ideal performance can only be achieved when the users located in the orthogonal subspace in mmWave massive MIMO systems. The multiplication of Ut and FRF can obtain parallel sub-channels to transmit signals after analog precoding according to (11). Moreover, Ut satisfies the constant modulus constraint which is enforced by the feasible domain of analog precoder FRF. Then, the problem of designing FRF is transformed to select proper columns from the DFT codebook.

2.3 Problem formulation

The objective is to efficiently design appropriate hybrid precoding matrices at the BS for the mmWave massive MIMO system. For this purpose, the common performance measures are considered: the achievable sum rate (SR) of the system. Our optimum target is to maximize SR in the downlink multiuser scenario. Assume that the BS can get the perfect channel state information (CSI). According to the processed signal received at the uth user in (3), the achievable data rate of the uth user is

$$ {R}_u={\log}_2\left(1+\frac{\frac{\rho }{N_{\mathrm{S}}}{\left|{\mathbf{w}}^{\mathrm{H}}{\mathbf{H}}_u{\mathbf{F}}_{\mathbf{RF}}{\mathbf{f}}_{\mathbf{BB}}^u\right|}^2}{\frac{\rho }{N_{\mathrm{S}}}\sum \limits_{n\ne u}{\left|{\mathbf{w}}^{\mathrm{H}}{\mathbf{H}}_u{\mathbf{F}}_{\mathbf{RF}}{\mathbf{f}}_{\mathbf{BB}}^n\right|}^2+{\sigma}^2}\right). $$
(14)

The SR of the system is

$$ {R}_{\mathrm{SUM}}=\sum \limits_{u=1}^U{R}_u. $$
(15)

Although other criterion such as the max-min fairness criterion is interesting in [19], we aim to maximize the achievable SR in this paper, since the achievable SR is a crucial criterion in the mmWave systems. The SR performance is preferable to fairness when precoders of BS are practically operated, which brings greater system capacity and better user experience. Hence, we choose SR performance as the optimization objective.

In our paper, we assume that each MS is equipped with a small number of antennas, since NMS is usually much smaller than NBS in massive MIMO systems [16]. The BS communicates with each MS only through one stream. Therefore, in order to get the multi-antenna gain, digital combining is utilized at the MSs. The hybrid combining cannot reduce RF chains when the number of MS antennas is smaller than the number of propagation paths in the channel. The digital combiner of MS is practical when each MS has a few antennas, since it will not bring plenty of RF chains and high cost of hardware overhead. The matched filter is chosen by us as the combiner at MSs, since the matched filter is an optimal linear filter, and its output signal to interference plus noise ratio (SINR) can be maximized which can make a great contribution to the system sum rate. The matched filtering (MF) combiner is defined as

$$ {\mathbf{w}}_u={\hat{\mathbf{H}}}_u,\kern0.5em \mathrm{with}\ \mathrm{MF}, $$
(16)

where \( {\hat{\mathbf{H}}}_u \) is an effective channel matrix which can be indicated as \( {\hat{\mathbf{H}}}_u={\mathbf{H}}_u\sum \limits_{n=1}^U{\mathbf{F}}_{\mathbf{RF}}{\mathbf{f}}_{\mathbf{BB}}^n \).

The proceeding design problem of FRF and FBB can be described as

$$ {\displaystyle \begin{array}{c}{\mathbf{F}}_{\mathbf{RF}}^{\mathbf{OPT}},{\mathbf{F}}_{\mathbf{BB}}^{\mathbf{OPT}}=\underset{{\mathbf{F}}_{\mathbf{RF}},{\mathbf{F}}_{\mathbf{BB}}}{\mathrm{argmax}{R}_{\mathrm{S}\mathrm{UM}}},\\ {}\mathrm{s}.\mathrm{t}.{\mathbf{F}}_{\mathbf{RF}}\in \mathbf{W},\\ {}{\left\Vert {\mathbf{F}}_{\mathbf{RF}}{\mathbf{F}}_{\mathbf{BB}}\right\Vert}_F^2={N}_{\mathrm{S}}\ \end{array}} $$
(17)

where W represents the feasible domain of analog precoders that is the set of NBS× NRF matrices with constant-magnitude entries \( \left(|{\mathbf{F}}_{\mathbf{RF}}\left[i,j\right]|=1/\sqrt{N_{\mathrm{BS}}}\right) \). The total power constraint of the BS is enforced by normalizing FBB as \( {\left\Vert {\mathbf{F}}_{\mathbf{RF}}{\mathbf{F}}_{\mathbf{BB}}\right\Vert}_F^2={N}_{\mathrm{S}} \). No other hardware-related constraints are imposed on the digital precoding matrix.

The optimal solution to (17) is almost impossible to obtain due to several difficulties, including the RF constraints, the couple between FRF and FBB. However, near-optimal solutions can be found through iterative algorithms, which require a large number of training processes and feedback overhead. Therefore, a direct solution to this maximization problem of sum rate in (17) is neither tractable nor practical. In view of the application difficulties of previous precoding algorithms in mmWave, we develop a new DFT codebook-based hybrid algorithm suitable for multiuser mmWave massive MIMO systems in Section 3. Our proposed scheme achieves better performance than the other algorithms.

3 Methods

In this section, we propose a codebook-based hybrid precoding algorithm to solve the optimization problem and analyze its computational complexity. The main idea of our scheme is dividing the calculation of the hybrid precoding into analog and digital domains.

3.1 The proposed hybrid algorithm

In the digital domain, zero-forcing (ZF) technique is adopted to obtain the digital precoder, based on the equivalent channel matrix.

In the analog domain, the analog precoder FRF is picked from DFT codebook [9] which is also called beam selection. Beams mean column vectors of the DFT codebook. Our target is to select NRF best unshared beams from total NBS beams for U users to maximize the achievable sum rate. Hence, it is crucial to choose proper beams which will be applied for the design of analog precoder, although that can be done by exhaustive searching with high time cost over all the possible beam combinations to identify the optimum beams. A more efficient CA beam selection method that can make the balance between performance and time consumption is proposed for the column vector selection of the analog precoder. Then, ZF precoding is directly utilized for digital precoder. CA method consists of two stages: (1) group to identify NCUs and CUs by consideration of the potential multiuser conflict and (2) fix the beam indexes for NCUs and seek the best exclusive beam indexes for CUs, in order to maximize the SR. As FRF is the abstraction from the DFT matrix Ut, the column indexes of DFT codebook which can be picked by FRF are obtained. For the sake of simplicity, the scenario that each MS has one antenna and NRF = U will be firstly explained as follows.

3.1.1 Stage 1: Grouping stage to identify CUs and NCUs

Our goal is to pick NRF best dedicated beams from the total NBS beams of Ut to maximize the achievable sum rate. As we know, the optimal solution can be obtained by exhaustive search, which has the best performance but suffers the high complexity which is \( {C}_{N_{\mathrm{BS}}}^{N_{\mathrm{RF}}} \). For a mmWave massive MIMO system with NBS = 128 and NRF = 16, the total searching number reaches 128 ! /16 ! (128 − 16) !  = 9.8 × 1018. Hence, a more practical selection scheme to get the near-optimal performance is necessary.

Firstly, we sort the row elements of Ha in ascending order of the ratio T as follows.

$$ {T}_{u,i}=\frac{{\left|{\mathbf{H}}^a\left(u,i\right)\right|}^2}{\sum \limits_{j=1,j\ne i}^{N_{\mathrm{BS}}}{\left|{\mathbf{H}}^a\left(u,j\right)\right|}^2+{\sigma}^2},u=1,2,\cdots U,i=1,2,\cdots {N}_{\mathrm{BS}}, $$
(18)

where Ha(u, i) is the (u, i)th element of the matrix Ha. The maximum value Tu, i in the whole uth row of the angular channel Ha means the dominant beam of the uth user. The dominant beam index i of the uth user is denoted as \( {b}_u^{\ast}\in \left[1,2,...,{N}_{\mathrm{BS}}\right]. \) The dominant beams \( {\left\{{b}_u^{\ast}\right\}}_{u=1}^U \) make a direct contribution to achievable sum rate according to Appendix. Hence, if \( {b}_1^{\ast}\ne {b}_2^{\ast}\ne \cdots \ne {b}_U^{\ast } \), the beam set \( \mathcal{B}=\left\{{b}_1^{\ast },{b}_2^{\ast },\cdots, {b}_U^{\ast}\right\} \) can get the near optimal sum rate. But if there are two users sharing the same dominant beam, the two users will suffer severe multiuser interference. Unfortunately, if U is large and NBS is small, the probability of selecting the same dominant beam is obviously large. Therefore, users can be grouped. Users are categorized into line-of-sight (LOS) and non-line-of-sight (NLOS) groups in [20], based on the channel conditions. In this paper, all U users are classified into two groups, i.e., non-conflicting users (NCUs) and conflicting users (CUs) according to their dominant beam index in Ha. Different criteria to select beams are adopted for different user groups. An example for grouping is given as follows.

  1. (1)

    User u is defined as NCU, if the dominant beam index of u differs from any other users, i.e., \( {b}_u^{\ast}\notin \left\{{b}_1^{\ast },\cdots, {b}_{u-1}^{\ast },{b}_{u+1}^{\ast },\cdots, {b}_U^{\ast}\right\} \). The NCU group \( {\mathcal{F}}_{\mathrm{NCU}} \) consists of all NCU users. The example is shown in Fig. 2a, in which the dominant beam index of users can be obtained according to (18) as \( {b}_1^{\ast }=2,{b}_4^{\ast }=8 \) and \( {b}_2^{\ast }={b}_3^{\ast }=5. \) It is obvious that \( {\mathcal{F}}_{\mathrm{NCU}}=\left\{1,4\right\} \), since \( {b}_1^{\ast}\notin \left\{{b}_2^{\ast },{b}_3^{\ast },{b}_4^{\ast}\right\} \) and \( {b}_4^{\ast}\notin \left\{{b}_1^{\ast },{b}_2^{\ast },{b}_3^{\ast}\right\} \). The dominant beam index can be directly selected for the user \( u\in {\mathcal{F}}_{\mathrm{NCU}} \), because this beam not only make a direct contribution to achievable sum rate but also produce little interference to other users.

  2. (2)

    User u is defined as CU, if the dominant beam index of u is the same as any other users, i.e., \( {b}_u^{\ast}\in \left\{{b}_1^{\ast },\cdots, {b}_{u-1}^{\ast },{b}_{u+1}^{\ast },\cdots, {b}_U^{\ast}\right\} \). The CU group \( {\mathcal{F}}_{\mathrm{CU}} \) consists of all CU users. \( {\mathcal{F}}_{\mathrm{CU}}\cup {\mathcal{F}}_{\mathrm{NCU}}=\left\{1,2,\cdots, U\right\} \) and \( {\mathcal{F}}_{\mathrm{CU}}\cap {\mathcal{F}}_{\mathrm{NCU}}=\varnothing \) are obvious. The example in Fig. 2a is considered, where \( {\mathcal{F}}_{\mathrm{CU}}=\left\{2,3\right\} \), since \( {b}_2^{\ast}\in \left\{{b}_1^{\ast },{b}_3^{\ast },{b}_4^{\ast}\right\} \) and \( {b}_3^{\ast}\in \left\{{b}_1^{\ast },{b}_2^{\ast },{b}_4^{\ast}\right\} \). For the user \( u\in {\mathcal{F}}_{\mathrm{CU}} \), the suitable beam index is picked from the set \( \left\{1,2,\cdots, {N}_{\mathrm{BS}}\right\}\operatorname{}\left\{{b}_u^{\ast }|u\in {\mathcal{F}}_{\mathrm{NCU}}\right\} \). which is introduced in the next stage as shown in Fig. 2b.

Fig. 2
figure 2

An example of the CA beam-column selection method. a Stage 1: Grouping stage to identify CUs and NCUs. b Stage 2: Best beam searching stage

A simple example for single-antenna MS is taken above to better understand our proposed scheme. If each MS of the user has several antennas, the angular channel of the uth user is not a vector, but a matrix denoted as \( {\mathbf{H}}_u^a \). Then, we search all elements in \( {\mathbf{H}}_u^a \), instead of the row elements of Ha, and the ratio T is sorted in ascending order as well.

3.1.2 Stage 2: Best beam indexes searching stage

For the user \( u\left(u\in {\mathcal{F}}_{\mathrm{NCU}}\right) \), the best beam index can be directly selected as the dominant beam index. In the above example, the best indexes of NCUs are \( \mathrm{Card}\left({\mathcal{F}}_{\mathrm{NCU}}\right)=\left\{2,8\right\} \). Then, another search for \( u\in {\mathcal{F}}_{\mathrm{CU}} \) remains \( \left(\mathrm{Card}\left({N}_{BS}\right)-\mathrm{Card}\left({\mathcal{F}}_{\mathrm{NCU}}\right)\right)=\left\{1,3,4,5,6,7\right\} \) beams to maximize the SR. A classical ZF precoding technology is adopted in the digital domain as follows.

$$ {\mathbf{F}}_{\mathbf{BB}}=\alpha {\mathbf{H}}_{\mathrm{eq}}^{\mathrm{H}}{\left({\mathbf{H}}_{\mathrm{eq}}{\mathbf{H}}_{\mathrm{eq}}^{\mathrm{H}}\right)}^{-1}, $$
(19)

where \( {\mathbf{H}}_{\mathrm{eq}}={\mathbf{H}}^a{\mathbf{U}}_t^{\mathrm{H}}{\mathbf{F}}_{\mathbf{RF}} \) is the equivalent channel matrix seen from the perspective of the digital domain. α is a scaling factor to make sure that \( \mathrm{E}\left\{{\left\Vert {\mathbf{F}}_{\mathbf{BB}}\mathbf{s}\right\Vert}_2^2\right\}=\rho \). Hence, the factor α is depicted as

$$ \alpha =\sqrt{\frac{\rho }{\mathrm{tr}\left({\left({\mathbf{H}}_{\mathrm{eq}}{\mathbf{H}}_{\mathrm{eq}}^{\mathrm{H}}\right)}^{-1}\right)}}. $$
(20)

If equal power allocation is utilized at the BS, the average data rate of the uth user is given by

$$ {R}_u={\log}_2\left(1+\frac{{\left|\alpha \right|}^2}{N_{\mathrm{S}}{\sigma}^2}\right). $$
(21)

Then, the SR maximization problem in (17) can be written as

$$ {\mathbf{F}}_{\mathbf{RF}}^{\mathbf{OPT}}=\underset{D}{\mathrm{argmax}{R}_{\mathrm{SUM}}}=\underset{D}{\mathrm{argmax}}U{\log}_2\left(1+\frac{\rho }{N_S{\sigma}^2 tr{\left({\mathbf{H}}_{\mathrm{eq}}^{\mathrm{r}}{\left({\mathbf{H}}_{\mathrm{eq}}^{\mathrm{r}}\right)}^H\right)}^{-1}}\right), $$
(22)

where D is the possible beam candidate set for the CUs with \( \mathrm{Card}\left({\mathcal{F}}_{\mathrm{NCU}}\right) \), which is the unshared columns selected from the set \( \left\{\left.1,2,\cdots, {N}_{\mathrm{BS}}\right\}\backslash {b}_u^{\ast }|u\in {\mathcal{F}}_{\mathrm{NCU}}\right\} \). We denote \( {\mathbf{H}}_{\mathrm{eq}}^{\mathrm{r}}={\mathbf{H}}_{\mathrm{eq}}\left(:,s\right),s\boldsymbol{\in}\mathcal{D}\cup \left\{{b}_u^{\ast }|u\in {\mathcal{F}}_{\mathrm{NCU}}\right\} \).

It is obvious that maximizing the SR is equivalent to minimize \( tr{\left({\mathbf{H}}_{\mathrm{eq}}^{\mathrm{r}}{\left({\mathbf{H}}_{\mathrm{eq}}^{\mathrm{r}}\right)}^H\right)}^{-1} \). Hence, (22) can be further formulated as

$$ {\mathbf{F}}_{\mathbf{RF}}^{\mathbf{OPT}}=\underset{D}{\mathrm{argmin}\ } tr{\left({\mathbf{H}}_{\mathrm{eq}}^{\mathrm{r}}{\left({\mathbf{H}}_{\mathrm{eq}}^{\mathrm{r}}\right)}^H\right)}^{-1}. $$
(23)

Next, the beam columns with the greatest contribution to the SR are picked out in each step. The optimal solution is by an exhaustive search which involves prohibitively high complexity of search with \( {C}_{N_{\mathrm{BS}}}^{N_{\mathrm{RF}}} \) possible combinations. A low-complexity incremental algorithm [21] for beam-column selection is proposed to get a near-optimal solution.

As the example is shown as Fig. 2, in which CUs consist of user 2 and user 3, since \( {b}_2^{\ast }={b}_3^{\ast }=5. \) Therefore, in the first step, the unshared beam column b2 of user 2 should be selected as

$$ \underset{b}{b_2=\mathrm{argmin}}\ tr{\left(\mathbf{G}+{\mathbf{g}}_b{\mathbf{g}}_b^{\mathrm{H}}\right)}^{-1},\mathbf{G}\triangleq \mathbf{A}{\mathbf{A}}^{\mathrm{H}}+\varepsilon \mathbf{I}, $$
(24)

where \( \mathbf{A}={\mathbf{H}}_{\mathrm{eq}}\left(:,s\right),s\boldsymbol{\in}\left\{{b}_u^{\ast }|u\in {\mathcal{F}}_{\mathrm{NCU}}\right\} \) is a matrix of beam columns selected for NCUs, \( {\mathbf{g}}_b={\mathbf{H}}_{\mathrm{eq}}\left(:,b\right),b\boldsymbol{\in}S=\left\{\left.1,2,\cdots, {N}_{\mathrm{BS}}\right\}\backslash {b}_u^{\ast }|u\in {\mathcal{F}}_{\mathrm{NCU}}\right\} \) is one beam column possibly selected for CUs. ε is set to make G nonsingular, which can guarantee the matrix inversion in (24) exists. ε is a positive number (e.g., 10-2).

Based on the Sherman-Morrison-Woodbury identity [13], a recursive way can be used to obtain the matrix inversion which can decrease the complexity. The matrix inversion can be obtained with complexity \( \mathcal{O}\left({U}^2\right) \) instead of \( \mathcal{O}\left({U}^3\right) \). A convenient expression is given for the inverse of the matrix (B + UVH) where B is n by n and U and V are n by k:

$$ {\left(\mathbf{B}+\mathbf{U}{\mathbf{V}}^{\mathrm{H}}\right)}^{-1}={\mathbf{B}}^{-1}-{\mathbf{B}}^{-1}\mathbf{U}{\left(\mathbf{I}+{\mathbf{V}}^{\mathrm{H}}{\mathbf{B}}^{-1}\mathbf{U}\right)}^{-1}{\mathbf{V}}^{\mathrm{H}}{\mathbf{B}}^{-1}. $$
(25)

In (25), both B and (I + VHB−1U) are assumed to be nonsingular. A rank-k correction to a matrix results in a rank-k correction of the inverse. Particularly, if k = 1, B is nonsingular, β = 1 + vHB−1u ≠ 0, then

$$ {\left(\mathbf{B}+\mathbf{u}{\mathbf{v}}^{\mathrm{H}}\right)}^{-1}={\mathbf{B}}^{-1}-\frac{1}{\beta }{\mathbf{B}}^{-1}\mathbf{u}{\mathbf{v}}^{\mathrm{H}}{\mathbf{B}}^{-1}. $$
(26)

According to (26), since G is nonsingular, (24) can be equivalently formulated as

$$ \underset{b}{b_2=\mathrm{argmin}}\ tr\left({\mathbf{G}}^{-1}-\frac{{\mathbf{G}}^{-1}{\mathbf{g}}_b{\mathbf{g}}_b^{\mathrm{H}}{\mathbf{G}}^{-1}}{1+{\mathbf{g}}_b^{\mathrm{H}}{\mathbf{G}}^{-1}{\mathbf{g}}_b}\right). $$
(27)

By calculating the objective function in (27), b2 has been selected. In the example in Fig. 2, user 2 selects the beam 5. Then, G and S can be updated as \( \mathbf{G}=\mathbf{G}+{\mathbf{g}}_{b2}{\mathbf{g}}_{b2}^{\mathrm{H}} \) and S = S\b2. Then, user 3 in CUs uses the same method presented above to pick other beams. We assume that user 3 selects the beam 6 after computation. If there are more users, an incremental algorithm should continue until all \( \mathrm{Card}\left({\mathcal{F}}_{\mathrm{CU}}\right) \) beams have been picked out. Consequently, the set \( {\mathcal{B}}^{\ast } \) of all selected beam columns can be formed as

$$ {\mathcal{B}}^{\ast }=\left\{{b}_1,\cdots, {b}_{\mathrm{Card}\left({\mathcal{F}}_{\mathrm{CU}}\right)}\right\}\cup \left\{{b}_u^{\ast }|u\in {\mathcal{F}}_{\mathrm{NCU}}\right\} $$
(28)

In this paper, the number of RF chains selected is arbitrary. Many methods such as [4] solve the hybrid precoding problem, but NRF = U is strictly demanded, which cannot meet the general scene as NRF ≥ U. Our proposed algorithm satisfies the normal condition without the constraint NRF = U. Hence, this can utilize the additional RF chains to explore all potential gains of sum rate. As mentioned above, we first introduce per-user CA beam selection method above that each of the user is assigned with one RF chain. While the condition is NRF ≥ U, each user will be assigned multiple RF chains. For fairness, we strive to make every user use the same number of RF chains. Denote C = [NRF/U], each user is firstly allocated with C RF chains which can be acquired from the CA method executing C rounds. The remaining RF chains can be denoted as T = NRF − [NRF/U] U. The remaining RF chains are prioritized to allocate to NCUs, since users without multiuser interference can bring higher SR. Then, two cases are classified according to the number of NCUs which is denoted as \( {N}_{{\mathcal{F}}_{\mathrm{NCU}}} \): (1) If \( {N}_{{\mathcal{F}}_{\mathrm{NCU}}}>T \), each of the first Tth NCUs is allocated with one RF chains. (2) If \( {N}_{{\mathcal{F}}_{\mathrm{NCU}}}<T \), all of the NCUs are allocated with one RF chain. Each of the first \( {\left(T-{N}_{{\mathcal{F}}_{\mathrm{NCU}}}\right)}^{th} \) CUs is allocated with one RF chain.

3.2 Computational complexity

Here, we compare the computational complexities of our proposed algorithm with the full digital algorithm, the exhausted searching algorithm, and the algorithms in [4, 5, 7, 11], which are presented in Table 1. The complexity of the proposed algorithm includes two parts as follows.

  1. (1)

    The first part comes from the beam selection of the analog precoding matrix. If there are K CUs and U − K NCUs, it requires (NBS − U) + (K + 1 − k), k = 1, 2, K, possible beam selections for the kth CU. The inversion of the matrix has a complexity of \( \mathcal{O}\left({U}^2\right) \).

  2. (2)

    The second part is from the calculation of ZF digital precoder. The complex multiplication of the matrix requires \( \mathcal{O}\left({N}_{\mathrm{BS}}{U}^2\right). \)

Table 1 Comparison of the computational complexities

Consequently, the totally computational complexity of the proposed algorithm is \( \mathcal{O}\left({N}_{\mathrm{BS}}{U}^2+{U}^2\right) \). Meanwhile, the complexity of the full digital algorithm is \( \mathcal{O}\left({N}_{\mathrm{BS}}^3\right) \). Compared to the exhausted searching, the proposed CA scheme reduces the computation time from \( {C}_{N_{\mathrm{BS}}}^{N_{\mathrm{RF}}} \) to U2, which is much more acceptable. Hence, the complexity of an exhausted searching hybrid algorithm is \( \mathcal{O}\left({N}_{\mathrm{BS}}{U}^2+{C}_{N_{\mathrm{BS}}}^{N_{\mathrm{RF}}}\right) \). The complexity of beam steering hybrid algorithm in [4] is \( O\left({N}_{\mathrm{BS}}^2{U}^2\right) \). The complexities of hybrid algorithms in [5, 7] are \( \mathcal{O}\left({N}_{\mathrm{BS}}{U}^2+{U}^2\right) \) as well, since beam selection methods except for selected codebooks in them are the same in simulations of the next section. The complexity of the JWSPD hybrid algorithm in [11] is \( \mathcal{O}\left({N}_{\mathrm{BS}}^4{U}^4\right) \).

It can be noticed that our proposed hybrid precoding algorithm can highly reduce the computational complexity compared with the full digital algorithm and the exhausted searching algorithm. Besides this above, when NBS is much larger than U, the complexity of the proposed algorithm is much lower than the algorithms in [4, 11]. Though the complexity of our proposed algorithm is equal to the algorithms in [5, 7], our proposed hybrid precoding can get better performance.

4 Results and discussion

In this section, we evaluate the performance of our proposed algorithm with the full digital precoding, the algorithm based on DFT codebook with exhausted searching scheme mentioned in [9], beam steering hybrid algorithm in [4], IEEE 802.15.3c codebook-based hybrid precoding [5], Q-bit resolution codebook-based hybrid precoding [7], and JWSPD hybrid algorithm in [11], which are all under fully connected architecture for fairness.

The simulation parameters are described as follows. Considering the system model depicted in Section 2, the ULA is implemented at the BS. The spacing between antenna elements of BS is half-wavelength. The channel of every MS has 5 propagation paths. The AoD of every propagation path is uniformly distributed [0, 2π]. The average power of every propagation path is randomly generated from a uniform random variable within [0, 1]. The total received power is normalized for every MS in order to satisfy the power constraint. Signal to noise ratio (SNR) is defined as SNR = ρ/σ2. The simulation results are generated by 1000 trial Monte-Carlo simulation. The performance of precoding algorithms is shown in terms of achievable SR and EE in the following subsections.

4.1 Achievable sum rate

Figure 3 depicts the performance comparison of achievable sum rate, against the average SNR, with the BS antenna number NBS = 128 and user number U = 16. A simple scenario is assumed that the number of RF chains is equal to the number of MSs and each MS is equipped with one antenna. It is shown that the performance of our proposed algorithm is close to the full digital algorithm (the upper bound), but the number of RF chains is significantly reduced. The proposed algorithm can also get better performance than the algorithms in [5, 7]. The comparison of our proposed hybrid precoding and the JWSPD scheme proposed in [11] is given under two different configurations, i.e., λ ≠ 0 and λ = 0. A turnable sparse parameter λ is introduced in [11] to control the sparsity of the optimization solution. The larger λ is, the more sparse solution is. Thus, λ is properly chosen firstly in the JWSPD scheme to balance maximizing the system SR and minimizing the number of the selected virtual antennas (codewords). Then, the remained (unselected) RF chains with the corresponding phase shifter networks can be turned off to save power. In other words, the introduced sparse parameter λ represents a trade-off between system SR and EE. However, our proposed algorithm directly optimizes only one factor which is SR. Hence, the SR performance of our proposed scheme is better than the SRmax JWSPD scheme, when λ is properly chosen and λ ≠ 0 as it is shown in Fig. 3a. If the sparse parameter λ is mandatorily chosen to zero in the simulation of Fig. 3b, the JWSPD scheme achieves a little better SR performance than our proposed scheme, but with higher computational complexity. Since λ = 0 means that the sparsity of the optimization solution and EE are mandatorily unconsidered, the system SR can be maximized.

Fig. 3
figure 3

The achievable sum rate vs. SNR (NBS = 128 and U = 16 with 1-antenna’s MS). aλ ≠ 0 in JWSPD. bλ = 0 in JWSPD

The performance of our proposed algorithm is also close to the exhausted searching DFT codebook-based hybrid algorithm. The exhausted searching algorithm traverses all the possible beam combinations and jointly exploits the optimum beams. However, it is extremely time-consuming when NBS grows large. Our proposed scheme is a more practical scheme with an acceptable performance difference. When fixing the SR at 40 bits/s/Hz, the proposed hybrid algorithm can get 5 dB gain compared to [5]. That demonstrates the CA beam selection method and DFT codebook of the proposed algorithm are efficient and attractive ways to solve the optimal problem (17).

Figure 4 depicts the performance comparison of achievable sum rate, against the average SNR, with U = 4, and NBS = 128 or NBS = 256. Each MS is equipped with two antennas in simulations. The BS serves U MSs simultaneously using NRF, and RF chains are assumed for simplicity as well.

Fig. 4
figure 4

The achievable sum rate vs. SNR (U = 4 with 2-antennas’ MS). aNBS = 128. bNBS= 256

As shown in Fig. 4a, the performance of our proposed algorithm is better than the algorithms in [4, 5, 7], when NBS = 128. The performance difference is caused by the following three reasons. Firstly, the column vectors are not orthogonal in beam steering codebook in [4], 802.15.3c codebook in [5] and Q-bit resolution codebook in [7], which may lead to performance degradation due to multiuser interference. However, we use the DFT codebook which has orthogonal columns. Secondly, our proposed CA method for columns selection from the codebook is more flexible than the selection method in [4]. CA method utilizes the sparse angular domain of the channel matrix, with the consideration of the potential multiuser conflict. But the selection method in [4] is based on the assumption of a single path in the channel matrix. Thirdly, the algorithm in [4] performed well in single-path scenarios but would deteriorate a lot in multi-path scenarios because beam steering could not make full use of multi-path channel gains. In fact, mmWave systems are mainly based on multi-path scenarios in reality. Our proposed algorithm can perform well in both single-path and multi-path scenarios.

Some observations can be obtained by analyzing Fig. 4b. The sum rates of all the algorithms grow as the number of BS antennas NBS increases from 128 to 256. The sum rate of our proposed algorithm almost rises to 70 bps/Hz at SNR is 30 dB. But the performance gaps between the proposed algorithm and other algorithms are almost invariable with the increasing number of NBS, since numerous antennas are deployed at BS which can obtain spatial multiplexing gain.

Figure 5 depicts the performance comparison of achievable sum rate, against the average SNR, with different number of users at NBS = 256. The number of user U is the same as the number of RF chains NRF, for three different configurations such as U = 32, U = 16, and U = 8. From Fig. 5, it can be observed that the SR of the proposed algorithm improves with the increasing number of users. But the rate gap between the proposed algorithm and full digital algorithm becomes bigger as the number of users increases. This trend is because the inter-user interference becomes high.

Fig. 5
figure 5

The achievable sum rate vs. SNR with the number of users

Figure 6 depicts the performance comparison of achievable sum rate, against the number of RF chains at NBS = 128, U = 16, and SNR = 25 dB. The number of RF chains is from 8 to 128. It can be observed that the SR of the proposed algorithm enhances by increasing RF chains owing to the diversity gains supplied by multiple RF chains. It can take full advantage of additional RF chains. Some other observations can be obtained by analyzing Fig. 6. The first is that there are some differences between full digital algorithm and the proposed algorithm at low NRF. The second is the performance finally converges to the full digital algorithm (the upper bound) by increasing NRF, while the gap reduces to zero at NRF = 128. The third is when NRF = 64, and the proposed algorithm achieves 99% achievable sum rate of the full digital algorithm.

Fig. 6
figure 6

The achievable sum rate vs. the number of RF chains

Figure 7 depicts the performance comparison of the achievable sum rate, against the average SNR, with different power allocation schemes for users, i.e., waterfilling power distribution [22] and equal power allocation. For waterfilling, power allocation is implemented after hybrid precoding. The effective channel matrix \( {\hat{\mathbf{H}}}_u \) is firstly implemented by singular value decomposition (SVD) to get parallel subchannels. Then, the Lagrangian method is used to distribute the power of subchannels. It is shown that waterfilling power distribution scheme is better than equal power allocation. The performance of our proposed hybrid precoding with waterfilling power distribution is much closer to the full digital precoding (the upper bound). Due to Lagrangian of waterfilling power distribution method can be used to distribute the power of subchannels, the system capacity is the largest. Consequently, the achievable sum rate of the system can be maximized.

Fig. 7
figure 7

The achievable sum rate vs. SNR with different power allocation

4.2 Energy efficiency

The energy efficiency ζ is defined in [2] as the ratio between the achievable sum rate and the total power consumption, i.e., \( \zeta =\frac{R_{\mathrm{SUM}}}{P+{N}_{\mathrm{RF}}{P}_{\mathrm{RF}}+{P}_{\mathrm{BB}}}\left(\mathrm{bps}/\mathrm{Hz}/\mathrm{W}\right) \), where P is the transmitting power of BS, PRF is the power consumption of every RF chain, and PBB is the power consumption of digital precoder. In general, the typical values are adopted that PRF = 250 mW, and PBB = 300 mW [23].

Figure 8 shows the performance of EE, against the average SNR, also with the BS antenna number NBS = 128 and user number U = 16. It is shown that our proposed algorithm can get higher EE than other five compared algorithms. In particular, our proposed algorithm performs much better than the full digital precoding algorithm because NRF equals to NBS in full digital precoding. A large number of RF chains cause very high energy consumption, i.e., 250 mW per RF chain. In hybrid precoding, RF chains are far less than antennas at the transmitter. Therefore, the hybrid precoding algorithm can extremely reduce the energy consumption generated by the RF chains compared to the full digital scheme. Moreover, in terms of EE, our proposed algorithm outperforms the JWSPD scheme [11] (λ = 0), which achieves a similar SR performance to that of our scheme in Fig. 3b. Since EE are unconsidered in the JWSPD scheme as λ = 0.

Fig. 8
figure 8

The energy efficiency vs. SNR (NBS = 128 and U = 16)

Figure 9 shows the performance of EE, against the number of users, where SNR is 15 dB and the number of users increases from 5 to 25. It is obvious that EE of our proposed algorithm decreases with the increasing number of users, since more users will cause more RF chains selected at BS with more energy consumption generated. The EE of our proposed algorithm is higher than all of other five algorithms, especially when there are not too many users U.

Fig. 9
figure 9

The energy efficiency vs. the number of users (SNR = 15 dB)

5 Conclusion

In this paper, a low-complexity hybrid precoding scheme is proposed for multiuser mmWave massive MIMO systems. The main idea is designing hybrid precoding in the digital and analog domains separately to achieve the near-optimal achievable sum rate. In the analog domain, we exploit a DFT codebook-based analog precoder. The core of the analog precoder design is the proposed conflicting-aware (CA) beam-column selection from the DFT codebook, which is from the analysis of sum rate maximum and the utilization of angular channel matrix. Compared to exhaustive search, the CA selection method dramatically reduces computational complexity. Then, in the digital domain, the digital precoder is obtained by applying ZF criterion in the effective channel matrix. After our proposed hybrid precoding, the achievable sum rate of the system can be maximized. The SR and EE performance of the schemes under different configurations are shown from various simulation results. Our proposed algorithm with low computational complexity approaches to the full digital performance and gets better performance compared to other existing hybrid algorithms.

Availability of data and materials

Not available online. Please contact the author for data requests.

Abbreviations

5G:

The fifth generation of mobile telecommunications

AoA:

Angle of arrival

AoD:

Angle of departure

BS:

Base station

CA:

Conflicting aware

CSI:

Channel state information

CUs:

Conflicting users

DFT:

Discrete Fourier transform

EE:

Energy efficiency

EE:

Energy efficiency

JWSPD:

Joint codeword selection and precoder design

MIMO:

Multiple-input multiple-output

mmWave:

Millimeter wave

MS:

Mobile station

MUB:

Unbiased bases combined

NCUs:

Non-conflicting users

RF:

Radiofrequency

SINR:

Signal to interference plus noise ratio

SLNR:

Signal to leakage plus noise ratio

SNR:

Signal to noise ratio

SR:

Sum rate

SR:

Sum rate

ULA:

Uniform linear array

WLAN:

Wireless local area networks

ZF:

Zero-forcing

References

  1. S. Han, C.-L. I, Z. Xu, et al., Large-scale antenna systems with hybrid precoding analog and digital beamforming for millimeter wave 5G. IEEE Commun. Mag. 53(1), 186–194 (2015)

    Article  Google Scholar 

  2. P. Amadori, C. Masouros, Low RF-complexity millimeter-wave beamspace-MIMO systems by beam selection. IEEE Trans. Commun. 63(6), 2212–2222 (2015)

    Article  Google Scholar 

  3. O. El Ayach, S. Rajagopal, S. Abu-Surra, et al., Spatially sparse precoding in millimeter wave MIMO systems. IEEE Trans. Wirel. Commun. 13(3), 1499–1513 (2014)

    Article  Google Scholar 

  4. A. Alkhateeb, G. Leus, R.W. Heath, Limited feedback hybrid precoding for multi-user millimeter wave systems. IEEE Trans. Wirel. Commun. 14(11), 6481–6494 (2015)

    Article  Google Scholar 

  5. Part 15.3: wireless MAC and PHY specifications for high rate WPANs. Amendment 2: millimeter-wave-based alternative physical layer extension, IEEE Std 802.15.3c-2009. (2009).

  6. Part 11: wireless LAN medium access control (MAC) and physical layer (PHY) specifications. Amendment 3: enhancements for very high throughput in the 60 GHz Band, IEEE Std. 802.11ad-2012. (2012).

  7. S. Kutty, D. Sen, Beamforming for millimeter wave communications: an inclusive survey. IEEE Commun. Surv. Tutorials 18(2), 949–973 (2016)

    Article  Google Scholar 

  8. L. Zhou, Y. Ohashi, Efficient codebook-based MIMO beamforming for millimeter-wave WLANs (IEEE 23rd International Symposium on Personal, Indoor and Mobile Radio Commun. (PIMRC), Sydney, 2012), pp. 1885–1889

    Google Scholar 

  9. K. Satyanarayana, M. El-Hajjar, et al., Millimeter wave hybrid beamforming with DFT-MUB aided precoder codebook design (2017 IEEE 86th Vehicular Technology Conference (VTC-Fall), Toronto, 2017)

    Book  Google Scholar 

  10. Y. Han, S. Jin, J. Zhang, et al., DFT-based hybrid beamforming multiuser systems: rate analysis and beam selection. IEEE J. Select Top. Signal Process. 12(3), 514–528 (2018)

    Article  Google Scholar 

  11. S. He, J. Wang, Y. Huang, et al., Codebook-based hybrid precoding for millimeter wave multiuser systems. IEEE Trans. Signal Process. 65(20), 5289–5304 (2017)

    Article  MathSciNet  Google Scholar 

  12. T.M. Pham, R. Farrell, et al., Efficient zero-forcing precoder design for weighted sum-rate maximization with per-antenna power constraint. IEEE Trans. Vehicular Technol. 67(4), 3640–3645 (2017)

    Article  Google Scholar 

  13. G.H. Golub, C.F. Van Loan, Matrix computations, 4th edn. (The Johns Hopkins University Press, Baltimore, 2013), p. 89

    MATH  Google Scholar 

  14. T.S. Rappaport, F. Gutierrez, et al., Broadband millimeter-wave propagation measurements and models using adaptive-beam antennas for outdoor urban cellular communications. IEEE Trans. Antennas Propagation 61(4), 1850–1859 (2013)

    Article  Google Scholar 

  15. I.A. Hemadeh, K. Satyanarayana, et al., Millimeter-wave communications: physical channel models, design considerations, antenna constructions, and link-budget. IEEE Commun. Surv. Tutorials 20(2), 870–913 (2018)

    Article  Google Scholar 

  16. T.S. Rappaport, S. Sun, R. Mayzus, et al., Millimeter wave mobile communications for 5G cellular: it will work! IEEE Access 1, 335–349 (2013)

    Article  Google Scholar 

  17. D. Tse, P. Viswanath, Fundamentals of wireless communication (Cambridge University Press, Cambridge, 2005)

    Book  Google Scholar 

  18. Y. Zeng, R. Zhang, Millimeter wave MIMO with lens antenna array: a new path division multiplexing paradigm. IEEE Trans. Commun. 64(4), 1557–1571 (2016)

    Article  Google Scholar 

  19. C.-S. Lee, W.-H. Chung, Max-min hybrid precoding in millimeter wave cooperative MISO systems (Conf. IEEE International Conference on Communications (ICC), Kuala Lumpur, 2016), pp. 1–6

    Google Scholar 

  20. K. Satyanarayana, M. El-Hajjar, P.-H. Kuo, A.A.M. Mourad, L. Hanzo, Adaptive transceiver design for C-RAN in mmWave communications. IEEE Access 6, 16770–16782 (2017)

    Article  Google Scholar 

  21. S. Sanayei, A. Nosratinia, Antenna selection in MIMO systems. IEEE Commun. Mag. 42(10), 68–73 (2004)

    Article  Google Scholar 

  22. H. Moon, Waterfilling power allocation at high snr regimes. IEEE Trans. Commun. 59(3), 708–715 (2011)

    Article  Google Scholar 

  23. X. Gao, L. Dai, S. Han, C.-L. I, R.W. Heath, Energy-efficient hybrid analog and digital precoding for mmWave MIMO systems with large antenna arrays. IEEE J. Sel. Areas Commun. 34(4), 998–1009 (2016)

    Article  Google Scholar 

Download references

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions that helped improve the quality of this manuscript.

Funding

This work was supported by the Natural Science Foundation of China under grant no. 61372126 and no. 61771257, Jiangsu Higher Education Institutions under grant no. 18KJB510027, Technology Projects of Jiangsu Quality and Technical Supervision grant no. KJ175943, and Postgraduate Research & Practice Innovation Program of Jiangsu Province grant no. KYCX19_0928.

Author information

Authors and Affiliations

Authors

Contributions

All authors have contributed equally. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Chen Liu.

Ethics declarations

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

In this section, we introduce a selection criterion of the dominant beam for the uth user.

Different from retaining most of the power in previous beam selection, we use a metric T, defined by the signal to leakage plus noise ratio (SLNR) to maximize SR. SLNR for the uth user is defined as

$$ SLN{R}_u=\frac{{\left\Vert {\mathbf{H}}_u^a{\mathbf{w}}_u\right\Vert}_F^2}{\sum \limits_{i=1,i\ne u}^{N_{\mathrm{BS}}}{\left\Vert {\mathbf{H}}_i^a{\mathbf{w}}_u\right\Vert}_F^2+{\sigma}^2},u=1,2,\cdots U,i=1,2,\cdots {N}_{\mathrm{BS}}, $$
(29)

where \( {\mathbf{H}}_u^a \) is the NMS × NBS matrix which indicates the angular channel between the BS and the uth user, and wu is the hybrid precoding matrix described as \( {\mathbf{F}}_{\mathbf{RF}}{\mathbf{f}}_{\mathbf{BB}}^u \).

If the leakage channel matrix for the uth user is defined as\( {\mathbf{H}}_{ul}^a={\left[{\left({\mathbf{H}}_1^a\right)}^{\mathrm{H}}\cdots {\left({\mathbf{H}}_{u-1}^a\right)}^{\mathrm{H}},\cdots {\left({\mathbf{H}}_{u+1}^a\right)}^{\mathrm{H}}\cdots {\left({\mathbf{H}}_{N_{\mathrm{BS}}}^a\right)}^{\mathrm{H}}\right]}^{\mathrm{H}} \), the formula (29) can be equivalently written as

$$ SLN{R}_u=\frac{{\mathbf{w}}_u^{\mathrm{H}}{\left({\mathbf{H}}_u^a\right)}^{\mathrm{H}}{\mathbf{H}}_u^a{\mathbf{w}}_{\mathrm{u}}}{{\mathbf{w}}_u^{\mathrm{H}}\left({\sigma}^2+{\left({\mathbf{H}}_{ul}^a\right)}^{\mathrm{H}}{\mathbf{H}}_{ul}^a\right){\mathbf{w}}_{\mathrm{u}}}. $$
(30)

(30) is typical generalized Rayleigh ratio for the variable wu. We can obtain that

$$ SLN{R}_u\le {\lambda}_{\mathrm{max}}\left({\left({\sigma}^2\mathbf{I}+{\left({\mathbf{H}}_{ul}^a\right)}^{\mathrm{H}}{\mathbf{H}}_{ul}^a\right)}^{-1}{\left({\mathbf{H}}_u^a\right)}^{\mathrm{H}}{\mathbf{H}}_u^a\right), $$
(31)

where λmax indicates the maximum eigenvalue of a matrix. The maximum value of SLNR is marked as \( {T}_{u,i}=\frac{{\left|{\mathbf{H}}^a\left(u,i\right)\right|}^2}{\sum \limits_{j=1,j\ne i}^{N_{\mathrm{BS}}}{\left|{\mathbf{H}}^a\left(u,j\right)\right|}^2+{\sigma}^2} \), which can be the metric for dominant beam selection.

The physical significance of the high SLNR value is that the equivalent channel gain of the user increases, while the power leaked to other users is still small, which means the interference to other users is small. When each user leaks to other users with less power, the interference to other users is relatively smaller. Hence, the SINR of each user can be large. Then, the achievable sum rate is high according to (14). Following this selection criterion of beams for the uth user, the dominant beams make a direct contribution to SR.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Huang, Y., Liu, C., Song, Y. et al. DFT codebook-based hybrid precoding for multiuser mmWave massive MIMO systems. EURASIP J. Adv. Signal Process. 2020, 11 (2020). https://doi.org/10.1186/s13634-020-00669-4

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s13634-020-00669-4

Keywords