Published in: Telecommunication Systems 2/2024

Open Access 14-12-2023

Low-density parity-check codes: tracking non-stationary channel noise using sequential variational Bayesian estimates

Authors: J. du Toit, J. du Preez, R. Wolhuter


Abstract

We present a sequential Bayesian learning method for tracking non-stationary signal-to-noise ratios in low-density parity-check (LDPC) codes by way of probabilistic graphical models. We represent the LDPC code as a cluster graph using a general purpose cluster graph construction algorithm called the layered trees running intersection property (LTRIP) algorithm. The channel noise estimator is a global gamma cluster, which we extend to allow for Bayesian tracking of non-stationary noise variation. We evaluate our proposed model on real-world 5G drive-test data. Our results show that our model can track non-stationary channel noise accurately while adding performance benefits to the LDPC code, which outperforms an LDPC code with a fixed stationary knowledge of the actual channel noise.

1 Introduction

In wireless communication systems, channel noise interferes with radio transmissions between a transmitter and receiver. The nature of the noise is assumed to be non-stationary, since it can change over time and vary according to environmental dynamics.
Knowledge of channel noise is useful in a communication system to adapt certain radio parameters, ultimately optimising the reliability and overall quality of communication. These optimisation methods include, inter alia, adaptive modulation and coding (AMC), interference suppression, and rate-matching.
Communication systems such as LTE and 5G rely on predefined reference signals, i.e. pilot signals, which are used to estimate the channel noise. These pilot signals are generated in the physical layer and scheduled periodically via the physical uplink control channel (PUCCH), or aperiodically via the physical uplink shared channel (PUSCH) [1]. Instantaneous estimates of the channel noise should ideally be acquired at the same rate as the channel changes, which requires frequent use of pilot signals. The downside of doing so is an additional communication overhead, since the time-frequency resources could be utilised to carry payload data instead [2].
Our interest is in finding a way to estimate the channel noise variance applicable to the additive white Gaussian noise (AWGN) channel without relying on scheduled pilot signals. The aim of our work is not to omit the broader channel state information (CSI) system in its entirety, since it also provides information on other channel properties outside the scope of this work. Instead, our focus is on reliably and dynamically estimating the channel noise variance only, which we see as a step towards substituting certain elements of the CSI system. We propose to combine received signal values from the physical layer with a channel-coding scheme such as a low-density parity-check (LDPC) code. By doing so, we modify a standard LDPC code’s decoding algorithm, which can learn the channel noise on-the-fly and simultaneously benefit from the channel-noise estimates. We are not attempting to design new LDPC codes, but rather explore modifications to the existing LDPC decoding algorithm so that our methods can be applied to generic LDPC codes.
LDPC codes were first introduced by Gallager in 1962 and rediscovered in 1995 by MacKay [3, 4]. These codes are a family of block codes with good theoretical and practical properties. They have found extensive application in various wireless communication systems, including the 5G New Radio (NR) technology standard, which will be used in our study.
While knowledge of the channel noise is useful for aspects of the communication system discussed before, LDPC decoders also benefit from this information. LDPC codes require knowledge of the channel noise variance when the communication channel is assumed to be an AWGN channel [5]. Channel noise variance corresponds to the channel signal-to-noise ratio (SNR), which we will use interchangeably in this paper to refer to the channel noise. If the noise variance is over- or understated, the performance of the LDPC decoder can suffer [6–8].
Alternative solutions, including the focus of our work, aim to estimate SNRs in the forward error correction (FEC) layer while concurrently decoding received messages. The rationale is that through the use of LDPC codes, the channel uncertainty can be made explicit in the LDPC decoder rather than assuming it to have a fixed “confident” a priori value. By making the decoder “aware” of its uncertainty (i.e. allowing it to learn the channel constraints), it may also improve the error-correcting capability of the LDPC code. We use a Bayesian approach that takes into account a statistical distribution over the SNR, which is also a much better realisation of the channel in terms of its stochastic time-varying nature.
Iterative message passing algorithms used in LDPC codes such as bit-flipping use binary (hard-decision) messages between variable nodes and check nodes to update the received bit values in a factor graph (also known as a Tanner graph). While this decoding method is fast, it does not provide probabilistic information about the received bits, which is required to estimate an SNR statistic in AWGN channels. The sum-product algorithm uses probabilistic (soft-decision) messages and is equivalent to the loopy belief propagation (LBP) algorithm used for performing generic inference tasks on probabilistic graphical models (PGMs) [9, 10]. However, as will become clear in Sect. 2, modelling SNRs in AWGN channels results in the use of conditional Gaussian distributions that, during subsequent message passing, morph into mixture distributions—these have problematic properties for inference that force us to employ approximation techniques.
As such, studies in [11–17] have proposed hybrid methods for jointly modelling the SNRs and the transmitted bit messages. These methods are all based on factor graphs with some graph extension that uses an inference scheme different from LBP for estimating the SNRs. A study in [14] presents a comparison between a variational message passing (VMP) based estimator and an expectation maximisation (EM) based estimator for stationary SNRs. The VMP-based estimator demonstrated superior performance over the EM-based estimator and achieved a lower frame error rate, with no significant increase in computational complexity. In [11, 16, 17] SNRs are assumed stationary within fixed independent sections of the LDPC packet with a focus on finding a low complexity noise estimator. These studies do not model the sequential dynamics of channel noise variance, e.g. the possible correlations between inferred instances of SNRs.
It is reasonable to assume that the same noise energy influences an entire LDPC packet for the duration of its transmission, given the speed at which communication takes place. The basic transmission time interval in LTE systems, for example, is 1 ms [18]. Furthermore, a succeeding LDPC packet’s noise estimate can also depend on the noise induced on previous LDPC packets. We introduce a dependency between LDPC packets that gradually “forgets” previous channel noise estimates. This allows an LDPC decoder to track time-varying SNRs as packets arrive at the receiver.
Initially, we used a standard factor graph representation with an additional continuous random variable to model the SNR at a packet level. The larger parity-check factors from the LDPC code created computational challenges, which led us to review other options for representing the problem and doing inference. To address this, we (1) use a more general graphical structure called cluster graphs, (2) use a variant of LBP called loopy belief update (LBU), (3) create a message order that relaxes repeated marginalisation from large parity-check factors, and (4) introduce a cluster-stopping criterion that turns off uninformative clusters during message passing. Cluster graphs have been shown to outperform factor graphs in some inference tasks in terms of accuracy and convergence properties [10, 19, 20].
Our contribution: We view this problem more generally as a PGM and introduce a sequential Bayesian learning technique capable of tracking non-stationary SNRs over time. We represent the PGM as a cluster graph compiled by means of a general purpose algorithm called layered trees running intersection property (LTRIP), developed in [20]. We believe this is the first work that represents LDPC codes as a cluster graph, which may be due to a lack of available algorithms that can construct valid cluster graphs. We demonstrate: (1) tracking of non-stationary channel noise over time, and (2) performance benefits of our approach compared to an LDPC code with stationary knowledge of the channel as well as an LDPC code with perfect knowledge of the channel.
This paper is structured as follows: In Sect. 2, we explain how LDPC codes are represented in a more general PGM framework and extended with the channel noise estimator. Our message passing approach and schedule are also explained. Section 3 describes how the PGM is updated sequentially, which allows the model to track non-stationary channel noise. The results are shown in Sect. 4 and in Sect. 5 we present a supplementary investigation. Finally, our conclusions and future work are discussed in Sect. 6.

2 LDPC codes with non-stationary SNR estimation as PGMs

In this section, we compile a cluster graph of an LDPC code and introduce additional random variables that are used to estimate the channel noise. This is extended further to enable the tracking of non-stationary channel noise. We also discuss our hybrid inference approach and message passing schedule.

2.1 Channel SNR estimation with LDPC codes

We assume an AWGN channel model (without fading), which adds zero mean random Gaussian noise to a transmitted signal. This can be defined by the following input–output relationship:
$$\begin{aligned} X = B + W, \end{aligned}$$
(1)
where X is a random variable representing the received signal at the receiver, B is a random variable representing the transmit signal at the transmitter, and \(W \sim \mathcal {N}(0, \gamma )\) is the added Gaussian distributed noise. See [9, Section 11.1] for a more formal definition.
The strength of the noise depends on the Gaussian precision \(\gamma \) (the inverse of the variance). We denote the LDPC code’s bit sequence as \(b_0,\ldots ,b_{N-1}\), where N is the length of the codeword. We use BPSK signal modulation with unit energy per bit \(E_b = 1\) (i.e. a normalised signal). The channel-noise precision is an unknown quantity for which we assign a gamma prior distribution—a Bayesian way of treating unknown variables. The gamma distribution is the conjugate prior for the precision of a Gaussian distribution and is part of the exponential family [21, Section 2.3.6]. With these assumptions, the observed received signal \(x_n\) is modelled using a conditional Gaussian likelihood function \(f(x_n \mid b_n,\mu ,\gamma )\) with known means \(\mu \in \{\mu _{0}=-\sqrt{E_b}=-1,\ \mu _{1}=\sqrt{E_b}=1\}\) that represent the modulated binary data (i.e. the two different phases of the carrier wave). Since the same noise influences both states of b, the received signal’s likelihood function simplifies to:
$$\begin{aligned} f(x_{n} \mid b_{n},\gamma ) = {\left\{ \begin{array}{ll} \sqrt{\frac{\gamma }{2\pi }}e^{-\frac{\gamma (x_n + 1)^2}{2}} &{} {\text {when }} b_n = 0, \\ \sqrt{\frac{\gamma }{2\pi }}e^{-\frac{\gamma (x_n - 1)^2}{2}} &{} {\text {when }} b_n = 1. \end{array}\right. } \end{aligned}$$
(2)
After multiplying in the prior distributions over \(b_n\) and \(\gamma \), the presence of the \(b_n\) terms in the joint changes this to a mixture distribution which does not form part of the exponential family. This implies that its marginals towards the \(\gamma \) and \(b_n\) variables will not be conjugate towards their prior forms. To rectify this requires some form of approximation that forces these marginals to the required conjugate forms. In the work here we do this by using the VMP approach—thereby replacing the sufficient statistics of \(\gamma \) and \(b_n\) random variables with their expected values. This leaves Eq. 2 in its conditional form [22, Section 4.3]—Sect. 2.3.1 provides the details about this.
The channel noise is assumed independent and identically distributed (i.i.d.) over the length N of an LDPC packet (the codeword length). The transmission duration of a packet is around 1 ms [18], which we assume provides sufficient samples for estimating the noise, and is a short enough time frame for non-stationarity to not be a problem. The channel-noise precision can be translated to a rate-compensated SNR given by \({\text {SNR}}_{\textrm{dB}} = 10 \log _{10}(\frac{E_b\gamma }{2R})\), where R is the code rate [9, Section 11.1] and \(\gamma \) is the channel-noise precision. Note that we may use the terms precision and SNR interchangeably. The next section presents LDPC codes with SNR estimation using the gamma prior and conditional Gaussian random variables discussed here.
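To make this mapping concrete, the following minimal Python sketch (ours, not from the paper's implementation) converts between the channel-noise precision \(\gamma \) and the rate-compensated SNR, assuming \(E_b = 1\) and \(R = 0.5\) as used in our experiments:

```python
# Minimal sketch: precision <-> rate-compensated SNR of Sect. 2.1.
# Assumptions: E_b = 1 (normalised BPSK) and code rate R = 0.5.
import math

def precision_to_snr_db(gamma: float, rate: float = 0.5, eb: float = 1.0) -> float:
    """SNR_dB = 10*log10(E_b * gamma / (2R))."""
    return 10.0 * math.log10(eb * gamma / (2.0 * rate))

def snr_db_to_precision(snr_db: float, rate: float = 0.5, eb: float = 1.0) -> float:
    """Inverse mapping: gamma = 2R * 10^(SNR_dB/10) / E_b."""
    return 2.0 * rate * 10.0 ** (snr_db / 10.0) / eb

# With R = 0.5, a precision of 8.76 (used in Sect. 4) maps to ~9.42 dB.
print(precision_to_snr_db(8.76))
```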

2.2 Representation of LDPC codes with SNR estimation

During the course of this study, we noted that cluster graph representations of LDPC codes are not addressed in the available channel coding literature. Researchers in the channel coding domain may be more familiar with the factor graph (or Tanner graph) representation of LDPC codes. While our study focuses on channel noise estimation, we address the novelty and performance advantages of cluster graph LDPC codes compared to factor graph LDPC codes in a subsequent study [23]. Nonetheless, in the interest of readability, we offer a brief summary of cluster graphs here and defer the detailed treatment to that study.
A cluster graph is an undirected graph consisting of two types of nodes. A cluster node (ellipse) is a set of random variables and a sepset (short for “separation set”) node (square) is a set of random variables shared between a pair of clusters. In most interesting cases (which also include LDPC decoding), this graph will not be a tree structure, but will contain loops. Inference on such a “loopy” system requires the so-called running intersection property (RIP) [10, Section 11.3.2]. This specifies that any two clusters containing a shared variable must be connected via a unique sequence of sepsets all containing that particular variable, i.e. no loops are allowed for any particular variable.
Our study uses a general purpose cluster graph construction algorithm that produces a cluster graph of the LDPC code. This algorithm is termed the layered trees running intersection property (LTRIP) algorithm developed in [20]. The LTRIP algorithm proceeds by processing layers, with each layer dedicated to an individual variable. For each variable, it determines an optimal tree structure over all clusters. The final sepsets are formed by merging the sepsets between pairs of clusters across all layers. The resulting cluster graph satisfies the RIP. Compared to factor graphs, cluster graphs offer the benefits of more precise inference and faster convergence. We refer the reader to [20] for more detail regarding the LTRIP algorithm and [23] for a comparison study between cluster graph and factor graph LDPC codes.
For LDPC codes, each cluster contains a parity-check factor equivalent to a parity-check constraint in the original parity-check matrix. To illustrate this, we use an irregular (16,8) LDPC code with \(\textbf{H}\) matrix given in Eq. 3. Note that this LDPC code is for illustrative purposes only; we use larger standardised LDPC codes for our simulations. We denote the message bit sequence as \(b_0,\ldots ,b_7\), the parity-check bits as \(b_8,\ldots ,b_{15}\), and the parity-check factors as \(\phi _{0},\ldots ,\phi _{7}\). The cluster graph of the (16,8) LDPC code is shown in Fig. 1 in plate notation.
$$\begin{aligned} \textbf{H} = \left[ \begin{array}{llllllllllllllll} 0 &{} 1 &{} 1 &{} 1 &{} 0 &{} 0 &{} 1 &{} 1 &{} 1 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} 1 &{} 1 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 \\ 1 &{} 1 &{} 0 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} 0 &{} 0 &{} 1 &{} 1 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 0 &{} 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 \end{array}\right] \end{aligned}$$
(3)
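As a concrete illustration (our own sketch, not part of the decoder), the \(\textbf{H}\) matrix of Eq. 3 can be written out directly and used for the syndrome check that any valid codeword must satisfy; the function name syndrome_ok is ours:

```python
# The (16,8) H matrix of Eq. 3 and the even-parity syndrome check.
import numpy as np

H = np.array([
    [0,1,1,1,0,0,1,1,1,0,1,0,0,0,0,0],
    [1,0,0,0,1,0,0,0,0,1,0,1,0,0,0,0],
    [0,0,0,1,0,0,0,0,0,0,1,0,1,0,0,0],
    [1,0,0,0,1,1,1,1,0,0,0,1,0,0,0,0],
    [0,0,0,1,0,1,0,0,0,1,0,0,1,0,0,0],
    [1,1,0,0,1,0,0,0,1,0,0,0,0,1,0,0],
    [0,0,1,0,0,1,1,0,1,0,0,0,0,0,1,0],
    [0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,1],
], dtype=np.uint8)

def syndrome_ok(codeword: np.ndarray) -> bool:
    """True when every parity-check factor phi_j sees even parity."""
    return not np.any((H @ codeword) % 2)

# The all-zeros word is a valid codeword of any linear code:
print(syndrome_ok(np.zeros(16, dtype=np.uint8)))  # True
```

The rows of \(\textbf{H}\) correspond to the parity-check factors \(\phi _{0},\ldots ,\phi _{7}\); for example, row 1 places \(\phi _1\) over \(b_0, b_4, b_9, b_{11}\), matching the cluster scope used in Sect. 2.3.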
The gamma prior belief is captured by the global cluster \(\zeta (\gamma )\) (outside the plate) that is fully connected to all the observed conditional Gaussian clusters \(\theta _{0}(x_{0}\mid b_{0},\gamma ),\ldots ,\theta _{15}(x_{15}\mid b_{15},\gamma )\) via the \(\gamma \) sepsets. The conditional Gaussian clusters connect to the first layer of parity-check clusters of the LDPC code via sepsets \(b_0,\ldots ,b_{15}\). The structure inside the plate repeats as LDPC packets \(p \in P\) arrive at the receiver.
Our study focuses on irregular LDPC codes as adopted by the 5G new radio (NR) standard by the 3rd Generation Partnership Project (3GPP). Irregular LDPC codes are also known to outperform regular LDPC codes [5, 9]. With regular LDPC codes, all parity-check factors have equal cardinality, which is not the case for irregular LDPC codes. Ideally, the larger parity-check factors in irregular LDPC codes should have minimal connectivity, which helps reduce expensive computation during inference. This is not possible to do in a factor graph representation, as the number of edge connections is equal to the cardinality of each parity-check factor. Instead, we bundle the larger parity-check clusters away from the observed conditional Gaussian clusters using a message passing schedule. The message passing schedule is described in Sect. 2.3.4. Note that no connections are present between observed conditional Gaussian clusters and the large parity-check clusters \(\phi _0\) and \(\phi _3\) as shown in Fig. 1. The observed conditional Gaussian clusters are linked to the smaller parity-check clusters \(\phi _1\), \(\phi _2\), \(\phi _4\), \(\phi _5\), \(\phi _6\), and \(\phi _7\), leading to considerable computational savings.

2.3 Message passing approach

This section describes variational message passing (VMP) between the gamma and conditional Gaussian nodes, loopy belief update (LBU) between the parity-check nodes and a hybrid message passing approach between the conditional Gaussian nodes and parity-check nodes. All the required update equations and rules are discussed. We also discuss the message passing schedule mentioned previously, which alleviates expensive parity-check computation.

2.3.1 Variational message passing

This section concerns the message passing between the various conditional Gaussian clusters and the gamma cluster. VMP makes a distinction between parent-to-child messages and child-to-parent messages [22]. We deal with the parent-to-child messages that update the conditional Gaussian clusters first.
The gamma distribution and its expected moments are required for the parent-to-child messages from the \(\gamma \) cluster to the conditional Gaussian clusters \(\theta _n\). We use the following parameterisation of the gamma distribution:
$$\begin{aligned}{} & {} {\text {Gam}}(\gamma \mid \omega ,\nu ) = \frac{(\frac{1}{2\omega })^\frac{\nu }{2}}{\Gamma (\frac{\nu }{2})} \gamma ^{\frac{\nu }{2} - 1}e^{-\frac{\gamma }{2\omega }}, \end{aligned}$$
(4)
where \(\nu \) is the degrees of freedom (equivalent to the number of observations that convey a “confidence” in the mean value), \(\omega \) is a scaled precision, and \(\Gamma \) is the gamma function.
In exponential family form transformed to the log-domain this is given as:
$$\begin{aligned} \log ({\text {Gam}}(\gamma \mid \omega ,\nu ))&= \begin{bmatrix} -\frac{1}{2\omega } \\ \frac{\nu }{2} - 1 \end{bmatrix}^{\textrm{T}} \begin{bmatrix} \gamma \\ \log (\gamma ) \end{bmatrix} -(\log (\Gamma (\frac{\nu }{2}))\nonumber \\ {}&+ \frac{\nu }{2}\log (2\omega )). \end{aligned}$$
(5)
In this form, the left column vector is the natural parameters of the distribution and the right column vector is the sufficient statistics of the distribution. The expected moment vector of the sufficient statistics is [21, Appendix B]:
$$\begin{aligned} \left\langle \gamma \right\rangle , \left\langle \log (\gamma ) \right\rangle&= \begin{bmatrix} \nu \omega \\ \psi (\frac{\nu }{2}) - \log (\frac{1}{2 \omega }) \end{bmatrix}^{\textrm{T}}, \end{aligned}$$
(6)
where \(\psi \) is the digamma-function.
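A small sketch of this computation, assuming SciPy's digamma and our own function naming:

```python
# Expected sufficient statistics of Gam(gamma | omega, nu) per Eq. 6.
import math
from scipy.special import digamma

def gamma_moments(omega: float, nu: float) -> tuple[float, float]:
    """Return (<gamma>, <log(gamma)>) under the parameterisation of Eq. 4."""
    mean = nu * omega                                             # <gamma>
    log_mean = digamma(nu / 2.0) - math.log(1.0 / (2.0 * omega))  # <log(gamma)>
    return mean, log_mean
```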
The other parent-to-child messages required to update the conditional Gaussian clusters are from the parity-check clusters. To understand these, we first need the categorical distribution in its exponential family form transformed to the log-domain given by [24, Section 2.3.1]:
$$\begin{aligned} \log (p(b \mid \pi _{0},\pi _{1})) = \begin{bmatrix} \log (\pi _0) \\ \log (\pi _1) \end{bmatrix}^{\textrm{T}} \begin{bmatrix} \llbracket b=0 \rrbracket \\ \llbracket b=1 \rrbracket \end{bmatrix} - \log (\sum _{i=0,1} \pi _{i}), \end{aligned}$$
(7)
where \(\pi _{i}\) are the unnormalised bit probabilities. The expected moment vector of the sufficient statistics is:
$$\begin{aligned} \left\langle \llbracket b=0 \rrbracket \right\rangle , \left\langle \llbracket b=1 \rrbracket \right\rangle&= \begin{bmatrix} \frac{\pi _0}{\sum _{i=0,1}\pi _{i}} \\ \frac{\pi _1}{\sum _{i=0,1}\pi _{i}} \end{bmatrix}^{\textrm{T}}. \end{aligned}$$
(8)
We choose the gamma distribution prior parameters by considering its mean value, given by \(\nu \times \omega \), and we think of \(\nu \) as the number of observations that its mean value is based on. Since the gamma distribution is conjugate to the Gaussian distribution, the conditional Gaussian likelihood from Eq. 2 can be rewritten into a similar form. The Gaussian precision is common to both cases of b and is therefore updated as a single entity (unlike the standard Gaussian mixture model). The exponential family form of the conditional Gaussian distribution transformed to the log-domain is given by:
$$\begin{aligned} \log (p(x_n \mid b_n,\gamma ))&= \begin{bmatrix} \sum _{i} \llbracket b_n=i \rrbracket (\gamma \mu _{i}) \\ \sum _{i} \llbracket b_n=i \rrbracket (\frac{-\gamma }{2}) \end{bmatrix}^{\textrm{T}} \begin{bmatrix} x_{n} \\ x^{2}_{n} \end{bmatrix} \nonumber \\&+ \frac{1}{2}\sum _{i} \llbracket b_n=i \rrbracket (\log (\gamma ) - \gamma \mu ^{2}_{i} - \log (2\pi )). \end{aligned}$$
(9)
The \(x_n\) terms are the observed signal values and the Gaussian means \(\mu _i\) are known and can be replaced by their fixed values \(\mu _{0}=-1\) and \(\mu _{1}=1\).
During the parent-to-child update from the gamma cluster, the \(\gamma \) and \(\log (\gamma )\) terms in Eq. 9 will be replaced with their expected values given by Eq. 6. This is shown by:
$$\begin{aligned} \mu ^{'}_{\zeta ,\theta _n}&= [\left\langle \gamma \right\rangle ,\left\langle \log (\gamma ) \right\rangle ]^{\textrm{T}}, \end{aligned}$$
(10)
$$\begin{aligned} \Psi ^{'}_{\theta _n}&= \begin{bmatrix} \sum _{i} \llbracket b_n=i \rrbracket (\left\langle \gamma \right\rangle \mu _{i}) \\ \sum _{i} \llbracket b_n=i \rrbracket (\frac{-\left\langle \gamma \right\rangle }{2}) \end{bmatrix}^{\textrm{T}} \begin{bmatrix} x_{n} \\ x^{2}_{n} \end{bmatrix} \nonumber \\&+ \frac{1}{2}\sum _{i} \llbracket b_n=i \rrbracket (\left\langle \log (\gamma )\right\rangle - \left\langle \gamma \right\rangle \mu ^{2}_{i} - \log (2\pi )). \end{aligned}$$
(11)
Similarly, during the parent-to-child update from a parity-check cluster, the Iverson bracket \(\llbracket b_n=i \rrbracket \) in Eq. 9 will be replaced with its expected value.
The required expected value from a parity-check cluster reduces to estimating the bit probabilities \(P(b_n=i)\). We discuss the replacement of the Iverson bracket with expected values from parity-check clusters in Sect. 2.3.3 which deals with the hybrid message passing.
The expected values that replace the \(\gamma \) and \(\llbracket b_n=i \rrbracket \) terms in Eq. 9 are based on the latest beliefs from the gamma cluster distribution and the parity-check cluster distribution. After the expected values are installed, Eq. 9 forms a mixture of two gamma distributions since \(x_n\) is observed and \(\mu _i\) are known values. Equation 9 is kept in a conditional form rather than a higher dimensional joint form.
Messages from conditional Gaussian clusters to parity-check clusters are child-to-parent messages. We also postpone discussing these message updates to Sect. 2.3.3 (that deals with hybrid message passing).
Messages from conditional Gaussian clusters to the gamma cluster are child-to-parent messages that require Eq. 9 to be marginalised towards obtaining gamma distribution parameters. The terms of Eq. 9 are re-arranged to obtain the appropriate parameters given by:
$$\begin{aligned} \log (p(x_n \mid b_n,\gamma ))&= \begin{bmatrix} \sum _{i} \llbracket b_n=i \rrbracket (-\frac{1}{2}(x_{n} - \mu _i)^2) \\ \sum _{i} \llbracket b_n=i \rrbracket (\frac{1}{2}) \end{bmatrix}^{\textrm{T}} \begin{bmatrix} \gamma \\ \log (\gamma ) \end{bmatrix} \nonumber \\&- \sum _{i} \llbracket b_n=i \rrbracket (\log (2\pi )) . \end{aligned}$$
(12)
We note in Eq. 12 that for child-to-parent messages, the separation of the sufficient statistic vector during marginalisation removes the previous parent-to-child message from the updated natural parameter vector. With \(\llbracket b_n=i \rrbracket \) replaced by its expected value, \(x_n\) replaced with the observed signal value, and \(\mu _{0}=-1\) and \(\mu _{1}=1\), the left column gives the incremental update with which the prior gamma parameters need to increase to form the posterior. This is shown by:
$$\begin{aligned} \mu ^{'}_{\theta _n,\zeta }&= \begin{bmatrix} \sum _{i=0,1} \left\langle \llbracket b_n=i \rrbracket \right\rangle (-\frac{1}{2}(x^{2}_{n} - 2 x_n \mu _i + \mu ^{2}_i)) \\ \sum _{i=0,1} \frac{1}{2} \left\langle \llbracket b_n=i \rrbracket \right\rangle \end{bmatrix}^{\textrm{T}}, \end{aligned}$$
(13)
$$\begin{aligned} \Psi _{{\zeta }_{\textrm{prior}}}&= \begin{bmatrix} -\frac{1}{2\omega } \\ \frac{\nu }{2} - 1 \end{bmatrix}^{\textrm{T}}, \end{aligned}$$
(14)
$$\begin{aligned} \Psi ^{'}_{\zeta }&= \Psi _{{\zeta }_{\textrm{prior}}} + \sum _{n} \mu ^{'}_{\theta _n,\zeta }. \end{aligned}$$
(15)
The expected values of the updated gamma cluster can now be recalculated using Eq. 6 to be used for updating the conditional Gaussian clusters in the next round.
With this background, we demonstrate how VMP is applied (using a sub-graph extracted from Fig. 1) in Fig. 2.
Cluster \(\theta _0(x_0\mid b_0,\gamma )\) is updated with information from cluster \(\zeta (\gamma )\) using a VMP parent-to-child message \(\mu _{\zeta ,\theta _0}\). This message is the expected values of \(\log ({\text {Gam}}(\gamma \mid \omega ,\nu ))\) given by Eq. 6. The expected values are installed at the corresponding \(\gamma \) terms in Eq. 9. The message from cluster \(\theta _0(x_0\mid b_0,\gamma )\) to cluster \(\zeta (\gamma )\) is a VMP child-to-parent message \(\mu _{\theta _0,\zeta }\). This requires cluster \(\theta _0(x_0\mid b_0,\gamma )\) to be marginalised in obtaining a distribution over sepset \(\gamma \) with its parameters given by the natural parameter vector in Eq. 12. The message is absorbed into cluster \(\zeta (\gamma )\) by adding its natural parameters to those of the prior (as given in Eq. 5).
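The following sketch (with our own naming, under the parameterisation of Eq. 4) illustrates how the increments of Eq. 13 accumulate onto the prior natural parameters as in Eqs. 14 and 15:

```python
# VMP child-to-parent updates into the gamma cluster (Eqs. 13-15).
# Assumptions: x holds the observed signal values of one packet, p_b1 holds
# the expected probabilities P(b_n = 1) taken from the parity-check side.
import numpy as np

def gamma_posterior_natural(prior_omega, prior_nu, x, p_b1, mu0=-1.0, mu1=1.0):
    """Return the updated natural parameters [-1/(2*omega), nu/2 - 1]."""
    eta = np.array([-1.0 / (2.0 * prior_omega), prior_nu / 2.0 - 1.0])  # Eq. 14
    p_b1 = np.asarray(p_b1, float)
    p_b0 = 1.0 - p_b1
    # Eq. 13, summed over all bits n (Eq. 15):
    eta[0] += np.sum(-0.5 * (p_b0 * (x - mu0) ** 2 + p_b1 * (x - mu1) ** 2))
    eta[1] += 0.5 * len(x)  # each bit contributes (1/2)(p_b0 + p_b1) = 1/2
    return eta

def natural_to_omega_nu(eta):
    """Invert eta = [-1/(2*omega), nu/2 - 1] back to (omega, nu)."""
    return -1.0 / (2.0 * eta[0]), 2.0 * (eta[1] + 1.0)
```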

2.3.2 LBU message passing

This section concerns the message passing between the various parity-check clusters; here we make use of the Lauritzen–Spiegelhalter message passing algorithm [25], also known as the loopy belief update (LBU) algorithm. However, the fundamental concepts are easier to define via the well known Shafer–Shenoy algorithm [26], also known as the sum-product or loopy belief propagation (LBP) algorithm. Hybrid message passing, linking the VMP and the LBU sections of the graph, also draws on understanding the relationship between LBP and LBU. We therefore provide a brief summary of both LBP and LBU here; a fuller version is available in the definitive handbook by Koller & Friedman [10, Sections 10.2, 10.3 and 11.3].
We use Fig. 3 to illustrate the various relevant concepts. As discussed in Sect. 2.2, cluster graphs contain cluster nodes and sepsets, both of which are used during message passing to update nodes. The cluster internal factor functions are given by \(\phi _a\) and \(\phi _b\) where a and b identify the particular cluster (in our application these functions are conditional distributions enforcing even parity over all the variables involved in each parity-check cluster). We will at times somewhat abuse the notation by also using these factor functions directly to identify their particular clusters.
The sepset connecting the two clusters a and b is denoted as \(\mathcal {S}_{a,b}\) and comprises the collection of random variables about which the two clusters will exchange information. The notation \(\backslash a\) is used to indicate the set of all cluster nodes excluding cluster node a.
For LBP message passing, the message passed from cluster a to cluster b is denoted as \(\mu _{a,b}\). The product of all other messages incoming to cluster node a (excluding message \(\mu _{b,a}\)) is denoted by \(\mu _{\backslash b,a}\). If anything changes in the messages \(\mu _{\backslash b,a}\), we can propagate this towards cluster b via the update equation:
$$\begin{aligned} \mu ^{'}_{a,b} = \sum _{\backslash \mathcal {S}_{a,b}} \mu _{\backslash b,a} \phi _{a}, \end{aligned}$$
(16)
where the marginalisation sum over the set \(\backslash \mathcal {S}_{a,b}\) removes all variables not present in the sepset \(\mathcal {S}_{a,b}\). Note that this implies that the larger a sepset, the cheaper the required marginalisation will be—another benefit of cluster graphs over factor graphs which always have only single variables in their sepsets.
Also note that, especially with clusters sharing links with many others, the LBP formulation can be quite expensive due to the redundancy implied by the various message combinations present in the \(\mu _{\backslash b,a}\) term.
In a loopy system (such as we will typically have with LDPC codes) these \(\mu \) messages will have to be passed iteratively according to some schedule until they have all converged. At that point, we can then get an estimate for the marginal distribution of all the random variables present in a particular cluster—this is known as the cluster belief and is given by:
$$\begin{aligned} \Psi _{a} = \phi _{a} \mu _{b,a} \mu _{\backslash b,a}, \end{aligned}$$
(17)
i.e. it is the product of the cluster internal factor with all incoming messages. Similarly, we can calculate the belief over the sepset variables \(\mathcal {S}_{a,b}\) as the product of the two opposing messages passing through that sepset:
$$\begin{aligned} \psi _{a,b} = \mu _{a,b} \mu _{b,a}. \end{aligned}$$
(18)
Note that the sepset beliefs, being the product of the two \(\mu _{a,b}\) and \(\mu _{b,a}\) messages, are intrinsically directionless.
In contrast to LBP, LBU message passing is expressed fully in terms of only cluster beliefs and sepset beliefs. For this we use two alternative (although equivalent) expressions for these quantities (see [10, Section 10.3] for the derivations). The sepset belief update is given by:
$$\begin{aligned} \psi ^{'}_{a,b}&= \sum _{\backslash \mathcal {S}_{a,b}} \Psi ^{'}_a. \end{aligned}$$
(19)
Note that this is computationally more efficient since it avoids the repeated re-combination of the various messages present in the \(\mu _{\backslash b,a}\) term in Eq. 16 when considering different target clusters b. Using this we can now incrementally update the cluster belief using:
$$\begin{aligned} \Psi ^{'}_{b}&= \Psi _b\frac{\psi ^{'}_{a,b}}{\psi _{a,b}}. \end{aligned}$$
(20)
With this background, we can now turn to Fig. 4 (a sub-graph extracted from Fig. 1) to illustrate how LBU applies to the parity-check clusters. Cluster node \(\phi _4\) is updated using LBU message passing. Using Eq. 19, the updated sepset belief is calculated by marginalising the cluster belief \(\Psi _{\phi _1}\) to obtain the required sepset \(b_9\). The previous sepset belief is cancelled (divided) as per the LBU update rules given by Eq. 20. This gives the update equations as:
$$\begin{aligned} \psi ^{'}_{\phi _1,\phi _4}&= \sum _{b_0,b_4,b_{11}} \Psi _{\phi _1}, \end{aligned}$$
(21)
$$\begin{aligned} \Psi ^{'}_{\phi _4}&= \Psi _{\phi _4} \frac{\psi ^{'}_{\phi _1,\phi _4}}{\psi _{\phi _1,\phi _4}}. \end{aligned}$$
(22)
Unlike VMP, the form of the update rules remains the same regardless of the message direction. We therefore simply iterate these messages until convergence using the message passing schedule of Sect. 2.3.4.
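To make this concrete, here is an illustrative sketch of Eqs. 21 and 22 for the single-variable sepset \(b_9\), assuming cluster beliefs stored as dense NumPy arrays with one binary axis per bit variable (our representation; the paper's implementation uses sparse tables):

```python
# LBU update over a single-bit sepset (Eqs. 19-22).
import numpy as np

def lbu_single_var(psi_src, keep_axis, psi_dst, dst_axis, old_sepset):
    """Marginalise psi_src onto one shared bit, then update psi_dst."""
    # Eq. 19/21: sum out every axis of the source belief except the shared bit.
    drop = tuple(ax for ax in range(psi_src.ndim) if ax != keep_axis)
    new_sepset = psi_src.sum(axis=drop)
    # Eq. 20/22: divide out the previous sepset belief, multiply in the new one.
    ratio = np.divide(new_sepset, old_sepset, out=np.zeros(2), where=old_sepset > 0)
    shape = [1] * psi_dst.ndim
    shape[dst_axis] = 2
    return psi_dst * ratio.reshape(shape), new_sepset

# Example: psi_{phi_1} over (b_0, b_4, b_9, b_11) updates psi_{phi_4} over
# (b_3, b_5, b_9, b_12); b_9 is axis 2 in both tables.
psi_1 = np.random.rand(2, 2, 2, 2)
psi_4 = np.random.rand(2, 2, 2, 2)
psi_4_new, sepset = lbu_single_var(psi_1, 2, psi_4, 2, old_sepset=np.ones(2))
```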

2.3.3 Hybrid message passing

This section describes our hybrid message passing between the VMP-based conditional Gaussian cluster nodes and the LBU-based parity-check nodes connected to it. Although our treatment is general and applies to any \(\{b_n\}\), we will specifically focus on the \(\mu _{\theta _0,\phi _1}\) and \(\mu _{\phi _1,\theta _0}\) messages running over the \(\{b_0\}\) sepset in Fig. 5 (which in its turn is a sub-graph extracted from Fig. 1).
We first consider the child-to-parent message \(\mu ^{'}_{\theta _0,\phi _1}\) running from the \(\theta _0(x_0\mid b_0,\gamma )\) cluster to the \(\phi _1(b_0,b_4,b_9,b_{11})\) cluster. According to VMP message passing we determine its natural parameters by re-arranging Eq. 9 so that the \(b_0\) related terms move to the sufficient statistics column:
$$\begin{aligned} \log (p(x_0 \mid b_0,\gamma )) =&\begin{bmatrix} \gamma \mu _{0}x_0 - \frac{\gamma x_0^{2}}{2} + \frac{(\log (\gamma ) - \gamma \mu _{0}^{2} - \log (2\pi ))}{2} \\ \gamma \mu _{1}x_0 - \frac{\gamma x_0^{2}}{2} + \frac{(\log (\gamma ) - \gamma \mu _{1}^{2} - \log (2\pi ))}{2} \end{bmatrix}^{\textrm{T}} \nonumber \\ {}&\begin{bmatrix} \llbracket b_0=0 \rrbracket \\ \llbracket b_0=1 \rrbracket \end{bmatrix}. \end{aligned}$$
(23)
From our model definition, the values of the means are known as \(\mu _0=-1\) and \(\mu _1=1\). The expected values of the \(\gamma \) terms are known via Eq. 6. The natural parameters column (the left-hand one) effectively measures the heights of the two Gaussians corresponding to \(b_0\) being either 0 or 1. Similar to what we saw before with Eq. 12, we note that for child-to-parent messages, the separation of the sufficient statistic vector during marginalisation effectively removes the previous parent-to-child message from the updated natural parameter vector.
The question now is how this is to be made compatible with the corresponding \(\phi _1\) cluster which, of course, will also be updated via sepset beliefs from its other side (connecting to other parity-check clusters using Eq. 20). To reconcile these worlds, we start by comparing Eqs. 16 and 17 and noticing that we can also determine the (original) update message using:
$$\begin{aligned} \mu ^{'}_{a,b} = \sum _{\backslash \mathcal {S}_{a,b}} \Psi _a / \mu _{b,a}. \end{aligned}$$
(24)
Due to that (explicit) division, we refer to this as the post-cancellation message, i.e. the message we derive by marginalising the cluster belief and then removing the opposite direction message via division. But this strongly reminds us of the child-to-parent message of Eq. 23 which also (implicitly) accomplishes removing the influence of its parent. We therefore set our updated \(\mu ^{'}_{\theta _0,\phi _1}\) to be a categorical message with natural parameters given by the corresponding left-hand column of Eq. 23.
Next, we need to determine how to incrementally update the cluster belief \(\Psi _{\phi _1}\) with such a post-cancellation message. We do this via Eq. 17 (but now with \(\Psi _b\) in mind) to get the cluster belief update equation:
$$\begin{aligned} \Psi ^{'}_{b}&= \phi _{b} \mu ^{'}_{a,b} \mu _{\backslash a,b}, \nonumber \\&= \Psi _{b} \frac{\mu ^{'}_{a,b}}{\mu _{a,b}}. \end{aligned}$$
(25)
In short, we can directly update an LBU cluster belief by dividing the old post-cancellation message out and multiplying the new one in. This gives us:
$$\begin{aligned} \Psi ^{'}_{\phi _{1}}&= \Psi _{\phi _{1}}\frac{\mu ^{'}_{\theta _0,\phi _1}}{\mu _{\theta _0,\phi _1}}. \end{aligned}$$
(26)
The child-to-parent update rules for the general case that connects a conditional Gaussian cluster \(\theta _n\) to a parity-check cluster \(\phi _j\) are:
$$\begin{aligned} \mu _{\theta _n,\phi _j}&= \begin{bmatrix} \left\langle \gamma \right\rangle \mu _{0}x_n - \frac{\left\langle \gamma \right\rangle x_n^{2}}{2} + \frac{(\left\langle \log (\gamma )\right\rangle - \left\langle \gamma \right\rangle \mu _{0}^{2} - \log (2\pi ))}{2} \\ \left\langle \gamma \right\rangle \mu _{1}x_n - \frac{\left\langle \gamma \right\rangle x_n^{2}}{2} + \frac{(\left\langle \log (\gamma )\right\rangle - \left\langle \gamma \right\rangle \mu _{1}^{2} - \log (2\pi ))}{2} \end{bmatrix}^{\textrm{T}}, \end{aligned}$$
(27)
$$\begin{aligned} \Psi ^{'}_{\phi _{j}}&= \Psi _{\phi _{j}}\frac{\mu ^{'}_{\theta _n,\phi _j}}{\mu _{\theta _n,\phi _j}}. \end{aligned}$$
(28)
Lastly, we consider the parent-to-child message \(\mu ^{'}_{\phi _1,\theta _0}\) running from the \(\phi _1(b_0,b_4,b_9,b_{11})\) cluster to the \(\theta _0(x_0\mid b_0,\gamma )\) cluster. From the VMP viewpoint, our message \(\mu ^{'}_{\phi _1,\theta _0}\) simply is the expected values:
$$\begin{aligned} \mu ^{'}_{\phi _1,\theta _0}&= \left\langle \llbracket b_0=i\rrbracket \right\rangle _{\Psi _{\phi _1}(b_0,b_4,b_9,b_{11})}{} & {} {\text {with }} i \in \{0,1\}. \end{aligned}$$
The first step is to find the distribution over \(b_0\) by marginalising:
$$\begin{aligned} \psi _{\theta _0,\phi _1}(b_0)&= \sum _{b_4,b_9,b_{11}} \Psi _{\phi _1}(b_0,b_4,b_9,b_{11}). \end{aligned}$$
(29)
The required expectations are the probabilities for \(b_0\) as found in this sepset belief—we simply install them as the required parameters in the \(\theta _0\) cluster.
In general, where we have a message describing \(b_n\) running from the parent cluster \(\phi _j\) to a child cluster \(\theta _n\), we first find the sepset belief:
$$\begin{aligned} \psi ^{'}_{\theta _n,\phi _j}(b_n)&= \sum _{\backslash b_n} \Psi _{\phi _j}, \end{aligned}$$
(30)
and then replace the two probabilities for \(b_n\) found in this sepset belief as the required expected sufficient statistics \(\left\langle \llbracket b_n=i\rrbracket \right\rangle _{\Psi _{\phi _j}}\) in the \(\theta _n\) cluster:
$$\begin{aligned} \Psi ^{'}_{\theta _n}&= \begin{bmatrix} \sum _{i} \left\langle \llbracket b_n=i \rrbracket \right\rangle _{\Psi _{\phi _{j}}} (\left\langle \gamma \right\rangle \mu _{i}) \\ \sum _{i} \left\langle \llbracket b_n=i \rrbracket \right\rangle _{\Psi _{\phi _{j}}} (\frac{-\left\langle \gamma \right\rangle }{2}) \end{bmatrix}^{\textrm{T}} \begin{bmatrix} x_{n} \\ x^{2}_{n} \end{bmatrix} \nonumber \\&+ \frac{1}{2}\sum _{i} \left\langle \llbracket b_n=i \rrbracket \right\rangle _{\Psi _{\phi _{j}}} (\left\langle \log (\gamma )\right\rangle - \left\langle \gamma \right\rangle \mu ^{2}_{i} - \log (2\pi )). \end{aligned}$$
(31)
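As a sketch of the child-to-parent side of this hybrid scheme (Eq. 27), with our own naming and NumPy assumed:

```python
# Hybrid child-to-parent message from theta_n to phi_j (Eq. 27): the
# unnormalised log-heights of the two Gaussians at the observed x_n.
import numpy as np

def theta_to_parity_message(x_n, e_gamma, e_log_gamma, mu=(-1.0, 1.0)):
    """Return [log(pi_0), log(pi_1)] for bit b_n, per Eq. 27."""
    return np.array([
        e_gamma * m * x_n - e_gamma * x_n**2 / 2.0
        + (e_log_gamma - e_gamma * m**2 - np.log(2.0 * np.pi)) / 2.0
        for m in mu
    ])

# Per Eq. 28, the parity cluster belief is then updated by multiplying in
# exp(new_message - old_message) along the b_n axis of its table.
```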

2.3.4 Message passing schedule

A message passing schedule is important for loopy graphs, as the message order can influence convergence speed, accuracy, and the computational cost of inference. Although not empirically verified here, feedback loops (or cycles) in the graph may reinforce inaccurate cluster beliefs, causing self-fulfilling belief updates that affect the LDPC decoder’s performance. This problem is more prominent in LDPC codes with small feedback loops as described in [5]. Taking this into consideration, our message passing schedule (1) uses a structured schedule with a fixed computational cost, (2) aims to minimise the effect of loops, and (3) aims to minimise the computational cost of inference.
The message passing schedule is determined by first identifying the larger parity-check clusters in the graph. We select the larger clusters \(\phi _0\) (with cardinality 7) and \(\phi _3\) (with cardinality 6). The message schedule starts with the selected clusters as initial sources \(\mathcal {S}\) and proceeds by visiting all their neighbouring clusters \(\mathcal {N}\), which become the immediate next layer of clusters. A set of available clusters \(\mathcal {A}\) is kept to make sure that clusters from previous layers are not revisited, which helps minimise the effect of loops. We repeat this procedure to add subsequent layers of clusters until all clusters are included. The source-destination pairs are stored in a message schedule \(\mathcal {M}\). This procedure isolates the initially selected large parity-check clusters from the rest of the clusters as shown in Fig. 1. The idea is to keep the expensive clusters at the final layer so that the smaller (less expensive) parity clusters, in preceding layers, can resolve most of the uncertainty about the even parity states. When the larger parity clusters get updated, some of the even parity states in their discrete tables may have zero probability, which our software implementation removes. This further reduces a large parity cluster’s computational footprint. Our layered message passing schedule is detailed in Algorithm 1.
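The following is a minimal sketch of this layered construction as we describe it above (Algorithm 1 itself is not reproduced here; the function name and data structures are ours):

```python
# Layered message schedule: breadth-first layers growing outward from the
# large parity-check clusters, never revisiting earlier layers.
def layered_schedule(neighbours, large_clusters):
    """neighbours: dict mapping each cluster to its set of adjacent clusters."""
    frontier = set(large_clusters)            # initial sources S
    available = set(neighbours) - frontier    # available clusters A
    schedule = []                             # message schedule M
    while available:
        next_layer = set()
        for src in frontier:
            for dst in neighbours[src] & available:
                schedule.append((src, dst))   # store source-destination pair
                next_layer.add(dst)
        if not next_layer:                    # disconnected remainder, if any
            break
        available -= next_layer
        frontier = next_layer
    return schedule
```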
The observed conditional Gaussian clusters are coupled to the parity-check clusters in the layer furthest away from the initial isolated group of large clusters. We refer to this layer as the first parity-check layer. The smaller parity-check clusters in this layer are given priority in terms of their connectivity to the conditional Gaussian clusters, which saves computation. If the first layer of parity-check clusters does not have all the unique bits, the following layers are utilised until all conditional Gaussian clusters are connected.
The message passing order for the entire PGM starts at the gamma cluster and flows down to the conditional Gaussian clusters through all the parity cluster layers until it reaches the bottom layer. We refer to this as the forward sweep. The backward sweep returns in the opposite direction, which concludes one iteration of message passing. Pseudo-code for our message passing is shown in Algorithm 2.
The following settings and software implementations apply to our inference approach:
  • a cluster is deactivated during message passing when messages entering it have not changed significantly, as determined by a symmetrical Kullback–Leibler divergence measure between the newest and immediately preceding sepset beliefs (a minimal sketch of this test follows the list),
  • the stopping criterion for inference is when a valid codeword is detected (also known as a syndrome check) after all parity-check clusters “agree” on their shared bit values, or when a maximum number of iterations is reached,
  • all discrete table factors support sparse representations to reduce memory resources,
  • zero probability states in discrete tables are removed during inference.
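As an illustration of the deactivation test in the first point above, a minimal sketch assuming normalised discrete sepset beliefs and a tolerance of our own choosing:

```python
# Symmetrised Kullback-Leibler divergence between successive sepset beliefs,
# used to switch off clusters whose incoming messages have stopped changing.
import numpy as np

def symmetric_kl(p, q, eps=1e-12):
    """0.5*KL(p||q) + 0.5*KL(q||p) for discrete beliefs."""
    p = np.clip(np.asarray(p, float), eps, None); p /= p.sum()
    q = np.clip(np.asarray(q, float), eps, None); q /= q.sum()
    return 0.5 * np.sum(p * np.log(p / q)) + 0.5 * np.sum(q * np.log(q / p))

def cluster_active(new_beliefs, old_beliefs, tol=1e-4):
    """Keep a cluster active while any incoming sepset belief still moves."""
    return any(symmetric_kl(p, q) > tol for p, q in zip(new_beliefs, old_beliefs))
```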
The next section describes how the LDPC code tracks non-stationary SNRs using a Bayesian sequential learning technique.

3 Bayesian sequential learning for non-stationary SNR estimation

Channel noise in a wireless communication system can change over time, especially when the end user is mobile. This means the statistical properties of the received signal should be treated as non-stationary. We adapt Eq. 1 to define the following input–output relationship for the non-stationary case:
$$\begin{aligned} X_t = B_t + W_t, \end{aligned}$$
(32)
where \(X_t\) is a random variable representing the received signal at the receiver, \(B_t\) is a random variable representing the transmit signal at the transmitter, and \(W_t \sim \mathcal {N}(0, \gamma _t)\) is the added Gaussian distributed noise. In this case, the channel noise precision \(\gamma _{t}\) is not constant for all time t.
For the PGM to remain representative of the varying channel conditions, the parameters of the gamma distribution need to adapt accordingly as LDPC packets arrive at the receiver. This can be achieved with Bayesian sequential learning, also known as Bayesian sequential filtering. Each time an LDPC packet is decoded, the parameters of the obtained posterior gamma distribution are stored and used as the prior distribution when decoding the next LDPC packet. However, as more LDPC packets are decoded, our stationarity assumptions cause an ever-increasing certainty around the gamma distribution’s mean and the PGM struggles to respond accurately to the changing noise. Tracking the noise becomes restricted due to the strong underlying i.i.d. assumption across all noise estimates (from the observed data), which will follow the average channel noise instead of the evolving channel noise.
To remedy this, we introduce a time constant T that represents a period of time over which we assume the stationarity assumption holds. For example, if we assume no drastic changes in channel noise over a period of T seconds, our stationarity assumption should be valid for \(S = T \times 1000\) LDPC packets (assuming a transmission time of 1 ms per LDPC packet).
We “relax” the i.i.d. assumption by accumulating the posterior gamma distribution’s parameters as we normally would, but only up to S LDPC packets. After S packets are decoded, the posterior gamma distribution’s parameters (the natural parameter vector) are continually re-scaled with a scaling factor \(\frac{S\times N}{\frac{\nu }{2} - 1}\) as follows:
$$\begin{aligned} \Psi ^{'}_{\zeta } = {\left\{ \begin{array}{ll} \begin{bmatrix} -\frac{1}{2\omega } \\ \frac{\nu }{2} - 1 \end{bmatrix}^{\textrm{T}} , &{} \hspace{8mm} \frac{\nu }{2} - 1 \le S\times N \\ \begin{bmatrix} -\frac{1}{2\omega } \\ \frac{\nu }{2} - 1 \end{bmatrix}^{\textrm{T}} \frac{S\times N}{\frac{\nu }{2} - 1},&\hspace{8mm} \frac{\nu }{2} - 1 > S\times N. \end{array}\right. } \end{aligned}$$
(33)
By scaling the gamma distribution’s parameters, the number of observations \(\nu \) that make up the posterior distribution reaches a ceiling value, which is not equal to the theoretical maximum of the full i.i.d. assumption. This allows for variance around the gamma distribution’s mean so that the estimate does not become too “confident” in the data, which makes the gamma distribution responsive to the evolving channel noise. Scaling the natural parameter vector in this way makes contributions from previous estimates progressively less important compared to more recent estimates—allowing the PGM to “forget” historic channel noise. Note that this is not a windowing approach or a Kalman filter type approach where noise is added between estimations. The posterior gamma distribution remains informed by the entire sequence of received data, but the contributions from past data decay exponentially as packets are received and decoded.
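A minimal sketch of this forgetting step (Eq. 33), with variable names of our own choosing:

```python
# Rescale the gamma cluster's natural parameters once the accumulated
# evidence nu/2 - 1 exceeds the ceiling S*N (Eq. 33). Scaling both natural
# parameters by the same factor approximately preserves the posterior mean
# (nu * omega) while capping the effective number of observations.
import numpy as np

def rescale_gamma(eta, S, N):
    """eta = [-1/(2*omega), nu/2 - 1]; S packets of N bits set the ceiling."""
    ceiling = S * N
    evidence = eta[1]                 # nu/2 - 1
    if evidence > ceiling:
        eta = eta * (ceiling / evidence)
    return eta

# E.g. with S = 10 packets of N = 220 bits, the posterior never "remembers"
# more than 2200 effective observations.
print(rescale_gamma(np.array([-0.05, 5000.0]), S=10, N=220))
```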

4 Experimental investigation

This section describes the results obtained from stationary noise and non-stationary noise experiments.
The LDPC code used is constructed from the 5G new radio (NR) standard [27]. We use base graph 2 with size (42, 52) and expansion factor 11. The resultant \(\textbf{H}\) matrix is shortened to a codeword length of \(N=220\) bits, of which \(K=110\) are message bits (producing a code rate \(R=0.5\)). A cluster graph is compiled from the parity-check factors using the LTRIP algorithm, and the message schedule is initialised by clusters with cardinality 8 and 10 (the largest factors), which form the bottom layer of the PGM (see Sect. 2.3.4). We use BPSK modulation over an AWGN channel and assume a transmitted bit has unit energy.

4.1 Purpose of stationary noise experiment

The stationary noise experiment is presented as a BER vs SNR curve as typically would be the case for determining the behaviour and performance of error correction codes over a range of SNR values. The purpose of the experiment is to compare the BER performance between our proposed PGM and a PGM with perfect knowledge of the noise across a range of SNR values. The selected SNR range consists of 6 equidistant points from 0 to 4.45 dB (inclusive). A random bit message is encoded and Gaussian noise is added to each bit, which repeats for 50k packets per SNR value. The same received bit values are presented to both PGMs to ensure a like-for-like comparison. We set the maximum number of message passing iterations to 20 for both PGMs. Our channel estimation PGM is parameterised with \(S = 10\), which assumes stationarity over 10 LDPC packets (equivalent to a period of 10ms).

4.1.1 Results and interpretation

The results of the stationary noise experiment are shown in Fig. 6. The BER performance (shown on the left) of our proposed PGM closely follows the performance of a PGM with perfect knowledge of the noise precision (the error bars reflect the 95% confidence intervals). Similarly, the number of iterations required to reach convergence (shown on the right) is almost identical between the two PGMs.

4.2 Purpose of non-stationary noise experiment

The purpose of the experiment is to test whether our proposed PGM is capable of tracking non-stationary channel noise, and if it benefits from the estimated channel noise information. We compare three scenarios: (1) a PGM that has perfect (instantaneous) knowledge of the channel noise precision, (2) a PGM using a fixed value for the channel noise precision, and (3) our proposed PGM that sequentially updates its channel noise precision estimation. Note that the assumed channel noise in (2) is the last precision mean obtained by running our proposed PGM over the test data once with \(S = \infty \). While (2) assumes stationarity, its fixed value of the channel was accumulated from the entire test sequence and is advantageous, since it allows the PGM to “anticipate” future values of the channel noise. We set the maximum number of message passing iterations to 20 for all scenarios and parameterise (3) with \(S = 10\), which assumes stationarity over 10 LDPC packets (equivalent to a period of 10ms).

4.2.1 Drive-test data

We test our model using actual drive test measurements obtained from Vodacom Group Limited, a mobile communications company based in South Africa. Drive tests capture key network performance indicators that allow the mobile network operator to troubleshoot and optimise the network. We use the signal-to-interference-plus-noise ratio (SINR) measurements from a 5G device while driving approximately 10 km along a coastal route. The captured data includes handovers between cells as the user equipment travels in and out of reception range. The start and end times of the test along with the SINR measurements and GPS locations are shown in Fig. 7.
The drive-test data had missing measurements and irregular time intervals. The missing values were removed and the sequence was re-indexed to create a time series with regular intervals. It was upsampled using linear interpolation to match a basic transmission time interval of 1ms. Note that our final test data may reflect more extreme channel variations due to some discontinuities introduced as a consequence of the described data cleaning. We generate random bit messages that are encoded using the \(\textbf{H}\) matrix to produce LDPC packets (or codewords). Bit values in an LDPC packet are modulated using BPSK modulation, and random Gaussian noise is added to the resultant signal values using the SINR drive-test data. The SINR data are converted from their logarithmic form to a linear form to obtain precision values. The precision values are used to add zero-mean Gaussian noise to the transmitted signal, which produces the received signal. Note that the same precision value is used to add noise to all signal values from the same LDPC packet.
The dataset used during the current study is not publicly available due to the organisation’s data privacy policy, but can be made available from the corresponding author on reasonable request.

4.2.2 Results and interpretation

As stated earlier, we use the Gaussian distribution’s precision to model the channel noise. Results from our experiment are shown in Fig. 8. The gamma posterior distribution is capable of tracking the actual noise precision (shown by the mean and one standard deviation in the top part of the figure). The BER shown in the bottom part of the figure is a moving average over 10,000 packets centred around the current point in time; this helps to reveal the underlying trend. We use Dirichlet smoothing to avoid reporting BER values in the range \(0< {\text {BER}} < 9\times 10^{-7}\), which lies below the smallest non-zero BER measurable over such a window. The BER shown in our results is calculated using: \( {\text {BER}} = \frac{1}{10,000}\sum \frac{(a + {\text {bit errors in a packet}})}{(K + a)}\), where \(a=0.005\) and K is the number of message bits.
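For concreteness, this smoothing can be sketched as follows (the window, a, and K match the text; the function name is ours):

```python
# Dirichlet-smoothed BER, averaged over a centred window of 10,000 packets.
import numpy as np

def smoothed_ber(bit_errors_per_packet, K=110, a=0.005, window=10_000):
    """bit_errors_per_packet: one integer count per decoded packet."""
    per_packet = (a + np.asarray(bit_errors_per_packet, float)) / (K + a)
    kernel = np.ones(window) / window
    return np.convolve(per_packet, kernel, mode="same")  # centred moving average
```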
We observe an improvement in the BER compared to a PGM using a fixed value of the average noise precision (a precision of 8.76, or 9.42 dB SNR), as shown in the bottom part of the figure.
The BER performance of our proposed PGM closely follows the BER performance of the PGM with perfect knowledge of the noise precision. This is due to the small difference between the actual noise precision and the estimated noise precision of our proposed PGM.
In some experiments, we observed instances where the estimated precision was much lower than the actual precision. These were instances where the LDPC code failed to fix all bit errors, which the PGM then interpreted as worse than actual channel conditions. Another observation is that the gamma posterior has a wider standard deviation at higher precision means, and a narrower standard deviation at lower precision means. This is due to the characteristics of the gamma distribution. We view this as a beneficial artefact, since the PGM makes provision for more rapid changes in the channel conditions at lower channel noise. (This can also be regarded as a “sceptical” type of behaviour.) When channel conditions are bad, the PGM’s channel estimation more confidently reports that they are bad.
A summary of the overall results appears in Table 1. The average BER of our PGM is slightly higher than the PGM with perfect knowledge of the channel, and outperforms the PGM with a fixed value of the channel (approximately 1.5 times lower BER on average). Our PGM requires approximately 1.02 times fewer message passing iterations on average compared to the PGM with a fixed value of the channel, and the same number of iterations compared to the PGM with perfect knowledge of the channel.
Table 1
A summary of the overall performance comparison. Note that the expected behaviour of our PGM is similar to the PGM with perfect knowledge of the channel noise, and outperforms the PGM with a fixed value of the average channel noise in terms of the BER

                      BER (mean)   Iterations (mean)
True precision        0.003034     2.49
Fixed precision       0.005013     2.54
Estimated precision   0.003336     2.49
Note that the averages presented in this table are more influenced by the high SNR instances where bit errors are infrequent. Figure 8 illustrates the performance advantage of our proposed PGM more clearly where bit errors occur more frequently at lower SNRs.

5 Supplementary investigation

In some instances, the BER performance of our proposed PGM (estimating the SNR) is better than the PGM with knowledge of the actual SNR. We noted this happening when the estimated SNR is lower than the actual SNR, which is counter-intuitive.
A study presented in [8] investigates the effect of channel noise mismatch in LDPC codes. It found that the LDPC code’s performance degrades more quickly when the assumed SNR is overestimated, but is less sensitive to degradation when underestimated up to around 1 dB below the actual SNR. What is also interesting is that the optimal BER is at an assumed SNR lower than the actual SNR.
We reproduce the same experiment to establish whether this behaviour is similar for a cluster graph representation of LDPC codes. Our channel noise mismatch experiment is presented as a BER vs model SNR curve where the model SNR is the assumed channel noise while the actual channel noise is fixed at 1.22 dB (similar to [8]). The purpose of the experiment is to (1) understand the impact on BER performance when the actual channel noise is under- and over-estimated by the model, and (2) determine where the optimal BER is.
A random bit message is encoded, and Gaussian noise with a fixed precision of 1.32 (equivalent to an SNR of 1.22 dB) is added to each bit; this repeats for 10k packets per model-SNR value. The same received bit values are used for each model-SNR value to ensure a like-for-like comparison. We set the maximum number of message-passing iterations to 20. The results of the channel noise mismatch experiment are shown in Fig. 9.
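A minimal sketch of this sweep is given below, assuming BPSK signalling over an AWGN channel. Here ldpc_encode and ldpc_decode are hypothetical stand-ins for the cluster graph code and decoder (not the paper's implementation), and the mismatch enters only through the precision used to form the channel log-likelihood ratios (LLRs):

import numpy as np

rng = np.random.default_rng(42)
actual_precision = 1.32                      # approx. 1.22 dB SNR (see conversion above)
sigma = 1.0 / np.sqrt(actual_precision)
n_packets, k = 10_000, 64                    # k: hypothetical message length

def ldpc_encode(bits):
    # Stand-in for the real LDPC encoder: identity mapping.
    return bits

def ldpc_decode(llr, max_iters=20):
    # Stand-in for the cluster-graph decoder: hard decision on LLR sign.
    return (llr < 0).astype(int)

# Draw messages and noise once, so every model SNR sees identical received
# values, mirroring the like-for-like comparison described above.
messages = rng.integers(0, 2, size=(n_packets, k))
noise = rng.normal(0.0, sigma, size=(n_packets, k))
received = (1 - 2 * ldpc_encode(messages)) + noise   # BPSK: 0 -> +1, 1 -> -1

for model_snr_db in np.arange(0.0, 2.6, 0.2):        # sweep of assumed SNRs
    model_precision = 10.0 ** (model_snr_db / 10.0)  # assumed noise precision
    llr = 2.0 * model_precision * received           # channel LLRs under the model
    decoded = ldpc_decode(llr, max_iters=20)
    ber = np.mean(decoded != messages)
    print(f"model SNR {model_snr_db:4.1f} dB  BER {ber:.6f}")

In a full run, ldpc_decode would be the iterative cluster graph decoder, where the magnitude of the LLRs (and hence the assumed precision) influences message passing and the resulting BER; the hard-decision stand-in above only illustrates the structure of the sweep.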
This result is similar to [8] and suggests that the LDPC code’s optimal BER is not necessarily at the point where the model SNR is equal to the actual SNR, but somewhere slightly below the actual SNR (in this case at 0.84 dB).
A conservative estimate of the channel noise seems to be beneficial, since the LDPC decoder is more forgiving in the region below the actual noise. Using the drive-test data, we run the same experiment as described in Sect. 4.2; however, after each packet is decoded, we adjust the estimated precision to 0.1 dB below the posterior gamma mean (a sketch of this adjustment follows). We found that any larger adjustment degraded performance. The result of this adjustment is shown in Table 2.
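For concreteness, the adjustment amounts to shrinking the posterior-mean precision by a constant factor. A minimal sketch, assuming the gamma posterior is kept in shape/rate form and that SNR in dB is \(10\log_{10}\) of the precision (our reading of the quoted figures, not a formula stated here):

import math

def conservative_precision(alpha, beta, offset_db=0.1):
    """Shift the posterior-mean precision down by offset_db decibels."""
    mean_precision = alpha / beta                  # gamma posterior mean (shape/rate form)
    snr_db = 10.0 * math.log10(mean_precision)     # convert to dB
    return 10.0 ** ((snr_db - offset_db) / 10.0)   # shift down, convert back

# Because the shift is constant in dB, it is a fixed multiplicative factor:
# 10 ** (-0.1 / 10) is roughly 0.977, so the adjusted precision is ~97.7% of the mean.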
A slight improvement in the BER can be seen; however, there is a trade-off between BER performance and the number of iterations. The decoder requires slightly more iterations on average when the adjustment to the posterior gamma mean is made.
We emphasise that the purpose of the supplementary investigation is not to propose that an adjustment be made to the estimated SNR in practice, but rather to show the counter-intuitive behaviour of the LDPC code that we found interesting. The benefit of adjusting the estimated SNR seems minimal.
In our study, we simply selected the average SNR as the static value, knowing that the true SNR would at times be better and at times worse. We refrain from making any assertions about the ideal static SNR value, recognising that it can vary significantly with circumstances. Instead, our approach is to adapt dynamically to the real-time SNR conditions prevailing at any given moment.

6 Conclusion and future work

This paper contributes a simple sequential Bayesian learning method for tracking non-stationary channel noise using LDPC codes. We demonstrated the idea by employing a more general probabilistic framework called cluster graphs, and evaluated our proposed model using real-world data. The results show that the performance of our proposed model is nearly indistinguishable from a model with perfect knowledge of the channel noise.
Apart from the performance advantages shown in our results, the approach embeds well within an LDPC decoder and does not require stand-alone machine learning techniques, pre-training, or management of large data sets. It learns the channel noise accurately on-the-fly while decoding LDPC packets, without compromising decoding performance.
Table 2
A summary of the overall performance comparison between the PGM that estimates the channel noise and the PGM that estimates it more conservatively. Note that the more conservative approach outperforms our initial PGM only in terms of the BER.

                                      BER (mean)   Iterations (mean)
  Estimated precision                 0.003336     2.49
  Estimated precision (\(-\)0.1 dB)   0.003329     2.49
The implications of this method with respect to the communication system are that (1) the scheduling and use of pilot signals dedicated to channel noise estimation may become redundant, (2) LDPC codes coupled with our proposed channel-noise estimator can be used to estimate non-stationary channel noise on-the-fly with no need for scheduling, and (3) channel-noise information is now inherent in the LDPC code and provides performance advantages to the code.
Our current approach relies on parameterising a time constant, which is a prior assumption about the speed at which channel noise will vary. Future work will focus on applying real-time Bayesian model selection, allowing the LDPC decoder to choose intelligently between multiple assumptions (or combine multiple assumptions) about the speed at which the channel varies. Such work relates to Bayesian change-point detection, which accounts for changes in the underlying data-generating process while estimating parameters. While this study focuses on channel-noise estimation alone, the methodology could also be expanded to estimate other channel properties such as phase and complex channel gain, which will depend on the modulation scheme and channel model discussed in Sect. 2.

Acknowledgements

The authors would like to thank Bonnie Xaba for providing the drive-test data. She is a senior specialist in network benchmarking and E2E network performance quality at Vodacom Group Limited in South Africa.

Declarations

Competing interests

The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Footnotes
1. An alternative parameterisation for the gamma distribution is \(\alpha\) (shape) and \(\beta\) (rate), which translates to \(\nu = 2\alpha\) and \(w = \frac{1}{2\beta}\).
2. From Eq. 19 this is the sepset belief, which we also term the pre-cancellation message.
3. The average divergence from both directions is used, taking into consideration the mode-seeking and moment-seeking behaviours of the metric, given by \(\frac{D_{KL}(P||Q) + D_{KL}(Q||P)}{2}\).
Literature
3.
5. Johnson, S. J. (2006). Introducing low-density parity-check codes. University of Newcastle, Australia, 1, 2006.
8. MacKay, D. J., & Hesketh, C. P. (2003). Performance of low density parity check codes as a function of actual and assumed noise levels. Electronic Notes in Theoretical Computer Science, 74, 89–96.
9. MacKay, D. J. (2003). Information theory, inference and learning algorithms. Cambridge: Cambridge University Press.
10. Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques (1st ed.). London: MIT Press.
12. Jakubisin, D. J., Buehrer, R. M., & da Silva, C. R. (2016). Probabilistic receiver architecture combining BP, MF, and EP for multi-signal detection. arXiv preprint arXiv:1604.04834.
13. Wu, S., Kuang, L., Ni, Z., Lu, J., Huang, D. D., & Guo, Q. (2014). Expectation propagation approach to joint channel estimation and decoding for OFDM systems. In 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 1941–1945). https://doi.org/10.1109/ICASSP.2014.6853937
21. Bishop, C. (2006). Pattern recognition and machine learning (Vol. 4). London: Springer.
22. Winn, J., & Bishop, C. M. (2005). Variational message passing. Journal of Machine Learning Research, 6, 661–694.
23. Du Toit, J., du Preez, J., & Wolhuter, R. (2022). LDPC codes: Comparing cluster graphs to factor graphs. arXiv preprint arXiv:2204.06350.
24. Taylor, R. M. (2023). Approximate message passing algorithms for latent Dirichlet allocation. PhD thesis, Stellenbosch University.
25. Lauritzen, S. L., & Spiegelhalter, D. J. (1988). Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society: Series B (Methodological), 50(2), 157–194.
27. Bae, J. H., Abotabl, A., Lin, H.-P., Song, K.-B., & Lee, J. (2019). An overview of channel coding for 5G NR cellular communications. APSIPA Transactions on Signal and Information Processing, 8, e17.