## 1 Introduction

Advanced wireless communication systems use sophisticated digital modulation schemes as well as space–time diversity in order to provide high data rates. The transmission quality of these systems is assessed through performance evaluation, using metrics such as the bit error probability (BEP), the block error probability (BLEP), or throughput. However, unified analytical expressions of these metrics are not available for several digital communication systems. The common workaround is Monte Carlo simulation, in which one simulates the transmitter, the transmission channel, and the receiver. Unfortunately, for complex systems this method becomes prohibitive in terms of computation time, and it requires a very large number of transmitted samples to estimate very low error probabilities. As a solution, semi-analytical performance prediction (SPP) has been proposed in recent years and has been the subject of numerous studies. In [1], the authors proposed the importance sampling (IS) method for BER prediction. For simple memoryless systems (e.g., a BPSK modem [2]), the efficiency of the IS technique is high and its implementation is relatively easy. However, its accuracy can be severely degraded, especially when a complex system receiver is used. For this reason, Abdi et al. proposed in [3] a low-complexity prediction technique for turbo-like codes, based on estimating the probability density function (pdf) of the log-likelihood ratio (LLR) at the output of the decoder using a normal density as a reference. Nevertheless, it does not reduce the complexity of the iterative decoding algorithm. In [4], the authors derived a semi-analytical expression of the bit error probability using a non-parametric estimation of the probability density of the observed samples.
It has been shown that the accuracy of the pdf estimator is sensitive to the choice of the smoothing parameter. The method we proposed in [5] estimates the pdf with the kernel estimator [6], which uses an efficient technique for selecting the smoothing parameter. In [7], we compared several methods for choosing the optimum smoothing parameter. The first is the minimum integrated squared error (MISE) [8], which exhibited a significant squared error between the true pdf and the estimated one. In the second method, the smoothing parameter is estimated using a cross-validation (CV) method [9, 10]. Simulation studies concluded that cross-validation outperforms the other method in terms of squared error. Nevertheless, this technique can lead to an inconsistent estimator and requires considerable computing time.

In this paper, we propose a new semi-analytical approach based on Fourier transform inversion to derive a semi-analytical expression of the error probability. In this method, the probability density of the decision variable at the matched filter output is estimated from the characteristic function via Fourier transform inversion, exploiting the fact that the characteristic function is the Fourier transform of the probability density function. In addition, the Fourier integrals can be numerically evaluated by the fast Fourier transform (FFT) algorithm. Furthermore, in order to control the behavior of the probability density estimator, we apply a bootstrap method for selecting the optimum smoothing parameter. Thanks to the efficiency of the bootstrap approach, this yields an accurate semi-analytical error probability.


The remainder of the paper is organized as follows. In Section 2, we describe the system model considered in this work. In Section 3, a new semi-analytical expression of the error probability is derived using the Fourier inversion approach. Methods for selecting the smoothing parameter are given in Section 4. Simulations and numerical results are given in Section 5. Concluding remarks are made in Section 6.

## 2 System model

The digital communication system considered in this work is shown in Fig. 1. It consists of a transmitter, a transmission channel, and a receiver. At the transmitter end, a digital source delivers a bit stream represented by binary sequences denoted by \(\mathbf{b}=[b_{1},b_{2},\ldots,b_{L}]\), each of length L. The sequences of bits are then passed to a digital modulation scheme which converts them into sequences of symbols, each of length M, whose elements take values in the constellation set Ω. The digital modulation can perform binary phase-shift keying (BPSK), quadrature phase-shift keying (QPSK), or higher-order modulation such as 16-quadrature amplitude modulation (QAM) and 64-QAM. Other techniques, such as single carrier frequency division multiple access (SC-FDMA) or orthogonal frequency division multiplexing (OFDM), can be included in the transmitter to improve the system reliability. After bit-to-symbol mapping, the modulation transforms the symbol stream into an analog signal suitable to be sent through the transmission channel, which can degrade the signal quality.

At the receiver, the channel output is passed to a matched filter to reduce the noise effect. After that, the demodulation is performed for symbol-to-bit conversion. Finally, the receiver makes a decision to detect the information bits.

## 3 Semi-analytical error probability derivation

### 3.1 Bit error probability definition

The receiver observes a set of N samples \(C=\{x_{1},x_{2},\ldots,x_{N}\}\) at the output of the matched filter and makes a decision to estimate the information bits. Due to the channel effect, this decision can be erroneous, so it is important to measure the communication system efficiency in terms of the bit error probability (BEP). According to the system model presented in Fig. 1, this bit error probability is defined as the conditional probability that the receiver makes a wrong decision on a transmitted information bit. Assuming that the ith bit is transmitted, the error probability is expressed as follows:$$ \begin{aligned} P_{b} &= Pr\left[\text{Error} ~|~ b_{i}~\text{sent}\right] \\ &= Pr\left[\widetilde{b}_{i} \neq b_{i} ~|~ b_{i}~\text{sent}\right], \end{aligned} $$

(1)


Let X be the random variable whose realizations are the observed samples at the matched filter output, and define the decision region associated with the information bit \(b_{i}\) as$$ \begin{aligned} Z_{i}=\left\{ X \in \mathbb{R};~Pr\left[\widetilde{b}_{i} = b_{i}~|~ X\right] > Pr\left[\widetilde{b}_{i} \neq b_{i} ~|~ X\right] \right\}. \end{aligned} $$

(2)

where \(\widetilde {b}_{i}\) is the estimate of the ith information bit at the receiver end. The probability of error on the bit \(b_{i}\) defined in (1) is then re-expressed as$$ \begin{aligned} P_{b} &= Pr\left[X \notin Z_{i} ~|~ b_{i}~\text{sent}\right], \end{aligned} $$

(3)

We can express this error probability in terms of the probability density of X to get

$$ \begin{aligned} P_{b} &= \int_{X \notin Z_{i}}^{} f_{X}(x ~|~ b_{i}~\text{sent})\,dx, \end{aligned} $$

(4)

To obtain the expression of the average bit error probability \(P_{e}\), we divide the set of observed samples C into two subsets \(C_{0}\) and \(C_{1}\). The first subset contains \(N_{0}\) observed samples corresponding to the transmission of \(b_{i}=0\); the second consists of \(N_{1}\) observed samples when the bit \(b_{i}=1\) is transmitted. In this manner, the probability density function of X can be viewed as a mixture of two probability densities \(f_{X}^{(1)}(x)\) and \(f_{X}^{(0)}(x)\) of the observed samples corresponding to the transmitted information bits \(b_{i}=1\) and \(b_{i}=0\), respectively. Then, the average bit error probability is written as$$ {\fontsize{9.1}{6}\begin{aligned} P_{e} &= P_{1}.Pr\left[X\notin Z_{1} | b_{i}=1\right] +P_{0}. Pr\left[X\notin Z_{0} | b_{i}=0\right] \\ &= P_{1}. \int_{-\infty}^{0} f_{X}^{(1)}\left(x | b_{i}=1\right)dx + P_{0}. \int_{0}^{+\infty} f_{X}^{(0)}\left(x | b_{i}=0\right)\,dx. \end{aligned}} $$

(5)

where \(P_{k} = \frac {N_{k}}{N}\), k=0,1, is the probability that \(b_{i}=k\) is transmitted. For equally likely transmitted information bits, the average BEP is finally given by

$$\begin{array}{@{}rcl@{}} P_{e} &=& \int_{0}^{+\infty} f_{X}^{(0)}\left(x | b_{i}=0\right)\,dx \\ &=& \int_{-\infty}^{0} f_{X}^{(1)}\left(x | b_{i}=1\right)\,dx. \end{array} $$

(6)

Accordingly, for predicting the error probability \(P_{e}\), one has to estimate the probability densities \(f_{X}^{(1)}(x)\) and \(f_{X}^{(0)}(x)\). In this paper, we focus on the Fourier inversion approach and its use for estimating the error probability.

### 3.2 Probability density function estimation

Various techniques for estimating the probability density function have been developed in the literature, the best known being the kernel estimator [11]. The approach we propose in this paper is based on the fact that the pdf can be recovered from the characteristic function of a random variable X via Fourier transform inversion. This is expressed as follows:

$$\begin{array}{@{}rcl@{}} \widetilde{f}(x) &=& \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-jtx} {\varphi}_{X}(t)\,dt, \end{array} $$

(7)

where \(\varphi_{X}\) is the characteristic function of the random variable X, defined as$$\begin{array}{@{}rcl@{}} {\varphi}_{X}(t) &=& \mathbb{E}\left[e^{jtX}\right] \\ &=& \int_{-\infty}^{+\infty} e^{jtx} f_{X}(x)\,dx, \end{array} $$

(8)

Given N observed samples \(\{x_{1},x_{2},\ldots,x_{N}\}\), the expectation in (8) can be approximated by a finite sum. Hence, the characteristic function \(\varphi_{X}\) can be estimated as$$\begin{array}{@{}rcl@{}} \widetilde{\varphi}_{X}(t) &=& \frac{1}{N} \sum\limits_{i=1}^{N} e^{{jtx}_{i}}, \end{array} $$

(9)

Consequently, the probability density function can be estimated according to (7) by using the approximation of \(\varphi_{X}(t)\) given in (9). However, the Fourier integral in (7) can diverge for large values of the variable t. To overcome this limitation, the characteristic function estimator \(\widetilde {\varphi }_{X}(t)\) is multiplied by a damping function \(\psi_{h}(t)=\psi(ht)\) to control the smoothness of the estimated probability density function. Therefore, the characteristic function expression becomes

$$\begin{array}{@{}rcl@{}} \widetilde{\varphi}_{X}(t) &=& \frac{1}{N} \sum\limits_{i=1}^{N} e^{{jtx}_{i}} \psi_{h}(t). \end{array} $$

(10)

where h is a smoothing parameter.

It follows that the estimated probability density function is given by (see proof in Appendix A)

$$\begin{array}{@{}rcl@{}} \widetilde{f}(x;h) &=& \frac{1}{Nh} \sum\limits_{i=1}^{N} v\left(\frac{x-x_{i}}{h}\right), \end{array} $$

(11)

where

$$\begin{array}{@{}rcl@{}} v(x) &=& \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-jtx}\psi(t)\,dt. \end{array} $$

(12)

The most common choice for the damping function ψ(t) is the Gaussian function \(\psi (t)= e^{-\pi t^{2}}\). The semi-analytical probability density function is then given by

$$\begin{array}{@{}rcl@{}} \widetilde{f}(x;h) &=& \frac{1}{Nh} \sum\limits_{i=1}^{N} \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-jt\frac{(x-x_{i})}{h} }e^{-\pi t^{2}}\,dt \\ &=& \frac{1}{Nh} \sum\limits_{i=1}^{N} \frac{1}{2\pi} e^{-\left(\frac{x-x_{i}}{2\sqrt{\pi}h}\right)^{2}}. \end{array} $$

(13)
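As a minimal numerical sketch of (13) (assuming a Python/NumPy environment; the function name and the grid-based evaluation are ours, not from the paper), the estimator reduces to a kernel-type sum whose kernel \(v(u)=e^{-(u/2\sqrt{\pi})^{2}}/2\pi\) integrates to one:

```python
import numpy as np

def pdf_estimate(x_grid, samples, h):
    """Estimate f(x) on a grid via Eq. (13): Fourier inversion with a
    Gaussian damping function, written as a kernel-type sum."""
    x_grid = np.asarray(x_grid, dtype=float)
    samples = np.asarray(samples, dtype=float)
    # pairwise scaled differences (x - x_i) / h
    u = (x_grid[:, None] - samples[None, :]) / h
    # kernel v(u) = exp(-(u / (2 sqrt(pi)))^2) / (2 pi), which integrates to 1
    return np.exp(-(u / (2.0 * np.sqrt(np.pi))) ** 2).sum(axis=1) / (
        2.0 * np.pi * len(samples) * h
    )
```

Note that the effective width of this kernel is \(\sqrt{2\pi}\,h\), so h plays the same role as a kernel bandwidth.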

After replacing the probability densities using the estimation above, and evaluating the integral in (5), the semi-analytical bit error probability can be finally re-expressed as (see proof in Appendix B)

$$\begin{array}{@{}rcl@{}} {}P_{e} = \frac{P_{1}}{N_{1}} \sum\limits_{i=1}^{N_{1}} Q\left(\frac{(x_{i})_{1}}{\sqrt{2\pi}h_{1}}\right) + \frac{P_{0}}{N_{0}} \sum\limits_{i=1}^{N_{0}} Q\left(\frac{-(x_{i})_{0}}{\sqrt{2\pi}h_{0}}\right). \end{array} $$

(14)

where \((x_{i})_{0}\) and \((x_{i})_{1}\) are the observed samples corresponding to the transmitted bits \(b_{i}=0\) and \(b_{i}=1\), respectively, and \(h_{1}\) (respectively, \(h_{0}\)) is the smoothing parameter, which depends on the number of observed samples \(N_{1}\) (respectively, \(N_{0}\)). Q(·) denotes the complementary unit cumulative Gaussian distribution, that is$$\begin{array}{@{}rcl@{}} Q(x)=\frac{1}{\sqrt{2\pi}} \int_{x}^{+\infty} e^{-t^{2}/2}\,dt. \end{array} $$

(15)

From (14), it is clear that the accuracy of bit error probability estimation depends on the choice of the optimal smoothing parameter.
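The estimator (14) translates directly into code. The following is a sketch under our own naming (NumPy and the standard library only; a zero decision threshold with bit 1 mapped to the positive side, as in (5), is assumed), with Q implemented via the complementary error function:

```python
import numpy as np
from math import erfc, sqrt, pi

def Q(x):
    # Complementary Gaussian CDF, Eq. (15): Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * erfc(x / sqrt(2.0))

def semi_analytical_bep(x1, x0, h1, h0):
    """Eq. (14): semi-analytical BEP from the two sample subsets.

    x1, x0: matched-filter samples observed when b_i = 1 and b_i = 0.
    h1, h0: smoothing parameters for each subset.
    """
    n1, n0 = len(x1), len(x0)
    p1, p0 = n1 / (n1 + n0), n0 / (n1 + n0)
    s1 = sum(Q(x / (sqrt(2.0 * pi) * h1)) for x in x1)
    s0 = sum(Q(-x / (sqrt(2.0 * pi) * h0)) for x in x0)
    return p1 * s1 / n1 + p0 * s0 / n0
```

Each sample thus contributes a smooth "soft error count" Q(±x/(√(2π)h)) instead of the hard 0/1 count of Monte Carlo simulation, which is where the variance reduction comes from.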

## 4 Smoothing parameter selection

As already mentioned, the important task in semi-analytical BEP derivation is the selection of the smoothing parameter which impacts the precision of the estimator given in (14). The optimal smoothing parameter is defined to be the value of h that minimizes the error between the estimated pdf and the true pdf. The most common metric to represent this error is the mean integrated squared error (MISE) which is expressed as [12]

$$\begin{array}{@{}rcl@{}} \text{MISE}(h)=\mathbb{E}\left[\int_{-\infty}^{+ \infty} \left[\,\widetilde{f}(x;h) - f(x) \right]^{2} \,dx\right]. \end{array} $$

(16)

The optimal smoothing parameter is selected so that it minimizes MISE with respect to h:

$$\begin{array}{@{}rcl@{}} h_{\text{opt}} = \operatorname*{arg\,min}_{h}(\text{MISE}(h)), \end{array} $$

(17)

Based on the kernel estimator [11], the smoothing parameter is calculated as (see Appendix C)

$$\begin{array}{@{}rcl@{}} h_{\text{opt}}&=&\left(\frac{R(K)}{{\mu_{2}}^{2}(K)R\left(f^{\prime\prime}\right)}\right)^{1/5}.N^{-1/5}, \end{array} $$

(18)

where \(R(g)= \int _{}^{} g^{2}(u) \, du\), \(\mu _{k}(g)= \int u^{k} g(u) \, du\), and K(·) represents the kernel function. At this stage, \(h_{\text{opt}}\) cannot be computed directly since it depends on the unknown quantity \(R(f^{\prime\prime})\). To solve this problem, several MISE-based methods have been suggested in the literature. Hereafter, we detail the most popular ones.

### 4.1 Rule-of-thumb method

The idea of the rule-of-thumb method [13] is to replace the unknown probability density f in (18) by a normal distribution with mean μ and variance \(\sigma^{2}\), i.e., \(\mathcal {N}(\mu,\sigma ^{2})\). In this manner, we get$$ \begin{aligned} R\left(\,f^{\prime\prime}\right)= \frac{3}{8 \sqrt{\pi}\,\sigma^{5}}. \end{aligned} $$

(19)

Consequently, a Gaussian kernel function \(K(x)=\frac {1}{\sqrt {2\pi }} e^{-x^{2}/2}\) leads to

$$ \begin{aligned} R(K) = \left(2 \sqrt{\pi}\right)^{-1}~~; ~~{\mu_{2}}^{2}(K)=1, \end{aligned} $$

(20)

It follows that the smoothing parameter is given by

$$ \begin{aligned} h_{\mathrm{opt,ROT}} =(4/3)^{1/5} \sigma N^{-1/5} = 1.06 \sigma N^{-1/5}. \end{aligned} $$

(21)
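The rule of thumb (21) is a one-liner; a sketch (function name ours, NumPy assumed, σ estimated by the sample standard deviation):

```python
import numpy as np

def h_rot(samples):
    # Rule-of-thumb bandwidth, Eq. (21): h = 1.06 * sigma * N^(-1/5)
    x = np.asarray(samples, dtype=float)
    return 1.06 * x.std(ddof=1) * len(x) ** (-0.2)
```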

### 4.2 Cross-validation method

In the cross-validation (CV) method, instead of using a reference probability density, the idea is to estimate the unknown quantity \(R(f^{\prime\prime})\) in the \(h_{\text{opt}}\) formula. Furthermore, the CV approach considers the integrated squared error (ISE) to select the optimal smoothing parameter. This error metric is expressed as [14]$$ \begin{aligned} \text{ISE} &=\int_{-\infty}^{+ \infty} \left[\widetilde{f}_{X}(x;h) - f(x) \right]^{2} \,dx \\ &=\int_{-\infty}^{+ \infty} \widetilde{f}_{X}^{2}(x;h)\,dx- 2 \int_{-\infty}^{+ \infty} \widetilde{f}_{X}(x;h)f(x)\,dx \\ &\quad + \int_{-\infty}^{+ \infty} f^{2}(x)\,dx. \end{aligned} $$

(22)

The third term \(\int _{-\infty }^{+ \infty } f^{2}(x)\,dx\) does not depend on the sample or on the smoothing parameter. The criterion used to estimate h, known as the least squares cross-validation (LSCV) criterion [15], is therefore

$$ \begin{aligned} {}\text{LSCV}(h)=\int_{-\infty}^{+ \infty} \widetilde{f}_{X}^{2}(x;h)\,dx - 2 \int_{-\infty}^{+ \infty} \widetilde{f}_{X}(x;h)f(x)\,dx, \end{aligned} $$

(23)

An approximately unbiased estimator of (23) is given by [16]

$$ \begin{aligned} \text{LSCV}(h)= \int_{-\infty}^{+ \infty} {\widetilde{f}}_{X}^{2}(x,h)\,dx -\frac{2}{N} \sum_{i=1}^{N}~\widetilde{f}_{X,-i}\left(x_{i},h\right). \end{aligned} $$

(24)

where \(\widetilde {f}_{X,-i}(x_{i},h)\), i=1,…,N, is the estimated density using all the original observations except \(x_{i}\). It is well known that LSCV(h) is an unbiased estimator of \(\text {MISE}(h)-\int _{-\infty }^{+ \infty } f^{2}(y)\,dy\):

$$ {\fontsize{9.1}{6}\begin{aligned} \mathbb{E}\left(\text{LSCV}(h)\right) &= \!\mathbb{E}\left[\int_{-\infty}^{+ \infty} \! \left[\,\widetilde{f}(x;h) - f(x) \right]^{2} dx\right] -\! \int_{-\infty}^{+ \infty} \!\!f^{2}\!(x)\,dx \\ &= \text{MISE}(h)-\int_{-\infty}^{+ \infty} f^{2}(y)\,dy, \end{aligned}} $$

(25)

As developed in Appendix C, \(\mathbb {E}(\text {LSCV}(h))\) can be re-written as [17]

$$ {\fontsize{9.1}{6}\begin{aligned} \mathbb{E}(\text{LSCV}(h))= \frac{1}{N h} R(K)+\frac{h^{4}}{4} \mathbf{\mu}_{2}^{2}(K) R\left(\,f^{\prime\prime}\right)-R(f) + O\!\left(N^{-1}\right)\!. \end{aligned}} $$

(26)

A new method called biased cross-validation (BCV) [18, 19] considers only the asymptotic MISE to estimate h:

$$ \begin{aligned} \text{AMISE}= \frac{R(K)}{N h} +\frac{h^{4}}{4} \mathbf{\mu}_{2}^{2}(K) R\left(\,f^{\prime\prime}\right). \end{aligned} $$

(27)

Its idea is to replace the unknown quantity \(R(f^{\prime\prime})\) by the estimator$$\begin{array}{@{}rcl@{}} \widetilde{R\left(\,f^{\prime\prime}\right)}&=& R\left(\,\widetilde{f}_{X}^{\prime\prime}\right)- \frac{1}{N.h^{5}}.R\left(K^{\prime\prime}\right) \\ &=& \frac{1}{N^{2}} \sum_{i \neq j}^{} \sum_{}^{} {K_{h}}^{\prime\prime} \ast {K_{h}}^{\prime\prime} \left(x_{i} - x_{j}\right). \end{array} $$

(28)

where \(\widetilde {f}_{X}^{\prime \prime }\) is the second derivative of the kernel density estimate and \(K_{h}(x)=\frac {1}{h} K\left (\frac {x}{h}\right)\). The operator ∗ indicates the convolution product.

By substituting (28) in (27), the BCV-based method is presented as

$$ \begin{aligned} {}\text{BCV}(h) &= \frac{1}{Nh} R(K)+ \frac{h^{4} \mathbf{\mu}_{2}^{2}(K)}{2N^{2}} \sum_{i \neq j}^{} \sum_{}^{} {K_{h}}^{\prime\prime} \!\ast {K_{h}}^{\prime\prime} (x_{i} \,-\, x_{j}). \end{aligned} $$

(29)

Finally, the smoothing parameter based on the cross-validation method is obtained as

$$ \begin{aligned} h_{\mathrm{opt,CV}}= \operatorname*{arg\,min}_{h}(\text{BCV}(h)). \end{aligned} $$

(30)
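To illustrate the cross-validation idea numerically, the LSCV criterion (24) admits a closed form for a Gaussian kernel, and the bandwidth can then be found by a grid search. This is our own sketch (names and grid are ours; it implements the LSCV variant of (24), not the convolution-based BCV criterion (29) used in (30)):

```python
import numpy as np

def lscv(h, x):
    """LSCV(h) of Eq. (24), closed form for a Gaussian kernel."""
    n = len(x)
    d = (x[:, None] - x[None, :]) / h
    # integral of f~^2: pairwise Gaussian terms with doubled variance
    term = np.exp(-d ** 2 / 4.0).sum() / (n ** 2 * h * np.sqrt(4.0 * np.pi))
    # leave-one-out density values at the sample points (drop the diagonal)
    k = np.exp(-d ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
    np.fill_diagonal(k, 0.0)
    loo = k.sum(axis=1) / ((n - 1) * h)
    return term - 2.0 * loo.mean()

rng = np.random.default_rng(2)
x = rng.normal(size=500)
hs = np.linspace(0.05, 1.0, 96)
h_cv = hs[int(np.argmin([lscv(h, x) for h in hs]))]
```

The grid search over 96 candidate bandwidths already hints at the computational cost noted in the introduction: each evaluation is O(N²).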

### 4.3 Bootstrap method

Bootstrap procedures for selecting the smoothing parameter have been studied in previous works [20–22]. The idea is to estimate the MISE using the bootstrap and then minimize it with respect to h. Let \(\widetilde {f}_{X}(x;g)\) be the estimate of f(x) obtained from \(\{x_{1},\ldots,x_{N}\}\) with a pilot smoothing parameter g. The straightforward approach is to resample \(\left \{x_{1}^{*},\ldots, x_{N}^{*}\right \}\) from \(\widetilde {f}_{X}(x;g)\) and then construct bootstrap estimates \(\widetilde {f}_{X}^{*}(x;h)\) [23]. The bootstrap estimator of the MISE is defined as

$$\begin{array}{@{}rcl@{}} {}\text{MISE}^{*}(h)&=& \mathbb{E}\left[\int_{-\infty}^{+ \infty} \left[\,\widetilde{f}_{X}^{*}(x;h) - \widetilde{f}_{X}(x;g) \right]^{2} \,dx\right], \end{array} $$

(31)

According to (13), \(\widetilde {f}_{X}^{*}(x;h)\) can be replaced by \(\frac {1}{Nh} \sum _{i=1}^{N} \frac {1}{2\pi } e^{-\left (\frac {x-x_{i}^{*}}{2\sqrt {\pi }h}\right)^{2}}\). Then, a Taylor expansion of \(\widetilde {f}_{X}(x;g)\), under the assumption that h→0 as N→∞, leads to an asymptotic approximation to \(\text{MISE}^{*}\) [24]:$$ {\fontsize{9.1}{6}\begin{aligned} \text{MISE}^{*}(h) = \frac{1}{2Nh\sqrt{2 \pi}} \left[2^{1/2}+1-\frac{4}{3^{1/2}}+ (N-1)h(2\pi)^{1/2} \right.\\ \left\{ 4\int_{}^{} h^{4}~\widetilde{f}_{X}^{(4)}(x;g)\,\widetilde{f}_{X}(x;g)\,dx- \frac{9}{2}\int h^{4}~\widetilde{f}_{X}^{(4)}(x;g)\,\widetilde{f}_{X}(x;g)\,dx \right.\\ \left.\left.+\int_{}^{} h^{4}~\widetilde{f}_{X}^{(4)}(x;g)\,\widetilde{f}_{X}(x;g)\,dx\right\}\right] +O\left(h^{6}\right), \end{aligned}} $$

(32)

After simplification, this approximation can be written as [24]:

$$ {\fontsize{9.1}{6}\begin{aligned} \text{MISE}^{*}(h)=\frac{1.074}{2Nh\sqrt{\pi}} + \frac{h^{4}}{4}\int_{}^{} \widetilde{f}_{X}^{(4)}(x;g)\,\widetilde{f}_{X}(x;g)\,dx +O\left(h^{6}\right), \end{aligned}} $$

(33)

Using integration by parts together with the vanishing tails of a density function, \(\int \widetilde{f}_{X}^{(4)}\widetilde{f}_{X}\,dx = \int (\widetilde{f}_{X}^{\prime\prime})^{2}\,dx\), so we re-express \(\text{MISE}^{*}(h)\) as$$ \begin{aligned} {}\text{MISE}^{*}(h)=\frac{1.074}{2Nh\sqrt{\pi}} + \frac{h^{4}}{4}\int_{}^{} \left(\,\widetilde{f}_{X}^{\prime\prime}(x;g)\right)^{2}\,dx +O(h^{6}). \end{aligned} $$

(34)

The optimal smoothing parameter \(h_{\mathrm{opt,boot}}\) is obtained by minimizing \(\text{MISE}^{*}(h)\) with respect to h:$$ \begin{aligned} h_{\mathrm{opt,boot}}=\operatorname*{arg\,min}_{h}\left(\text{MISE}^{*}(h)\right), \end{aligned} $$

(35)

The smoothing parameter \(h_{\mathrm{opt,boot}}\) obtained from (35) is given by (see proof in Appendix D)$$ \begin{aligned} h_{\mathrm{opt,boot}}=\left(\frac{1.074}{2\sqrt{\pi}\int_{}^{}\left(\,\widetilde{f}_{X}^{\prime\prime}(x;g)\right)^{2}\,dx}\right)^{1/5}.N^{-1/5}. \end{aligned} $$

(36)

As can be seen from this equation, the optimal value \(h_{\mathrm{opt,boot}}\) depends on the second derivative of the estimated pdf through \(\int _{}^{}(\,\widetilde {f}_{X}^{\prime \prime }(x;g))^{2}\,dx\), where the pilot smoothing parameter g is selected using the least squares cross-validation method [25]. This parameter is chosen so as to minimize$$ \begin{aligned} \text{LSCV}(g)=\int_{}^{}\left(\,\widetilde{f}_{X}(x;g)\right)^{2}\,dx -\frac{2}{N} \sum_{i=1}^{N}~\widetilde{f}_{X,-i}(x_{i},g). \end{aligned} $$

(37)

where \(\widetilde{f}_{X,-i}(x_{i},g)\) is the density estimate based on all of the data except \(x_{i}\). To justify the choice of the bootstrap method for selecting the optimal smoothing parameter, we present the integrated squared error as a function of the smoothing parameter h. Figure 2 shows the results obtained with the bootstrap, cross-validation, and rule-of-thumb methods. The bootstrap method outperforms the other methods in terms of the integrated squared error between the true probability density and the estimated density.

## 5 Simulations and results

In order to verify the derived semi-analytical expression of the error probability, computer simulations were carried out using the system model presented in Fig. 1. We first validated the probability density estimation using Fourier inversion. We then used it to predict the semi-analytical bit error probability for several transmission scenarios. This probability is compared with the BER evaluated using Monte Carlo simulation, which considers a 95% confidence interval for all scenarios.

To validate the semi-analytical probability density of the received samples, we considered a digital modulation scheme which uses binary phase-shift keying (BPSK) for bit-to-symbol conversion. The symbol stream is then sent through an AWGN channel. At the matched filter output, the receiver observes N=10,000 samples and estimates the probability density using the Fourier inversion method. The obtained probability density is compared to the theoretical density in Fig. 3; the density curve corresponding to the Fourier inversion method is close to the theoretical curve. Moreover, we have evaluated the semi-analytical bit error probability (BEP) using the expression given in (14) as a function of the signal-to-noise ratio (SNR). The results obtained from the semi-analytical method are compared with those from Monte Carlo simulation, as well as with the analytical method. The analytical BEP is expressed as

$$ \begin{aligned} {P}_{\text{th-bpsk}}=0.5 erfc\left(\sqrt{\text{SNR}}\right). \end{aligned} $$

(38)


where \(\text{erfc}(x)=\frac {2}{\sqrt {\pi }} \int _{x}^{+\infty } {e}^{-t^{2}}\,dt\).
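For concreteness, the Monte Carlo baseline for this BPSK/AWGN scenario can be sketched as follows (our own minimal simulator, not the authors' code; it assumes SNR = Es/N0 in linear scale and a zero-threshold detector):

```python
import numpy as np
from math import erfc, sqrt

def bpsk_mc_ber(snr_db, n_bits, seed=0):
    """Monte Carlo BER for BPSK over AWGN (SNR = Es/N0, threshold at 0)."""
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)
    bits = rng.integers(0, 2, n_bits)
    # map 0 -> -1, 1 -> +1, add noise of variance N0/2 = 1/(2*SNR)
    received = (2.0 * bits - 1.0) + rng.normal(0.0, np.sqrt(0.5 / snr), n_bits)
    return np.mean((received > 0).astype(int) != bits)

snr_db = 4.0
p_th = 0.5 * erfc(sqrt(10.0 ** (snr_db / 10.0)))   # Eq. (38)
```

Each Monte Carlo point needs enough bits to observe many errors, which is exactly the cost the semi-analytical estimator avoids.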

The simulation results are presented in Fig. 4. The proposed Fourier inversion-based semi-analytical method offers the same performance as the other methods, and a significant gain in computing time is obtained (see Table 1). In addition, to reach a bit error probability of 10^{−4}, Monte Carlo simulation requires 1,000,000 samples while Fourier inversion uses only 10,000 observed samples.

Table 1 Computing time comparison: time (in seconds) to obtain the bit error probability

| Modulation | BEP | Proposed SPP (s) | MC simulation (s) |
|---|---|---|---|
| BPSK | 10^{−6} | 2.106 | 154.179 |
| | 10^{−5} | 1.760 | 14.001 |
| | 10^{−4} | 1.013 | 1.441 |
| QPSK | 10^{−6} | 2.554 | 86.933 |
| | 10^{−5} | 1.734 | 9.013 |
| | 10^{−4} | 1.025 | 2.752 |
| 4-PAM | 10^{−6} | 5.877 | 71.864 |
| | 10^{−5} | 5.309 | 8.1315 |
| | 10^{−4} | 3.663 | 4.333 |
| SC-FDMA | 10^{−6} | 2.631 | 54.810 |
| | 10^{−5} | 2.048 | 8.131 |
| | 10^{−4} | 1.671 | 1.453 |


Furthermore, we have applied the proposed semi-analytical approach to a transmission scenario that employs the SC-FDMA technique [26] to transmit the symbol stream at the output of the BPSK modulation scheme. The number of subcarriers is 512. Figure 5 shows the semi-analytical bit error probability as a function of SNR. The proposed semi-analytical approach is accurate compared to the Monte Carlo method, with a significant gain in computing time (see Table 1).


The semi-analytical performance prediction (SPP) has then been extended to a digital communication system which performs the digital modulation using four-level pulse amplitude modulation (4-PAM). The simulations were carried out assuming transmission through an AWGN channel with 10,000 observed samples. The measured semi-analytical bit error probability is depicted in Fig. 6. It is compared to that estimated by Monte Carlo simulation and to the analytical expression:

$$ \begin{aligned} P_{\mathrm{th-pam}}=0.75\,erfc\left(\sqrt{0.2 \,\text{SNR}}\right). \end{aligned} $$

(39)


We notice that the Fourier inversion approach provides the same performance as Monte Carlo simulation and the analytical method, and the computing time is significantly reduced (see Table 1). Indeed, to reach a bit error probability of 10^{−3}, Monte Carlo simulation requires 100,000 samples while the proposed method uses only N=10,000 samples. The same performance in terms of bit error probability is obtained when quadrature phase-shift keying (QPSK) modulation is considered; the simulation results are presented in Fig. 7.

In another transmission scenario, we have considered that the BPSK symbol stream is sent through a Rayleigh channel generated using two independent Gaussian random variables, each with zero mean and variance 0.5. We have also assumed receive diversity with two receiver antennas. To recover the transmitted information symbols, the outputs of the receiver antennas are combined using maximum ratio combining (MRC). We have evaluated the semi-analytical bit error probability at the output of the MRC combiner. Figure 8 presents the BEP results for N=20,000 observed samples. As for all scenarios, the Fourier inversion curves are very close to the Monte Carlo simulation and analytical curves, where the analytical expression is given by

$$ \begin{aligned} P_{\mathrm{bpsk-mrc}}=p^{2} \,\left(1+2.(1-p)\right). \end{aligned} $$

(40)


where \(p=\frac {1}{2} - \frac {1}{2}.\left (1+\frac {1}{\text {SNR}}\right)^{-\frac {1}{2}} \). A reduced computing time is again observed, which is a major strength of the proposed approach and makes it very promising for many practical systems.

## 6 Conclusions

In this paper, we have presented a new semi-analytical method for estimating the error probability of a digital communication system. We have shown that the problem of error probability estimation is equivalent to estimating the conditional probability density function (pdf) of the observed soft samples at the receiver output. The proposed method is based on the Fourier inversion approach for predicting the pdf. The accuracy of this approach is very sensitive to the selection of the optimum smoothing parameter; we have therefore applied the bootstrap method for selecting the optimal smoothing parameter, which makes the proposed semi-analytical method more accurate. The simulation results show that the proposed semi-analytical approach achieves the same performance as Monte Carlo (MC) simulation, while the bootstrap method decreases the squared error between the true pdf and the estimated one.

## 7 Appendix A Proof of (11)

Using the definition of Fourier transform inversion (7), the probability density function is given by:

$$\begin{array}{@{}rcl@{}} \widetilde{f}(x;h) &=& \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-jtx} \widetilde{\varphi}_{X}(t)\,dt, \end{array} $$

(A.1)

where \(\widetilde {\varphi }_{X}\) is the characteristic function defined in (10), so we get :

$$\begin{array}{@{}rcl@{}} \widetilde{f}(x;h) &=& \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-jtx} \frac{1}{N} \sum_{i=1}^{N} e^{{jtx}_{i}} \psi_{h}(t)\,dt \\ &=& \frac{1}{N} \sum\limits_{i=1}^{N} \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-jt(x-x_{i})}\psi_{h}(t)\,dt \\ &=& \frac{1}{N} \sum\limits_{i=1}^{N} \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-jt(x-x_{i})}\psi(ht)\,dt \\ &=& \frac{1}{Nh} \sum\limits_{i=1}^{N} \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-jt\left(\frac{x-x_{i}}{h}\right)}\psi(t)\,dt, \end{array} $$

(A.2)

Let us define

$$\begin{array}{@{}rcl@{}} v(x) &=& \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-jtx}\psi(t)\,dt. \end{array} $$

(A.3)

It follows that the semi-analytical probability density function is expressed as

$$\begin{array}{@{}rcl@{}} \widetilde{f}(x;h) &=& \frac{1}{Nh} \sum\limits_{i=1}^{N} v\left(\frac{x-x_{i}}{h}\right). \end{array} $$

(A.4)

## 8 Appendix B Proof of (14)

Let us recall that the semi-analytical bit error probability is given by

$$ {\fontsize{9}{6}\begin{aligned} P_{e} &= P_{1}. \int_{-\infty}^{0} f_{X}^{(1)}(x | b_{i}=1)\,dx + P_{0}. \int_{0}^{+\infty} f_{X}^{(0)}(x | b_{i}=0)\,dx, \end{aligned}} $$

(B.1)

where \(\widetilde {f}_{X}^{(1)}(x)\) and \(\widetilde {f}_{X}^{(0)}(x)\) are the estimated probability density functions of the observed samples \((x_{i})_{1}\) and \((x_{i})_{0}\), which correspond to the transmitted information bits \(b_{i}=1\) and \(b_{i}=0\), respectively. By using the semi-analytical probability density function obtained in (13), we can define$$\begin{array}{@{}rcl@{}} \widetilde{f}_{X}^{(1)}(x;h_{1}) = \frac{1}{N_{1}h_{1}} \sum\limits_{i=1}^{N_{1}} \frac{1}{2\pi} e^{-\left(\frac{x-(x_{i})_{1}}{2 \sqrt{\pi} h_{1}}\right)^{2}}, \end{array} $$

(B.2)

and

$$\begin{array}{@{}rcl@{}} \widetilde{f}_{X}^{(0)}(x;h_{0}) = \frac{1}{N_{0}h_{0}} \sum\limits_{i=1}^{N_{0}} \frac{1}{2\pi} e^{-\left(\frac{x-(x_{i})_{0}}{2 \sqrt{\pi} h_{0}}\right)^{2}}, \end{array} $$

(B.3)

where \(h_{1}\) (respectively, \(h_{0}\)) is the smoothing parameter, which depends on the number of observed samples \(N_{1}\) (respectively, \(N_{0}\)). By substituting the estimated pdfs \(\widetilde {f}_{X}^{(1)}\) and \(\widetilde {f}_{X}^{(0)}\) into (B.1), we get$$ \begin{aligned} P_{e} &= P_{1}. \int_{-\infty}^{0} \frac{1}{N_{1} h_{1}} \sum\limits_{i=1}^{N_{1}} \frac{1}{2\pi} e^{-\left(\frac{x-({x_{i}})_{1}}{2\sqrt{\pi}h_{1}}\right)^{2}} \,dx \\ &\qquad + P_{0}. \int_{0}^{+\infty} \frac{1}{N_{0} h_{0}} \sum\limits_{i=1}^{N_{0}} \frac{1}{2\pi} e^{-\left(\frac{x-({x_{i}})_{0}}{2\sqrt{\pi}h_{0}}\right)^{2}} \,dx \\ &= \frac{P_{1}}{N_{1} h_{1}} \sum\limits_{i=1}^{N_{1}} \frac{1}{2\pi} \int_{-\infty}^{0} e^{-\left(\frac{x-({x_{i}})_{1}}{\sqrt{2\pi}h_{1}}\right)^{2}/2} \,dx \\ &\qquad + \frac{P_{0}}{N_{0} h_{0}} \sum\limits_{i=1}^{N_{0}} \frac{1}{2\pi} \int_{0}^{+\infty} e^{-\left(\frac{x-({x_{i}})_{0}}{\sqrt{2\pi}h_{0}}\right)^{2}/2} \,dx, \end{aligned} $$

(B.4)

Using the following change of variable \(t_{1}= \frac {x-({x_{i}})_{1}}{\sqrt {2\pi }h_{1}} \) and \(t_{0}= \frac {x-({x_{i}})_{0}}{\sqrt {2\pi }h_{0}} \), we have

$$ \begin{aligned} P_{e} &= \frac{P_{1}}{N_{1}} \sum\limits_{i=1}^{N_{1}} \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\frac{-({x_{i}})_{1}}{\sqrt{2\pi}h_{1}}} e^{-{t_{1}}^{2}/2} \,{dt}_{1} \\ &\qquad+ \frac{P_{0}}{N_{0}} \sum\limits_{i=1}^{N_{0}} \frac{1}{\sqrt{2\pi}} \int_{\frac{-({x_{i}})_{0}}{\sqrt{2\pi}h_{0}}}^{+\infty} e^{-{t_{0}}^{2}/2} \,{dt}_{0} \\ &= \frac{P_{1}}{N_{1}} \sum\limits_{i=1}^{N_{1}} Q\left(\frac{(x_{i})_{1}}{\sqrt{2\pi}h_{1}}\right)\\ &\qquad+ \frac{P_{0}}{N_{0}} \sum\limits_{i=1}^{N_{0}} Q\left(\frac{-(x_{i})_{0}}{\sqrt{2\pi}h_{0}}\right). \end{aligned} $$

(B.5)
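Expression (B.5) reduces the BEP computation to a finite sum of Q-functions over the observed decision samples. The following sketch evaluates it; the function names `semi_analytic_bep` and `q_func` are illustrative rather than taken from the paper, and the bandwidth enters as \(\sqrt{2\pi}h\), matching the kernel convention of (B.2) and (B.3):

```python
import numpy as np
from math import erf, sqrt, pi

def q_func(x):
    # Gaussian tail probability Q(x) = 1 - Phi(x) = 0.5*erfc(x/sqrt(2))
    return 0.5 * (1.0 - erf(x / sqrt(2.0)))

def semi_analytic_bep(x1, x0, h1, h0, p1=0.5, p0=0.5):
    # Closed-form BEP of (B.5): averages of Q-functions over the
    # observed samples (x_i)_1 (bits b_i = 1) and (x_i)_0 (bits b_i = 0).
    x1 = np.asarray(x1, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    term1 = p1 * np.mean([q_func(v / (sqrt(2.0 * pi) * h1)) for v in x1])
    term0 = p0 * np.mean([q_func(-v / (sqrt(2.0 * pi) * h0)) for v in x0])
    return term1 + term0
```

For well-separated samples (e.g., all (x_i)_1 near +1 and all (x_i)_0 near −1 with a small bandwidth) the estimate is close to zero, while fully overlapping samples give 0.5, as expected.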

## 9 Appendix C Proof of (18)

We derive the expression of the optimal smoothing parameter using the mean integrated squared error (MISE) criterion:

$$\begin{array}{@{}rcl@{}} \text{MISE} &=& \mathbb{E} \left[ \int_{}^{} \left\{\,\widetilde{f}(x;h) - f(x) \right\}^{2}\,dx\right], \end{array} $$

(C.1)

By the König-Huygens theorem (the bias-variance decomposition), we have

$$ {\fontsize{9}{6}\begin{aligned} \mathbb{E} \left\{\,\widetilde{f}(x;h) - f(x) \right\}^{2} = \text{var}\left(\,\widetilde{f}(x;h)\right) +\left(\mathbb{E}\left(\,\widetilde{f}(x;h)\right) - f(x) \right)^{2}, \end{aligned}} $$

(C.2)

Let us use the kernel estimator to estimate the probability density function \(\widetilde {f}\). We define the kernel function K(.) as any function satisfying \(\int _{}^{} K(x) \, dx =1 \), and write:

$$\begin{array}{@{}rcl@{}} \widetilde{f}(x;h) &=& \frac{1}{N h} \sum\limits_{i=1}^{N} K\left(\frac{x-x_{i}}{h}\right). \end{array} $$

(C.3)
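For concreteness, the estimator (C.3) with a standard normal kernel can be sketched as follows; `kde` and its grid-based interface are our own illustrative choices, not an implementation from the paper:

```python
import numpy as np

def kde(x_grid, samples, h):
    # Kernel density estimate of (C.3) with the standard normal kernel
    # K(u) = exp(-u^2/2)/sqrt(2*pi), which integrates to 1.
    x_grid = np.asarray(x_grid, dtype=float)
    samples = np.asarray(samples, dtype=float)
    u = (x_grid[:, None] - samples[None, :]) / h   # (x - x_i)/h, all pairs
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return k.mean(axis=1) / h                      # (1/(N h)) sum_i K(...)
```

Since K integrates to 1, the resulting estimate is itself a valid density (non-negative, unit mass) for any bandwidth h > 0.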

Estimation bias: The expectation of the kernel estimator can be written as the convolution of the kernel and the true density function:

$$\begin{array}{@{}rcl@{}} \mathbb{E}\left(\,\widetilde{f}(x;h)\right) &=& \frac{1}{N} \sum\limits_{i=1}^{N} \mathbb{E} \left(\frac{1}{h} K\left(\frac{x-x_{i}}{h}\right)\right) \\ &=& \frac{1}{h}\int_{}^{} K\left(\frac{z-x}{h}\right) f(z) \, dz, \end{array} $$

(C.4)

By using \(u=\frac {z-x}{h}\), we have

$$\begin{array}{@{}rcl@{}} \mathbb{E}\left(\,\widetilde{f}(x;h)\right) &=& \int_{}^{} K(u) f(x+hu)\,du, \end{array} $$

(C.5)

So, we use a Taylor expansion of f(x+hu) in the argument hu, valid as h→0. For a ν-th-order kernel, we take the expansion out to the ν-th term to solve this integral:$$\begin{array}{@{}rcl@{}} f(x+hu) &=& f(x) + f^{(1)}(x)hu+\frac{1}{2}f^{(2)}(x)h^{2}u^{2} \\ & &+\frac{1}{3!}\,f^{(3)}(x)h^{3}u^{3} +\ldots \\ & & + \frac{1}{\nu!}\,f^{(\nu)}(x)h^{\nu} u^{\nu}+o(h^{\nu}), \end{array} $$

(C.6)

where the kernel moments are denoted \(\mu _{j}(K)=\int _{}^{} u^{j} K(u)\,du\).

Integrating term by term, using \(\int _{}^{} K(x)dx =1 \) together with the defining property of a ν-th-order kernel that \(\mu_{j}(K)=0\) for \(1\leq j<\nu\), we get

$$ {\fontsize{8.6}{6}\begin{aligned} \int_{}^{} \!K(u)\,f(x+hu)\,du &= f(x) + f^{(1)}(x)h \mu_{1}(K)\,+\,\frac{1}{2!}\,f^{(2)}(x)h^{2} \mu_{2}(K) \\ & \quad+\frac{1}{3!}\,f^{(3)}(x)h^{3} \mu_{3}(K) \\ & \quad+\ldots+\frac{1}{\nu!}\,f^{(\nu)}(x)h^{\nu} \mu_{\nu}(K)+o(h^{\nu}) \\ &= f(x)+ \frac{1}{\nu!}\,f^{(\nu)}(x)h^{\nu} \mu_{\nu}(K)+o(h^{\nu}), \end{aligned}} $$

(C.7)

This means that

$$\begin{array}{@{}rcl@{}} {}\mathbb{E}\left(\,\widetilde{f}(x;h)\right) &=& \frac{1}{N}\sum\limits_{i=1}^{N} \mathbb{E}\left(\frac{1}{h} K\left(\frac{x_{i}-x}{h}\right)\right) \\ &=& f(x)+ \frac{1}{\nu!}\,f^{(\nu)}(x)h^{\nu} \mu_{\nu}(K)+o(h^{\nu}). \end{array} $$

(C.8)

The bias of \(\widetilde {f}_{h}(x)\) is then

$$\begin{array}{@{}rcl@{}} {}\text{Bias}\left(\,\widetilde{f}(x;h)\right) &=& \mathbb{E}\left(\,\widetilde{f}(x;h)\right) -f(x) \\ &=& \frac{1}{\nu!}\,f^{(\nu)}(x)h^{\nu} \mu_{\nu}(K)+o(h^{\nu}), \end{array} $$

(C.9)

In the remainder, we consider a second-order kernel (ν = 2), so that

$$\begin{array}{@{}rcl@{}} {}\text{Bias}\left(\,\widetilde{f}(x;h)\right) &=& \frac{1}{2}\,f^{(2)}(x)h^{2} \mu_{2}(K)+o(h^{2}). \end{array} $$

(C.10)

Estimation variance: Let us now compute the variance of the density estimator \(\widetilde {f}(x;h)\):

$$ {\fontsize{7.9}{6}\begin{aligned} \text{var}\left(\,\widetilde{f}(x;h)\right) &= \mathbb{E}\left[\,\widetilde{f}(x;h) - \mathbb{E}\left(\,\widetilde{f}(x;h)\right) \right]^{2} \\ &= \mathbb{E}\left[\left(\,\widetilde{f}(x;h)\right)^{2} -2\,\widetilde{f}(x;h)\mathbb{E}\left(\,\widetilde{f}(x;h)\right)+ \left(\mathbb{E}\,\widetilde{f}(x;h)\right)^{2} \right] \\ &= \mathbb{E}\left(\!\left(\,\widetilde{f}(x;h)\right)^{2}\right) \,-\, 2\mathbb{E}\left(\,\widetilde{f}(x;h)\right)\!\mathbb{E}\left(\,\widetilde{f}(x;h)\right)\,+\, \left(\mathbb{E}\,\widetilde{f}(x;h)\!\right)^{2} \\ &= \mathbb{E}\left(\left(\,\widetilde{f}(x;h)\right)^{2}\right) -2\left(\mathbb{E}\left(\,\widetilde{f}(x;h)\right)\right)^{2} + \left(\mathbb{E}\,\widetilde{f}(x;h)\right)^{2} \\ &= \mathbb{E}\left(\left(\,\widetilde{f}(x;h)\right)^{2}\right) -\left(\mathbb{E}\left(\,\widetilde{f}(x;h)\right)\right)^{2}, \end{aligned}} $$

(C.11)

The kernel estimator is a linear estimate, so

$$\begin{array}{@{}rcl@{}} {}\text{var}\left(\,\widetilde{f}(x;h)\right) &=& \frac{1}{Nh^{2}} \mathbb{E}\left[K\Big(\frac{x_{i}-x}{h}\Big)^{2}\right] \\ &&-\frac{1}{N}\left(\frac{1}{h}\mathbb{E}\left(K\left(\frac{x_{i}-x}{h}\right)\right)\right)^{2}. \end{array} $$

(C.12)

As shown in the bias computation, \(\frac {1}{h} \mathbb {E}\left (K\left (\frac {x_{i}-x}{h}\right)\right) = f(x) + o(1)\), so \(\frac {1}{N} \left (\frac {1}{h} \mathbb {E}\left (K\left (\frac {x_{i}-x}{h}\right)\right)\right)^{2}\) is \(O\left (\frac {1}{N}\right)\). For the first term of the variance, we again write the expectation as the convolution of the kernel and the true density and use a first-order Taylor expansion, to get

$$\begin{array}{@{}rcl@{}} \frac{1}{h}\mathbb{E}\left[K\left(\frac{x_{i}-x}{h}\right)^{2}\right] &=& \frac{1}{h} \int_{}^{} \left\{K\left(\frac{z-x}{h}\right)\right\}^{2} f(z)\,dz \\ &=& \int_{}^{} K(u)^{2} f(x+hu)\,du \\ &=& f(x)\int_{}^{} K(u)^{2} \,du + O(h) \\ &=& f(x)R(K) + O(h), \end{array} $$

(C.13)

where \(R(K) =\int _{}^{} K(u)^{2} \,du \). Together, the estimation variance is written as

$$\begin{array}{@{}rcl@{}} \text{var}\left(\,\widetilde{f}(x;h)\right)&=& \frac{f(x)R(K)}{Nh} + O\left(\frac{1}{N}\right). \end{array} $$

(C.14)
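A quick Monte Carlo check of (C.14) is possible for a known density. The sketch below repeatedly estimates \(\widetilde{f}(0;h)\) from N(0,1) samples with a Gaussian kernel, for which \(R(K)=1/(2\sqrt{\pi})\), and compares the empirical variance with the leading term \(f(0)R(K)/(Nh)\); the sample sizes and the tolerance are our own choices, and the residual gap is the \(O(1/N)\) term:

```python
import numpy as np

rng = np.random.default_rng(1)
N, h, reps = 20_000, 0.05, 2000
f0 = 1.0 / np.sqrt(2.0 * np.pi)        # true N(0,1) density at x = 0
R_K = 1.0 / (2.0 * np.sqrt(np.pi))     # R(K) for the Gaussian kernel

est = np.empty(reps)
for r in range(reps):
    u = -rng.normal(size=N) / h        # (x - x_i)/h evaluated at x = 0
    est[r] = np.mean(np.exp(-0.5 * u**2)) / (h * np.sqrt(2.0 * np.pi))

var_emp = est.var(ddof=1)              # empirical variance of f~(0; h)
var_asym = f0 * R_K / (N * h)          # leading term of (C.14)
ratio = var_emp / var_asym
```

With h small and Nh large, the ratio settles near 1, consistent with the asymptotic expression.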

Mean squared error: Combining the bias and the variance, the mean squared error (MSE) is given, up to higher-order terms, by

$$\begin{array}{*{20}l} \text{MSE} &= \mathbb{E} \left\{\,\widetilde{f}(x;h) - f(x) \right\}^{2} \\ &= \text{var}\left(\,\widetilde{f}(x;h)\right) +\left(\text{Bias}\left(\,\widetilde{f}(x;h)\right)\right)^{2} \\ &= \frac{f(x)R(K)}{Nh} + \frac{1}{4}\left(\,f^{(2)}(x)\right)^{2}h^{4}{\mu_{2}}^{2}(K). \end{array} $$

(C.15)

By integrating the MSE over x, the mean integrated squared error (MISE) is given by

$$\begin{array}{*{20}l} \text{MISE} &= \mathbb{E} \int_{}^{} \left\{\,\widetilde{f}(x;h) - f(x) \right\}^{2}\,dx \\ &= \int_{}^{} {\text{Bias}\left(\,\widetilde{f}(x;h)\right)}^{2}\,dx + \int_{}^{} \text{var}\left(\,\widetilde{f}(x;h)\right)\,dx, \end{array} $$

(C.16)

Under an integrability assumption on f, we have

$$\begin{array}{@{}rcl@{}} \text{MISE} &=& \frac{R(K)}{Nh} + \frac{1}{4}h^{4}{\mu_{2}^{2}}(K)R\left(\,f^{\prime\prime}\right). \end{array} $$

(C.17)

where \(R(\,f^{\prime \prime })=\int _{}^{} {f^{\prime \prime }(u)}^{2}\,du\).

The expression (C.17) is the measure that we use to quantify the performance of the estimator. We can find the optimal smoothing parameter by minimizing the expression of (C.17) with respect to h. The first derivative is given by

$$ \begin{aligned} \frac{d(\text{MISE}(h))}{dh}=-\frac{R(K)}{Nh^{2}} + h^{3}{\mu_{2}^{2}}(K)R\left(\,f^{\prime\prime}\right), \end{aligned} $$

(C.18)

Setting this derivative to zero yields the optimal smoothing parameter:

$$\begin{array}{@{}rcl@{}} h_{\text{opt}}&=&\left(\frac{R(K)}{{\mu_{2}}^{2}(K)R(\,f^{\prime\prime})}\right)^{1/5}.N^{-1/5}. \end{array} $$

(C.19)
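As a numerical sanity check of (C.19), one can minimize (C.17) on a grid for a known case. The sketch below assumes a Gaussian kernel and a true N(0,1) density, for which the constants \(R(K)=1/(2\sqrt{\pi})\), \(\mu_{2}(K)=1\), and \(R(f'')=3/(8\sqrt{\pi})\) are standard values stated here as assumptions; the closed form then reduces to the familiar normal-reference rule \(h_{\text{opt}} \approx 1.06\,N^{-1/5}\):

```python
import numpy as np

R_K = 1.0 / (2.0 * np.sqrt(np.pi))       # R(K) = int K(u)^2 du (Gaussian kernel)
mu2_K = 1.0                              # mu_2(K) = int u^2 K(u) du
R_f2 = 3.0 / (8.0 * np.sqrt(np.pi))      # R(f'') for the N(0,1) density

def amise(h, n):
    # Asymptotic MISE of (C.17)
    return R_K / (n * h) + 0.25 * h**4 * mu2_K**2 * R_f2

n = 1000
h_grid = np.linspace(0.01, 1.0, 10_000)
h_num = h_grid[np.argmin(amise(h_grid, n))]          # grid minimizer of (C.17)
h_opt = (R_K / (mu2_K**2 * R_f2))**0.2 * n**-0.2     # closed form (C.19)
```

The grid minimizer and the closed-form value agree, and both scale as \(N^{-1/5}\).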

## 10 Appendix D Proof of (36)

In this Appendix, we provide further details related to the asymptotic expressions for the smoothing parameter using the bootstrap method. Here the normal kernel is used.

$$\begin{array}{@{}rcl@{}} \text{MISE}^{*}(h)&=& \mathbb{E}^{*} \int_{}^{} \left\{\, \widetilde{f^{*}}_{X}(x;h) - \widetilde{f}_{X}(x;g) \right\}^{2}\,dx \\ &=& \int_{}^{} \text{Bias}^{*}\left\{\,\widetilde{f^{*}}_{X}(x;h)\right\}^{2} \,dx \\ &&+ \int_{}^{} \text{Var}^{*}\left\{\,\widetilde{f^{*}}_{X}(x;h)\right\} \,dx, \end{array} $$

(D.1)

where \(\mathbb {E}^{*}\), Bias^{∗}, and Var^{∗} denote moments computed conditionally on the original sample, with the bootstrap samples \(x^{*}_{1}, x^{*}_{2},\ldots, x^{*}_{N}\) drawn from the smoothed distribution \(\widetilde {f}_{X}(x;g)\). Making a substitution followed by a Taylor series expansion, under the assumption that h→0 as N→∞, gives the asymptotic approximation$$ {\fontsize{9.2}{6}\begin{aligned} \text{MISE}^{*}(h)&=\frac{1}{2Nh\sqrt{2 \pi}} \left[2^{1/2}+1-\frac{4}{3^{1/2}}+ (N-1)h(2\pi)^{1/2}\right. \\ &\quad\times\left\{ 4\int_{}^{} h^{4} \,\widetilde{f}_{X}^{(4)}(x;g)\,\widetilde{f}_{X}(x;g)\,dx - \frac{9}{2}\int_{}^{} h^{4} \,\widetilde{f}_{X}^{(4)}(x;g)\,\widetilde{f}_{X}(x;g)\,dx \right. \\ &\quad\left.\left. +\, \int_{}^{} h^{4} \,\widetilde{f}_{X}^{(4)}(x;g)\,\widetilde{f}_{X}(x;g)\,dx\right\}\right] +O(h^{6}), \end{aligned}} $$

(D.2)

Since the three integrals are identical, they combine with coefficient 4 − 9/2 + 1 = 1/2; using (N−1)/N ≈ 1, the approximation can be written as

$$ {\fontsize{9.2}{6}\begin{aligned} \text{MISE}^{*}(h)=\frac{1.074}{2Nh\sqrt{\pi}} + \frac{h^{4}}{4}\int_{}^{} \widetilde{f}_{X}^{(4)}(x;g)\,\widetilde{f}_{X}(x;g)\,dx +O\left(h^{6}\right), \end{aligned}} $$

(D.3)

Integrating by parts twice, and using the boundary conditions satisfied by any probability density function,

$$ \begin{aligned} \left[\,\widetilde{f}_{X}^{\prime\prime\prime}(x;g)\,\widetilde{f}_{X}(x;g)\right]_{-\infty}^{+\infty}=\left[\,\widetilde{f}_{X}^{\prime\prime}(x;g)\,\widetilde{f}_{X}^{'}(x;g)\right]_{-\infty}^{+\infty}=0. \end{aligned} $$

(D.4)

The asymptotic expression for the bootstrap estimator of the MISE is then

$$ \begin{aligned} {}\text{MISE}^{*}(h)=\frac{1.074}{2Nh\sqrt{\pi}} + \frac{h^{4}}{4}\int_{}^{} \left(\,\widetilde{f}_{X}^{\prime\prime}(x;g)\right)^{2}\,dx +O\left(h^{6}\right), \end{aligned} $$

(D.5)

The optimal smoothing parameter is selected by minimizing the expression (D.5) with respect to h. The first derivative is given by

$$ {\fontsize{9.2}{6}\begin{aligned} \frac{d(\text{MISE}^{*}(h))}{dh}=-\frac{1.074}{2Nh^{2}\sqrt{\pi}} + h^{3} \int_{}^{} \left(\,\widetilde{f}_{X}^{\prime\prime}(x;g)\right)^{2}dx +O(h^{5}). \end{aligned}} $$

(D.6)

Setting this derivative to zero yields the optimal smoothing parameter

$$ \begin{aligned} h_{\mathrm{opt,boot}}=\left(\frac{1.074}{2\sqrt{\pi}\int_{}^{}\left(\,\widetilde{f}_{X}^{\prime\prime}(x;g)\right)^{2}\,dx}\right)^{1/5}.N^{-1/5}. \end{aligned} $$

(D.7)
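In practice, (D.7) can be evaluated by computing \(\int (\widetilde{f}''_{X}(x;g))^{2}\,dx\) numerically from a pilot estimate with bandwidth g. The sketch below does this on a grid; the helper names and the trapezoidal integration are our own shortcuts rather than the paper's procedure, and the pilot bandwidth g must be supplied by the user:

```python
import numpy as np

def kde_second_deriv(x, samples, g):
    # Second derivative of a Gaussian-kernel KDE with pilot bandwidth g:
    # phi''(u) = (u^2 - 1) phi(u) for the standard normal phi.
    u = (x[:, None] - samples[None, :]) / g
    phi = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return ((u**2 - 1.0) * phi).mean(axis=1) / g**3

def h_opt_bootstrap(samples, g):
    # Bootstrap bandwidth of (D.7), with int (f~'')^2 dx computed by a
    # trapezoidal rule on a grid covering the sample range.
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    x = np.linspace(samples.min() - 5.0 * g, samples.max() + 5.0 * g, 4001)
    d2 = kde_second_deriv(x, samples, g) ** 2
    r = float(np.sum((d2[1:] + d2[:-1]) * np.diff(x)) / 2.0)
    return (1.074 / (2.0 * np.sqrt(np.pi) * r)) ** 0.2 * n ** -0.2
```

As in (C.19), the resulting bandwidth inherits the \(N^{-1/5}\) rate; the pilot g only enters through the roughness estimate \(\int (\widetilde{f}''_{X}(x;g))^{2}\,dx\).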

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## Competing interests

The authors declare that they have no competing interests.