
2018 | Original Paper | Book Chapter

4. Data Transmission and Channel Capacity

Authors: Fady Alajaji, Po-Ning Chen

Published in: An Introduction to Single-User Information Theory

Publisher: Springer Singapore


Abstract

A noisy communication channel is an input–output medium in which the output is not completely or deterministically specified by the input. The channel is thus modeled stochastically: given channel input x, the channel output y is governed by a transition (conditional) probability distribution denoted by \(P_{Y|X}(y|x)\). Since two different inputs may give rise to the same output, the receiver, upon receiving an output, needs to guess the most probable transmitted input.


Footnotes
1
See [397] for recent results regarding channel capacity when no coding constraints are applied to the channel input (so that variable-length codes can be employed).
 
2
Strictly speaking, note that \(X\oplus Z\) is not defined in (4.2.9) when \(Z=E\). However, as \({\varvec{1}}\{Z \ne E\}=0\) when \(Z=E\), this is remedied by using the convention that an undefined quantity multiplied by zero is equal to zero.
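As a purely illustrative aside (not taken from the text), the convention can be mirrored directly in a simulation of a binary erasure channel: the branch on \(Z=E\) ensures that the (undefined) term \(X\oplus Z\) is simply never evaluated. The noise model below, with \(Z=0\) with probability \(1-\varepsilon\) and \(Z=E\) with probability \(\varepsilon\), is an assumption made for this sketch rather than a restatement of (4.2.9).

    import random

    def bec_output(x: int, epsilon: float):
        """Output of a BEC(epsilon) for input bit x, following the footnote's convention."""
        z = 'E' if random.random() < epsilon else 0   # assumed noise model for this sketch
        if z == 'E':
            return 'E'      # 1{Z != E} = 0: the undefined X xor Z term contributes nothing
        return x ^ z        # 1{Z != E} = 1: output is X xor Z (here equal to x, since z = 0)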
 
3
Note that this fact holds for single-user channels with known transition distributions (as given in Definition 4.1) that remain constant throughout the transmission of a codeword. It does not however hold for single-user channels whose statistical descriptions may vary in an unknown manner from symbol to symbol during a codeword transmission; such channels, which include the class of “arbitrarily varying channels” (see [87, Chap. 2, Sect. 6]), will not be considered in this textbook.
 
4
We use the terms “goodness” and “reliability” interchangeably for a block code to mean that its (average) probability of error asymptotically vanishes with increasing blocklength.
 
5
First note that the mutual information \(I(X;Y)\) is actually a function of the input statistics \(P_X\) and the channel statistics \(P_{Y|X}\). Hence, we may write it as
$$I(P_X, P_{Y|X})= \sum _{x\in \mathcal{{X}}}\sum _{y\in \mathcal{{Y}}} P_X(x)P_{Y|X}(y|x)\log _2\displaystyle \frac{P_{Y|X}(y|x)}{\sum _{x'\in \mathcal{{X}}}P_X(x')P_{Y|X}(y|x')}.$$
Such an expression is more suitable for calculating the channel capacity.
Note also that the channel capacity C is well-defined since, for a fixed \(P_{Y|X}\), \(I(P_X, P_{Y|X})\) is concave and continuous in \(P_X\) (with respect to both the variational distance and the Euclidean distance (i.e., \(\mathcal {L}_2\)-distance) [415, Chap. 2]), and since the set of all input distributions \(P_X\) is a compact (closed and bounded) subset of \(\mathbb {R}^{|\mathcal{{X}}|}\) due to the finiteness of \(\mathcal{{X}}\). Hence, there exists a \(P_X\) that achieves the supremum of the mutual information, so the supremum is indeed a maximum.
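To make the remark concrete, the following is a minimal numerical sketch (an illustration added here, not part of the text) that evaluates \(I(P_X,P_{Y|X})\) from the displayed formula and maximizes it over \(P_X\) by a simple grid search for a binary-input channel; the BSC(0.1) transition matrix is just an example choice.

    import numpy as np

    def mutual_information(p_x, p_y_given_x):
        """I(P_X, P_{Y|X}) in bits, with p_y_given_x[x, y] = P_{Y|X}(y|x)."""
        p_y = p_x @ p_y_given_x                      # output distribution P_Y
        joint = p_x[:, None] * p_y_given_x           # P_X(x) * P_{Y|X}(y|x)
        with np.errstate(divide='ignore', invalid='ignore'):
            terms = joint * np.log2(p_y_given_x / p_y[None, :])
        return float(np.nansum(terms))               # treats 0*log(0) terms as 0

    P = np.array([[0.9, 0.1],
                  [0.1, 0.9]])                       # example: BSC with crossover 0.1
    best = max(((a, mutual_information(np.array([a, 1.0 - a]), P))
                for a in np.linspace(0.0, 1.0, 1001)), key=lambda t: t[1])
    print(f"approximate capacity {best[1]:.4f} bits/use at P_X(0) = {best[0]:.3f}")  # ~0.5310 at 0.5

A grid search suffices here only because the input alphabet is binary; for larger alphabets the Blahut–Arimoto algorithm is the usual tool, exploiting the same concavity property noted above.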
 
6
Note that (4.3.1) actually implies that \(\liminf _{n\rightarrow \infty }P_e(\widetilde{\mathcal{C}}_n)\ge \lim _{\epsilon \downarrow 0}(1-\epsilon )\mu =\mu \), where the error probability lower bound does not depend on \(\epsilon \). Here, we state the converse of Theorem 4.11 in a form parallel to the converse statements in Theorems 3.6 and 3.15.
 
7
Here, the channel inputs are selected with replacement; that is, it is possible (and acceptable) for all \(M_n\) selected channel inputs to be identical.
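A tiny sketch (illustrative only, with a uniform \(P_X\) as an assumed example) of such a selection with replacement:

    import random

    def random_codebook(M_n: int, n: int, alphabet=(0, 1), p_x=(0.5, 0.5)):
        """Draw M_n codewords of length n, each symbol i.i.d. from P_X; repeats are allowed."""
        return [tuple(random.choices(alphabet, weights=p_x, k=n)) for _ in range(M_n)]

    print(random_codebook(M_n=8, n=4))   # duplicate codewords may appear, which is acceptable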
 
8
Note that Theorem 4.11 actually implies that \(\lim _{n\rightarrow \infty } P_e=0\) for \(\underline{R}<C_{op}=C\) and that \(\liminf _{n\rightarrow \infty } P_e>0\) for \(\underline{R}>C_{op}=C\); these properties, however, might not hold for more general channels than the DMC. For general channels, three partitions instead of two may result, i.e., \(\underline{R}<C_{op}\), \(C_{op}<\underline{R}<\bar{C}_{op}\) and \(\underline{R}>\bar{C}_{op}\), which, respectively, correspond to \(\limsup _{n\rightarrow \infty } P_e=0\) for the best block code, \(\limsup _{n\rightarrow \infty } P_e>0\) but \(\liminf _{n\rightarrow \infty } P_e=0\) for the best block code, and \(\liminf _{n\rightarrow \infty } P_e>0\) for all channel codes, where \(\bar{C}_{op}\) is called the channel’s optimistic operational capacity [394, 396]. Since \(\bar{C}_{op}=C_{op}=C\) for DMCs, the three regions are reduced to two. A formula for \(\bar{C}_{op}\) in terms of a generalized (spectral) mutual information rate is established in [75].
 
9
It can be seen from the theorem that C can be achieved as an asymptotic transmission rate as long as \((1/n)\log _2 M_n\) approaches C from below with increasing n (see (4.3.2)).
 
10
Indeed, there exist linear codes that can achieve the capacity of memoryless channels with additive noise (e.g., see [87, p. 114]). Such channels include the BSC and the q-ary symmetric channel.
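For reference (standard formulas stated here as an aside, not quoted from this footnote), the capacities of these two channels are
$$C_{\mathrm{BSC}} = 1 - h_b(p) \qquad \text{and} \qquad C_{q\text{-ary SC}} = \log_2 q - h_b(\varepsilon ) - \varepsilon \log_2(q-1) \quad \text{bits/channel use},$$
where \(p\) and \(\varepsilon \) are the respective (total) error probabilities and \(h_b(\cdot )\) is the binary entropy function; e.g., \(p=0.1\) gives \(C_{\mathrm{BSC}} \approx 0.531\) bits per channel use.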
 
11
Recall that in channel coding, a codeword of length n is typically sent by using the channel n consecutive times (i.e., in series). But in polar coding, an equivalent method is applied, which consists of using n identical and independent copies of the channel in parallel, with each channel being utilized only once.
 
12
More precisely, channel \(\mathbb {Q}^-\) has the same behavior as a BEC, and it can be exactly converted to a BEC after relabeling its output pair \((y_1,y_2)\) as an equivalent three-valued symbol \(\mathsf{y}_{1,2}\) as follows:
$$ \mathsf{y}_{1,2} = \begin{cases} 0 & \text{if } (y_1,y_2) \in \{(0,0),(1,1)\}, \\ E & \text{if } (y_1,y_2) \in \{(0,E),(1,E),(E,E),(E, 0),(E, 1)\}, \\ 1 & \text{if } (y_1,y_2) \in \{(0,1),(1,0)\}. \end{cases} $$
A similar conversion can be applied to channel \(\mathbb {Q}^+\).
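In particular, assuming both component channels are BEC(\(\varepsilon \)) (the identically distributed case used in Example 4.14), the relabeled symbol is erased exactly when at least one of \(y_1, y_2\) is erased, so
$$\Pr \{\mathsf{y}_{1,2}=E\} = 1-(1-\varepsilon )^2 = 2\varepsilon -\varepsilon ^2,$$
which agrees with the erasure parameter of \(\mathbb {Q}^-\) given in the next footnote when \(\varepsilon _1=\varepsilon _2=\varepsilon \).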
 
13
The same reasoning can be applied to form the basic transformation for two independent but not identically distributed BECs as shown in Fig. 4.6b, where \(\mathbb {Q}^+\) and \(\mathbb {Q}^-\) become \(\text {BEC}(\varepsilon _1\varepsilon _2)\) and \(\text {BEC}(1-(1-\varepsilon _1)(1-\varepsilon _2))\), respectively. This extension may be useful when combining n independent uses of a channel in a multistage manner (in particular, when the two channels to be combined may become non-identically distributed after the second stage). In Example 4.14, only identically distributed BECs will be combined at each stage, which is a typical design for polar coding.
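As an illustrative sketch (added here, not from the text), repeatedly applying these two erasure-probability updates shows the polarization effect numerically; the ordering of the two synthesized channels within each pair is simply a labeling convention chosen for the sketch.

    def polarize(erasure_probs):
        """One stage: each BEC(e) yields BEC(1-(1-e)^2) (minus) and BEC(e*e) (plus)."""
        out = []
        for e in erasure_probs:
            out.append(1 - (1 - e) ** 2)   # degraded ("minus") channel
            out.append(e * e)              # upgraded ("plus") channel
        return out

    probs = [0.5]                          # start from BEC(0.5)
    for _ in range(3):                     # three stages -> 8 synthesized channels
        probs = polarize(probs)
    print([round(p, 3) for p in probs])    # erasure probabilities spread toward 0 and 1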
 
14
This notion of “quasi-symmetry” is slightly more general than Gallager’s notion [135, p. 94], as we herein allow each sub-matrix to be weakly symmetric (instead of symmetric as in [135]).
 
15
The details for taking the derivative are as follows:
$$\begin{aligned}
&\frac{\partial }{\partial P_X(x'')} \left\{ \sum_{x\in \mathcal{X}}\sum_{y\in \mathcal{Y}}P_X(x)P_{Y|X}(y|x)\log_2 P_{Y|X}(y|x) \right. \\
&\qquad \left. -\sum_{x\in \mathcal{X}}\sum_{y\in \mathcal{Y}}P_X(x)P_{Y|X}(y|x)\log_2 \left[ \sum_{x'\in \mathcal{X}}P_X(x')P_{Y|X}(y|x')\right] +\lambda \left( \sum_{x\in \mathcal{X}}P_X(x)-1\right) \right\} \\
&= \sum_{y\in \mathcal{Y}}P_{Y|X}(y|x'')\log_2 P_{Y|X}(y|x'')-\left( \sum_{y\in \mathcal{Y}}P_{Y|X}(y|x'')\log_2 \left[ \sum_{x'\in \mathcal{X}}P_X(x')P_{Y|X}(y|x')\right] \right. \\
&\qquad \left. +\log_2(e) \sum_{x\in \mathcal{X}}\sum_{y\in \mathcal{Y}}P_X(x)P_{Y|X}(y|x) \frac{P_{Y|X}(y|x'')}{\sum_{x'\in \mathcal{X}}P_X(x')P_{Y|X}(y|x')} \right) +\lambda \\
&= I(x'';Y)-\log_2(e) \sum_{y\in \mathcal{Y}}\left[ \sum_{x\in \mathcal{X}}P_X(x)P_{Y|X}(y|x)\right] \frac{P_{Y|X}(y|x'')}{\sum_{x'\in \mathcal{X}}P_X(x')P_{Y|X}(y|x')} +\lambda \\
&= I(x'';Y)-\log_2(e) \sum_{y\in \mathcal{Y}}P_{Y|X}(y|x'')+\lambda \\
&= I(x'';Y)-\log_2(e)+\lambda .
\end{aligned}$$
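Setting this derivative to zero for every \(x''\) with \(P_X(x'')>0\) shows that \(I(x'';Y)=\log_2(e)-\lambda \) is the same constant over the support of \(P_X\). For reference (a standard result stated here as an aside rather than reproduced from the chapter), the resulting Karush–Kuhn–Tucker conditions for a capacity-achieving input read
$$I(x'';Y)=C \ \text{ for all } x'' \text{ with } P_X(x'')>0, \qquad I(x'';Y)\le C \ \text{ for all } x'' \text{ with } P_X(x'')=0,$$
where \(I(x'';Y)=\sum_{y\in \mathcal{Y}}P_{Y|X}(y|x'')\log_2\frac{P_{Y|X}(y|x'')}{\sum_{x'\in \mathcal{X}}P_X(x')P_{Y|X}(y|x')}\) is the same quantity appearing in the last line above.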
 
16
This theorem is sometimes referred to as the lossless information transmission theorem.
 
17
The minimal achievable compression rate of such sources is given by the entropy rate; see Theorem 3.15.
 
18
Note that \(n=n_m\); that is, the channel blocklength n is in general a function of the source blocklength m. Similarly, \(f^{(sc)}=f^{(sc)}_m\) and \(g^{(sc)}=g^{(sc)}_m\); i.e., the encoding and decoding functions are implicitly dependent on m.
 
19
We assume the source entropy rate exists as specified below.
 
20
Note that (4.6.1) actually implies that \(\liminf _{m\rightarrow \infty }P_e(\widetilde{\mathcal{C}}_{m, m})\ge \lim _{\epsilon \downarrow 0}(1-\epsilon )\mu =\mu \), where the error probability lower bound does not depend on \(\epsilon \). Here, we state the converse of Theorem 4.30 in a form parallel to the converse statements in Theorems 3.6, 3.15, and 4.11.
 
21
Theorem 3.15 indicates that for any \(0<\varepsilon ':=\min \{\varepsilon /2,\gamma /(2\log (2))\}<1\), there exist \(\delta \) with \(0<\delta <\varepsilon '\) and a sequence of binary block codes \(\{\widetilde{\mathcal{C}}_m=(m, M_m)\}_{m=1}^\infty \) with rate satisfying (4.6.3) and probability of decoding error satisfying \(P_e(\widetilde{\mathcal{C}}_m)<\varepsilon '\ (\le \varepsilon /2)\) for sufficiently large m, where \(H_2(\mathcal{{V}})\) in (4.6.3) is the entropy rate measured in bits. Here, (4.6.3) implies that \(\frac{1}{m}\log _2 M_m<H_2(\mathcal{{V}})+\delta \) for sufficiently large m. Hence,
$$\frac{1}{m}\log M_m<H(\mathcal{{V}})+\delta \log (2)<H(\mathcal{{V}})+\varepsilon '\log (2)\le H(\mathcal{{V}})+\gamma /2$$
for sufficiently large m.
 
22
Theorem 4.11 and the proof of its forward part indicate that for any \(0<\varepsilon ':=\min \{\varepsilon /4,\gamma /(16\log (2))\}<1\), there exist \(0<\gamma '<\min \{4\varepsilon ', C_2\}=\min \{\varepsilon ,\gamma /(4\log (2)), C_2\}\) and a sequence of data transmission block codes \(\{\widetilde{\mathcal{C}}'_m=(m,\bar{M}'_m)\}_{m=1}^\infty \) with rate satisfying (4.6.4) and
$$P_e(\widetilde{\mathcal{C}}'_m)<\varepsilon '\quad \text{for sufficiently large } m,$$
provided that \(C_2>0\), where \(C_2\) is the channel capacity measured in bits.
Observation 4.6 indicates that by discarding from \(\widetilde{\mathcal{C}}'_m\) the half of its codewords with the largest conditional probability of error, a new code \(\widetilde{\mathcal{C}}_m=(m,\bar{M}_m)=(m,\bar{M}'_m/2)\) is obtained, which satisfies \(\lambda (\widetilde{\mathcal{C}}_m)\le 2 P_e(\widetilde{\mathcal{C}}'_m)<2\varepsilon '\le \varepsilon /2\).
Equation (4.6.4) then implies that for \(m>1/\gamma '\) sufficiently large,
$$\frac{1}{m}\log \bar{M}_m=\frac{1}{m}\log \bar{M}'_m-\frac{1}{m}\log (2)>C-\gamma '\log (2)-\frac{1}{m}\log (2)>C-2\gamma '\log (2)>C-\gamma /2.$$
 
23
See [75, 96, 303, 394] for a definition of information stable sources; this property is slightly more general than the Generalized AEP property given in Theorem 3.14.
 
24
The error exponent or reliability function of a coding system is the largest rate of exponential decay of its decoding error probability as the coding blocklength grows without bound [51, 87, 95, 107, 114, 135, 177, 178, 205, 347, 348]. Roughly speaking, the error exponent is a number E with the property that the decoding error probability of a good code is approximately \(e^{-nE}\) for large coding blocklength n. In addition to revealing the fundamental trade-off between the error probability of optimal codes and their blocklength for a given coding rate and providing insight on the behavior of optimal codes, such a function provides a powerful tool for proving the achievability part of coding theorems (e.g., [135]), for comparing the performance of competing coding schemes (e.g., weighing joint against separate coding [422]) and for communications system design [194].
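As a purely illustrative calculation: a scheme with error exponent \(E=0.1\) (nats per channel use) at blocklength \(n=1000\) would have decoding error probability roughly \(e^{-100}\approx 3.7\times 10^{-44}\), whereas halving the blocklength to \(n=500\) raises this rough estimate to \(e^{-50}\approx 1.9\times 10^{-22}\).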
 
25
We underscore that, even though not listed here, the literature on joint source-channel coding for multiuser systems is also quite extensive and ongoing.
 
26
Loosely speaking, a channel is information stable if the input process which maximizes the channel’s block mutual information yields a joint input–output process that behaves ergodically and satisfies the joint AEP (see [75, 96, 191, 303, 394] for a precise definition).
 
27
Note that a formula for the capacity of more general (not necessarily information stable) channels with memory does exist in terms of a generalized (spectral) mutual information rate; see [172, 396].
 
28
This intrinsically natural assumption, which is equivalent to requiring that the channel input and noise processes are independent of each other when no feedback is present, ensures that (4.7.2) holds for this channel.
 
29
For arbitrary channels with memory, a generalized expression for \(C_{op, FB}\) is established in [376, 378] in terms of a generalized (spectral) directed information rate.
 
30
This result was actually shown in [72] for general channels with memory in terms of a generalized (spectral) mutual information rate.
 
Metadata
Title
Data Transmission and Channel Capacity
Authors
Fady Alajaji
Po-Ning Chen
Copyright Year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-8001-2_4