• B1.14

State the formula for the capacity of a memoryless channel.

(a) Consider a memoryless channel where there are two input symbols, $A$ and $B$, and three output symbols, $A, B$, *. Suppose each input symbol is left intact with probability $1 / 2$, and transformed into a $*$ with probability $1 / 2$. Write down the channel matrix, and calculate the capacity.

(b) Now suppose the output is further processed by someone who cannot distinguish $A$ and $*$, so that the matrix becomes

$\left(\begin{array}{cc} 1 & 0 \\ 1 / 2 & 1 / 2 \end{array}\right)$

Calculate the new capacity.
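Not part of the question, but both capacities can be sanity-checked numerically by maximizing the mutual information over input distributions $(p, 1-p)$. Channel (a) is a binary erasure channel with erasure probability $1/2$, so one expects capacity $1/2$; channel (b) is a Z-type channel. A Python sketch (the function names and the grid-search approach are my own choices):

```python
import math

def mutual_information(p_in, channel):
    """I(X;Y) in bits for input distribution p_in and row-stochastic channel matrix."""
    p_out = [sum(p_in[i] * channel[i][j] for i in range(len(p_in)))
             for j in range(len(channel[0]))]
    h_out = -sum(q * math.log2(q) for q in p_out if q > 0)
    h_out_given_in = -sum(p_in[i] * channel[i][j] * math.log2(channel[i][j])
                          for i in range(len(p_in))
                          for j in range(len(channel[0])) if channel[i][j] > 0)
    return h_out - h_out_given_in

def capacity(channel, steps=20000):
    """Grid-search over input distributions (p, 1-p) for a two-input channel."""
    return max(mutual_information([k / steps, 1 - k / steps], channel)
               for k in range(steps + 1))

erasure = [[1/2, 0, 1/2], [0, 1/2, 1/2]]   # part (a): rows A, B; columns A, B, *
z_chan  = [[1, 0], [1/2, 1/2]]             # part (b): A and * merged

print(capacity(erasure))  # ≈ 0.5
print(capacity(z_chan))   # ≈ log2(5/4) ≈ 0.3219
```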

• B2.14

For integer-valued random variables $X$ and $Y$, define the relative entropy $h_{Y}(X)$ of $X$ relative to $Y$.

Prove that $h_{Y}(X) \geqslant 0$, with equality if and only if $\mathbb{P}(X=x)=\mathbb{P}(Y=x)$ for all $x$.

By considering a geometric random variable $Y$ with appropriately chosen parameter, show that if the mean $\mathbb{E} X=\mu<\infty$, then

$h(X) \leqslant(\mu+1) \log (\mu+1)-\mu \log \mu,$

with equality if $X$ is geometric.
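An illustrative numerical check of the bound (not the required proof; the helper names are mine): a geometric distribution on $\{0,1,2,\ldots\}$ with mean $\mu$ has parameter $\theta = \mu/(\mu+1)$ and should meet the bound with equality, while any other distribution with the same mean should fall strictly below it.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def bound(mu):
    # (mu+1) log(mu+1) - mu log(mu), logs to base 2
    return (mu + 1) * math.log2(mu + 1) - mu * math.log2(mu)

def geometric(mu, n_terms=2000):
    # P(X=k) = (1-theta) theta^k on {0,1,...}; mean mu gives theta = mu/(mu+1)
    theta = mu / (mu + 1)
    return [(1 - theta) * theta**k for k in range(n_terms)]

mu = 3.0
print(entropy(geometric(mu)), bound(mu))   # equal, up to truncation of the tail

# A different distribution with the same mean (Poisson) has strictly smaller entropy:
poisson = [math.exp(-mu) * mu**k / math.factorial(k) for k in range(120)]
print(entropy(poisson) < bound(mu))        # True
```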

• B4.13

Define a cyclic code of length $N$.

Show how codewords can be identified with polynomials in such a way that cyclic codes correspond to ideals in the polynomial ring with a suitably chosen multiplication rule.

Prove that any cyclic code $\mathcal{X}$ has a unique generator, i.e. a polynomial $c(X)$ of minimum degree, such that the code consists of the multiples of this polynomial. Prove that the rank of the code equals $N-\operatorname{deg} c(X)$, and show that $c(X)$ divides $X^{N}+1$.

Let $\mathcal{X}$ be a cyclic code. Set

$\mathcal{X}^{\perp}=\left\{y=y_{1} \ldots y_{N}: \sum_{i=1}^{N} x_{i} y_{i}=0 \text { for all } x=x_{1} \ldots x_{N} \in \mathcal{X}\right\}$

(the dual code). Prove that $\mathcal{X}^{\perp}$ is cyclic and establish how the generators of $\mathcal{X}$ and $\mathcal{X}^{\perp}$ are related to each other.

Show that the repetition and parity codes are cyclic, and determine their generators.
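The last part can be checked by brute force for a small length, here $N = 4$ (a sketch; the function names and the convention of reading $x_1 x_2 \ldots x_N$ as $x_1 + x_2 X + \cdots + x_N X^{N-1}$ are my own choices):

```python
from itertools import product

def cyclic_shift(word):
    """One cyclic shift of a codeword given as a tuple of bits."""
    return word[-1:] + word[:-1]

def is_cyclic(code):
    return all(cyclic_shift(w) in code for w in code)

def generator_word(code):
    """Non-zero codeword of minimum degree, i.e. the generator polynomial."""
    degree = lambda w: max(i for i, b in enumerate(w) if b)
    return min((w for w in code if any(w)), key=degree)

N = 4
repetition = {(0,) * N, (1,) * N}
parity = {w for w in product((0, 1), repeat=N) if sum(w) % 2 == 0}

print(is_cyclic(repetition), generator_word(repetition))  # True (1, 1, 1, 1): c(X) = 1+X+X^2+X^3
print(is_cyclic(parity), generator_word(parity))          # True (1, 1, 0, 0): c(X) = 1+X
```

Both generators divide $X^4 + 1 = (1+X)(1+X+X^2+X^3)$ over $\mathbb{F}_2$, consistent with the statement to be proved.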

• B1.14

A binary Huffman code is used for encoding symbols $1, \ldots, m$ occurring with probabilities $p_{1} \geqslant \ldots \geqslant p_{m}>0$, where $\sum_{1 \leqslant j \leqslant m} p_{j}=1$. Let $s_{1}$ be the length of a shortest codeword and $s_{m}$ the length of a longest codeword. Determine the maximal and minimal values of $s_{1}$ and $s_{m}$, and find binary trees for which they are attained.
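The extremes can be explored experimentally with a small Huffman implementation (an illustrative sketch, not the required argument; the tie-breaking scheme is my own):

```python
import heapq

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    heap = [(p, i, [i]) for i, p in enumerate(probs)]  # (weight, tiebreak, symbol list)
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, t, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # merging deepens every symbol in both subtrees
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, t, s1 + s2))
    return lengths

# s_m attains its maximum m-1 for steeply decreasing (dyadic) probabilities:
print(huffman_lengths([1/2, 1/4, 1/8, 1/8]))  # [1, 2, 3, 3]
# s_1 is largest for the uniform distribution, where the tree is balanced:
print(huffman_lengths([1/4] * 4))             # [2, 2, 2, 2]
```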

• B2.14

Let $\mathcal{X}$ be a binary linear code of length $n$, rank $k$ and distance $d$. Let $x=\left(x_{1}, \ldots, x_{n}\right) \in \mathcal{X}$ be a codeword with exactly $d$ non-zero digits.

(a) Prove that $n \geqslant d+k-1$ (the Singleton bound).

(b) Prove that truncating $\mathcal{X}$ on the non-zero digits of $x$ produces a code $\mathcal{X}^{\prime}$ of length $n-d$, rank $k-1$ and distance $d^{\prime}$ for some $d^{\prime} \geqslant\left\lceil\frac{d}{2}\right\rceil$. Here $\lceil a\rceil$ is the smallest integer satisfying $a \leqslant\lceil a\rceil$, i.e. $a \leqslant\lceil a\rceil<a+1$.

[Hint: Assume the opposite. Then, given $y \in \mathcal{X}$ and its truncation $y^{\prime} \in \mathcal{X}^{\prime}$, consider the coordinates where $x$ and $y$ have 1 in common (i.e. $x_{j}=y_{j}=1$ ) and where they differ $\left(\right.$ e.g. $x_{j}=1$ and $\left.y_{j}=0\right)$.]

(c) Deduce that $n \geqslant d+\sum_{1 \leqslant \ell \leqslant k-1}\left\lceil\frac{d}{2^{\ell}}\right\rceil$ (an improved Singleton bound).
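As an illustrative check (not part of the question), the improved bound can be evaluated for the binary Hamming $[7,4]$ code, where it turns out to be tight; the generator matrix below is one standard choice:

```python
from itertools import product

# Generator matrix of the binary Hamming [7,4] code (one standard choice).
G = [(1,0,0,0,0,1,1),
     (0,1,0,0,1,0,1),
     (0,0,1,0,1,1,0),
     (0,0,0,1,1,1,1)]

# All 16 codewords: every GF(2) linear combination of the rows.
code = {tuple(sum(b * g for b, g in zip(bits, col)) % 2 for col in zip(*G))
        for bits in product((0, 1), repeat=4)}

n, k = 7, 4
d = min(sum(w) for w in code if any(w))            # minimum distance = minimum weight
improved = d + sum(-(-d // 2**l) for l in range(1, k))  # -(-a // b) is ceil(a / b)
print(n, k, d, improved)  # 7 4 3 7: the improved bound n >= 7 is tight here
```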

• B4.13

State and prove the Fano and generalized Fano inequalities.
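Before proving it, one standard form of the Fano inequality, $h(X \mid Y) \leqslant h\left(\pi_{\mathrm{er}}\right)+\pi_{\mathrm{er}} \log (m-1)$ for an estimator $f(y)$ of $X$ with error probability $\pi_{\mathrm{er}}$, can be spot-checked numerically (a sketch; the random joint distributions and the choice of the maximum a posteriori estimator are my own illustration):

```python
import math, random

def fano_check(m, n, trials=200):
    """Check H(X|Y) <= h(P_e) + P_e*log2(m-1) on random joint distributions,
    using the estimator f(y) = argmax_x P(X=x|Y=y)."""
    rng = random.Random(0)
    for _ in range(trials):
        joint = [[rng.random() for _ in range(n)] for _ in range(m)]
        total = sum(map(sum, joint))
        joint = [[p / total for p in row] for row in joint]
        h_cond = 0.0   # H(X|Y)
        p_err = 0.0    # P(X != f(Y))
        for y in range(n):
            py = sum(joint[x][y] for x in range(m))
            post = [joint[x][y] / py for x in range(m)]   # p(. | y)
            h_cond += py * -sum(p * math.log2(p) for p in post if p > 0)
            p_err += py * (1 - max(post))
        hb = 0.0 if p_err in (0.0, 1.0) else (-p_err * math.log2(p_err)
                                              - (1 - p_err) * math.log2(1 - p_err))
        assert h_cond <= hb + p_err * math.log2(m - 1) + 1e-9
    return True

print(fano_check(4, 3))  # True
```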

• B1.14

(a) Define the entropy $h(X)$ and the mutual entropy $i(X, Y)$ of random variables $X$ and $Y$. Prove the inequality

$0 \leqslant i(X, Y) \leqslant \min \{h(X), h(Y)\}$

[You may assume the Gibbs inequality.]

(b) Let $X$ be a random variable and let $\mathbf{Y}=\left(Y_{1}, \ldots, Y_{n}\right)$ be a random vector.

(i) Prove or disprove by producing a counterexample the inequality

$i(X, \mathbf{Y}) \leqslant \sum_{j=1}^{n} i\left(X, Y_{j}\right)$

first under the assumption that $Y_{1}, \ldots, Y_{n}$ are independent random variables, and then under the assumption that $Y_{1}, \ldots, Y_{n}$ are conditionally independent given $X$.

(ii) Prove or disprove by producing a counterexample the inequality

$i(X, \mathbf{Y}) \geqslant \sum_{j=1}^{n} i\left(X, Y_{j}\right)$

first under the assumption that $Y_{1}, \ldots, Y_{n}$ are independent random variables, and then under the assumption that $Y_{1}, \ldots, Y_{n}$ are conditionally independent given $X$.
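For part (i) under independence, the standard counterexample candidate is $X = Y_1 \oplus Y_2$ with $Y_1, Y_2$ independent fair bits: $X$ is determined by the pair yet independent of each $Y_j$ separately. A numerical check (an illustrative sketch; the joint-pmf representation is mine):

```python
import math
from itertools import product

def mi(joint):
    """I(U;V) in bits from a joint pmf given as a dict {(u, v): p}."""
    pu, pv = {}, {}
    for (u, v), p in joint.items():
        pu[u] = pu.get(u, 0) + p
        pv[v] = pv.get(v, 0) + p
    return sum(p * math.log2(p / (pu[u] * pv[v]))
               for (u, v), p in joint.items() if p > 0)

# Y1, Y2 independent fair bits, X = Y1 XOR Y2.
joint_xy  = {}   # joint pmf of (X, (Y1, Y2))
joint_xy1 = {}   # joint pmf of (X, Y1)
for y1, y2 in product((0, 1), repeat=2):
    x = y1 ^ y2
    joint_xy[(x, (y1, y2))] = joint_xy.get((x, (y1, y2)), 0) + 1/4
    joint_xy1[(x, y1)] = joint_xy1.get((x, y1), 0) + 1/4

print(mi(joint_xy))   # 1.0: X is determined by (Y1, Y2)
print(mi(joint_xy1))  # 0.0: X is independent of Y1 alone
```

So $i(X, \mathbf{Y}) = 1 > 0 = i(X, Y_1) + i(X, Y_2)$ here, which settles the "independent" case of (i) in the negative.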

• B2.14

Define the binary Hamming code of length $n=2^{\ell}-1$ and its dual. Prove that the Hamming code is perfect. Prove that in the dual code:

(i) The weight of any non-zero codeword equals $2^{\ell-1}$;

(ii) The distance between any pair of words equals $2^{\ell-1}$.

[You may quote results from the course provided that they are carefully stated.]
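Property (i) can be verified by brute force for small $\ell$; the dual (simplex) code is presented here with the non-zero binary $\ell$-vectors as columns of its generator matrix, which is one standard construction (an illustrative sketch, not the required proof):

```python
from itertools import product

l = 3
n = 2**l - 1
# Columns of the generator matrix: the binary expansions of 1, ..., n.
cols = [tuple((x >> i) & 1 for i in range(l)) for x in range(1, n + 1)]
# Dual codeword for message a: (<a, x> mod 2 : x = 1, ..., n).
dual = {tuple(sum(a * c for a, c in zip(msg, col)) % 2 for col in cols)
        for msg in product((0, 1), repeat=l)}

weights = {sum(w) for w in dual if any(w)}
print(weights)  # {4}: every non-zero codeword has weight 2^(l-1)
```

Since the code is linear, pairwise distances are weights of differences, so (ii) follows from (i) in this check as well.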

• B4.13

Define the Huffman binary encoding procedure and prove its optimality among decipherable codes.

• B1.14

Let $p_{1}, \ldots, p_{n}$ be a probability distribution, with $p^{*}=\max _{i} p_{i}$. Prove that

$\begin{aligned} \text{(i)}\quad &-\sum_{i} p_{i} \log p_{i} \geqslant-p^{*} \log p^{*}-\left(1-p^{*}\right) \log \left(1-p^{*}\right); \\ \text{(ii)}\quad &-\sum_{i} p_{i} \log p_{i} \geqslant \log \left(1 / p^{*}\right); \text{ and} \\ \text{(iii)}\quad &-\sum_{i} p_{i} \log p_{i} \geqslant 2\left(1-p^{*}\right). \end{aligned}$

All logarithms are to base 2.

[Hint: To prove (iii), it is convenient to use (i) for $p^{*} \geqslant \frac{1}{2}$ and (ii) for $p^{*} \leqslant \frac{1}{2}$.]
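Before proving them, the three bounds can be spot-checked on random distributions (an illustrative Python sketch; the sampling scheme is my own):

```python
import math, random

def check(probs):
    """Assert inequalities (i)-(iii) for one probability distribution."""
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    ps = max(probs)
    hb = 0.0 if ps in (0.0, 1.0) else (-ps * math.log2(ps)
                                       - (1 - ps) * math.log2(1 - ps))
    assert h >= hb - 1e-9                    # (i)
    assert h >= math.log2(1 / ps) - 1e-9     # (ii)
    assert h >= 2 * (1 - ps) - 1e-9          # (iii)

rng = random.Random(1)
for _ in range(1000):
    raw = [rng.random() for _ in range(rng.randint(2, 8))]
    s = sum(raw)
    check([p / s for p in raw])
print("all three inequalities hold on 1000 random distributions")
```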

Random variables $X$ and $Y$ with values $x$ and $y$ from finite 'alphabets' $I$ and $J$ represent the input and output of a transmission channel, with the conditional probability $p(x \mid y)=\mathbb{P}(X=x \mid Y=y)$. Let $h(p(\cdot \mid y))$ denote the entropy of the conditional distribution $p(\cdot \mid y), y \in J$, and $h(X \mid Y)$ denote the conditional entropy of $X$ given $Y$. Define the ideal observer decoding rule as a map $f: J \rightarrow I$ such that $p(f(y) \mid y)=\max _{x \in I} p(x \mid y)$ for all $y \in J$. Show that under this rule the error probability

$\pi_{\mathrm{er}}(y)=\sum_{\substack{x \in I \\ x \neq f(y)}} p(x \mid y)$

satisfies $\pi_{\mathrm{er}}(y) \leqslant \frac{1}{2} h(p(\cdot \mid y))$, and the expected value satisfies

$\mathbb{E} \pi_{\mathrm{er}}(Y) \leqslant \frac{1}{2} h(X \mid Y)$
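Again not part of the question: both the pointwise and the averaged bound can be verified numerically for random joint distributions of $(X, Y)$ (a sketch; the random setup and function name are mine):

```python
import math, random

def verify(trials, seed=2):
    """Check pi_er(y) <= h(p(.|y))/2 pointwise and E pi_er(Y) <= h(X|Y)/2."""
    rng = random.Random(seed)
    for _ in range(trials):
        m, n = rng.randint(2, 6), rng.randint(2, 6)
        joint = [[rng.random() for _ in range(n)] for _ in range(m)]
        s = sum(map(sum, joint))
        joint = [[q / s for q in row] for row in joint]
        exp_err = h_cond = 0.0
        for y in range(n):
            py = sum(joint[x][y] for x in range(m))
            post = [joint[x][y] / py for x in range(m)]   # p(. | y)
            err = 1 - max(post)                # pi_er(y) under the ideal observer
            h_y = -sum(q * math.log2(q) for q in post if q > 0)
            assert err <= h_y / 2 + 1e-9       # pointwise bound
            exp_err += py * err
            h_cond += py * h_y
        assert exp_err <= h_cond / 2 + 1e-9    # averaged bound
    return True

print(verify(500))  # True
```

Note that the pointwise bound is exactly inequality (iii) applied to the posterior distribution $p(\cdot \mid y)$.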

• B2.14

A subset $\mathcal{C}$ of the Hamming space $\{0,1\}^{n}$ of cardinality $|\mathcal{C}|=r$ and with the minimal (Hamming) distance $\min \left[d\left(x, x^{\prime}\right): x, x^{\prime} \in \mathcal{C}, x \neq x^{\prime}\right]=\delta$ is called an $[n, r, \delta]$-code (not necessarily linear). An $[n, r, \delta]$-code is called maximal if it is not contained in any $[n, r+1, \delta]$-code. Prove that an $[n, r, \delta]$-code is maximal if and only if for any $y \in\{0,1\}^{n}$ there exists $x \in \mathcal{C}$ such that $d(x, y)<\delta$. Conclude that if there are $\delta$ or more changes made in a codeword then the new word is closer to some other codeword than to the original one.

Suppose that a maximal $[n, r, \delta]$-code is used for transmitting information via a binary memoryless channel with the error probability $p$, and the receiver uses the maximum likelihood decoder. Prove that the probability of erroneous decoding, $\pi_{\mathrm{err}}^{\mathrm{ml}}$, obeys the bounds

$1-b(n, \delta-1) \leqslant \pi_{\mathrm{err}}^{\mathrm{ml}} \leqslant 1-b(n,[(\delta-1) / 2])$

where

$b(n, m)=\sum_{0 \leqslant k \leqslant m}\left(\begin{array}{l} n \\ k \end{array}\right) p^{k}(1-p)^{n-k}$

is a partial binomial sum and $[\cdot]$ is the integer part.
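The two bounds are easy to evaluate numerically; the lower bound never exceeds the upper one, since $b(n, m)$ is increasing in $m$. An illustrative computation (a sketch with parameters chosen by me, not from the question):

```python
from math import comb

def b(n, m, p):
    """Partial binomial sum: probability of at most m channel errors in n uses."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m + 1))

n, delta, p = 7, 3, 0.05
lower = 1 - b(n, delta - 1, p)
upper = 1 - b(n, (delta - 1) // 2, p)   # (delta-1)//2 is the integer part
print(lower, upper)  # lower <= upper, bracketing pi_err^ml for a maximal code
```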

• B4.13

State the Kraft inequality. Prove that it gives a necessary and sufficient condition for the existence of a prefix-free code with given codeword lengths.
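The sufficiency direction is constructive, and the standard greedy (canonical) construction is easy to demonstrate (an illustrative sketch; the function names are mine): assign codewords in order of increasing length, each time taking the next binary value and padding with zeros on the right.

```python
def kraft_sum(lengths):
    """Kraft sum: sum of 2^(-l) over the codeword lengths."""
    return sum(2.0**-l for l in lengths)

def prefix_code(lengths):
    """Build a binary prefix-free code with the given lengths, if Kraft holds.
    Codewords are assigned greedily in order of increasing length."""
    assert kraft_sum(lengths) <= 1, "Kraft inequality fails: no such code"
    code, next_val, prev_len = [], 0, 0
    for l in sorted(lengths):
        next_val <<= (l - prev_len)        # extend the current value to length l
        code.append(format(next_val, f'0{l}b'))
        next_val += 1
        prev_len = l
    return code

print(prefix_code([1, 2, 3, 3]))  # ['0', '10', '110', '111']
```

Necessity goes the other way: any prefix-free code has Kraft sum at most 1, which the construction above never violates.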
