Information Theory
B1.14
State the formula for the capacity of a memoryless channel.
(a) Consider a memoryless channel where there are two input symbols, and , and three output symbols, , *. Suppose each input symbol is left intact with probability , and transformed into a with probability . Write down the channel matrix, and calculate the capacity.
(b) Now suppose the output is further processed by someone who cannot distinguish and , so that the matrix becomes
Calculate the new capacity.
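As an illustration of the capacity calculation in (a), the following Python sketch maximises the mutual information over input distributions by a grid search. It assumes a binary erasure channel with inputs 0 and 1, outputs 0, 1, * and an erasure probability p = 0.2 chosen purely for the example; part (b) can be handled the same way after merging the appropriate output columns of the matrix.

```python
import numpy as np

def mutual_information(p_x, channel):
    """I(X;Y) in bits for input distribution p_x and channel matrix P(y|x)."""
    p_xy = p_x[:, None] * channel            # joint distribution P(x, y)
    p_y = p_xy.sum(axis=0)                   # output marginal
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] *
                        np.log2(p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask])))

def capacity(channel, grid=10001):
    """Capacity of a two-input memoryless channel by grid search over P(X=0)."""
    qs = np.linspace(0.0, 1.0, grid)
    return max(mutual_information(np.array([q, 1.0 - q]), channel) for q in qs)

p = 0.2                                      # illustrative erasure probability
erasure = np.array([[1 - p, 0.0, p],         # P(y | x=0) over outputs (0, 1, *)
                    [0.0, 1 - p, p]])        # P(y | x=1)
print(capacity(erasure), 1 - p)              # both close to 1 - p
```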
B2.14
For integer-valued random variables and , define the relative entropy of relative to .
Prove that , with equality if and only if for all .
By considering , a geometric random variable with parameter chosen appropriately, show that if the mean , then
with equality if is geometric.
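A numerical sketch of the maximum-entropy property asserted here: among distributions on the positive integers with a given mean, the geometric distribution has the largest entropy. The support {1, 2, ...}, the mean value 3 and the comparison distributions are assumptions chosen only for illustration.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

mu = 3.0                          # target mean, support assumed to be {1, 2, 3, ...}
N = 2000                          # truncation point for the numerics
k = np.arange(1, N + 1)

# Geometric distribution on {1, 2, ...} with mean mu: P(k) = (1/mu) (1 - 1/mu)^(k-1).
geom = (1 / mu) * (1 - 1 / mu) ** (k - 1)

# Two other distributions with the same mean.
a = np.zeros(N); a[[0, 4]] = 0.5      # mass 1/2 on 1 and 5
b = np.zeros(N); b[[1, 3]] = 0.5      # mass 1/2 on 2 and 4

for name, p in [("geometric", geom), ("1/2 on {1,5}", a), ("1/2 on {2,4}", b)]:
    print(name, round(float((p * k).sum()), 3), round(entropy(p), 3))
# The geometric distribution attains the largest entropy among these mean-3 examples.
```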
B4.13
Define a cyclic code of length .
Show how codewords can be identified with polynomials in such a way that cyclic codes correspond to ideals in the polynomial ring with a suitably chosen multiplication rule.
Prove that any cyclic code has a unique generator, i.e. a polynomial of minimum degree, such that the code consists of the multiples of this polynomial. Prove that the rank of the code equals , and show that divides .
Let be a cyclic code. Set
(the dual code). Prove that is cyclic and establish how the generators of and are related to each other.
Show that the repetition and parity codes are cyclic, and determine their generators.
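A short sketch of the polynomial picture of cyclic codes, with arithmetic over GF(2) modulo x^n - 1. The length n = 7 and the generator g(x) = 1 + x + x^3 (a divisor of x^7 + 1) are assumptions chosen for illustration; the code consists of all multiples of g, has rank n - deg g, and is closed under cyclic shifts.

```python
import numpy as np
from itertools import product

n = 7

def poly_to_word(coeffs, n=n):
    """Coefficients of c0 + c1 x + ... -> length-n word, reduced mod x^n - 1 over GF(2)."""
    w = np.zeros(n, dtype=int)
    for i, c in enumerate(coeffs):
        w[i % n] ^= c & 1
    return tuple(w)

def multiply(a, b):
    """Product of two GF(2) polynomials given as coefficient tuples."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] ^= ai & bj
    return out

g = (1, 1, 0, 1)                    # illustrative generator g(x) = 1 + x + x^3, divides x^7 + 1

# The code is the set of multiples m(x) g(x) with deg m < n - deg g, so its rank is n - deg g = 4.
code = {poly_to_word(multiply(m, g)) for m in product([0, 1], repeat=n - 3)}
print(len(code))                    # 2**4 = 16 codewords

# Cyclic property: multiplying by x mod x^n - 1 is a cyclic shift, and stays in the code.
print(all(tuple(np.roll(c, 1)) in code for c in code))

# The parity-check (even-weight) code is cyclic with generator 1 + x;
# the repetition code is cyclic with generator 1 + x + ... + x^(n-1).
```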
B1.14
A binary Huffman code is used for encoding symbols occurring with probabilities where . Let be the length of a shortest codeword and of a longest codeword. Determine the maximal and minimal values of and , and find binary trees for which they are attained.
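The two extremes can be seen numerically. The sketch below builds a binary Huffman code and reports the shortest and longest codeword lengths; the dyadic and near-uniform distributions are assumptions chosen to exhibit, respectively, a path-shaped tree (longest codeword of length n - 1) and a balanced tree.

```python
import heapq

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    heap = [(p, [i]) for i, p in enumerate(probs)]   # (weight, symbols in the subtree)
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, s1 = heapq.heappop(heap)
        p2, s2 = heapq.heappop(heap)
        for s in s1 + s2:                            # every merge adds one bit to these symbols
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, s1 + s2))
    return lengths

dyadic = [2**-1, 2**-2, 2**-3, 2**-4, 2**-4]   # path-shaped tree: longest codeword has length n - 1
uniform = [0.2] * 5                            # balanced tree: lengths as equal as possible

for probs in (dyadic, uniform):
    L = huffman_lengths(probs)
    print(min(L), max(L), L)
```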
B2.14
Let be a binary linear code of length , rank and distance . Let be a codeword with exactly non-zero digits.
(a) Prove that (the Singleton bound).
(b) Prove that truncating on the non-zero digits of produces a code of length , rank and distance for some . Here is the integer satisfying .
[Hint: Assume the opposite. Then, given and its truncation , consider the coordinates where and have 1 in common (i.e. ) and where they differ e.g. and .]
(c) Deduce that (an improved Singleton bound).
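A brute-force check of the Singleton bound in (a) for a small binary linear code. The generator matrix below (a systematic generator of the [7, 4] Hamming code) is an assumption chosen for illustration; the code's distance equals its minimum non-zero weight because the code is linear.

```python
import numpy as np
from itertools import product

G = np.array([[1, 0, 0, 0, 0, 1, 1],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1, 0],
              [0, 0, 0, 1, 1, 1, 1]])
n, k = G.shape[1], G.shape[0]

# All 2^k codewords; the distance of a linear code is its minimum non-zero weight.
codewords = [tuple(np.mod(np.array(m) @ G, 2)) for m in product([0, 1], repeat=k)]
d = min(sum(c) for c in codewords if any(c))

print(n, k, d)                  # 7 4 3
print(d <= n - k + 1)           # the Singleton bound d <= n - k + 1
```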
B4.13
State and prove the Fano and generalized Fano inequalities.
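A numerical check of Fano's inequality, H(X|Y) <= h(P_e) + P_e log(|X| - 1), for a randomly chosen joint distribution and the maximum a posteriori estimator of X from Y; the alphabet size and the random seed are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * np.log2(p) - (1 - p) * np.log2(1 - p)

m = 4                                            # alphabet sizes |X| = |Y| = 4
p_xy = rng.random((m, m)); p_xy /= p_xy.sum()    # random joint distribution P(x, y)

p_y = p_xy.sum(axis=0)
H_X_given_Y = -np.sum(p_xy * np.log2(p_xy / p_y))      # H(X|Y)

# Guess X by the maximum a posteriori symbol; P_e is the resulting error probability.
P_e = 1.0 - p_xy.max(axis=0).sum()

print(H_X_given_Y, h(P_e) + P_e * np.log2(m - 1))      # Fano: left-hand side <= right-hand side
```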
B1.14
(a) Define the entropy and the mutual entropy of random variables and . Prove the inequality
[You may assume the Gibbs inequality.]
(b) Let be a random variable and let be a random vector.
(i) Prove or disprove by producing a counterexample the inequality
first under the assumption that are independent random variables, and then under the assumption that are conditionally independent given .
(ii) Prove or disprove by producing a counterexample the inequality
first under the assumption that are independent random variables, and then under the assumption that are conditionally independent given .
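For reference, a short sketch computing the standard quantities involved here for a randomly chosen joint distribution: the marginal entropies, the joint entropy, and the mutual information, together with the basic consequences of the Gibbs inequality (non-negativity of the mutual information and its bound by either marginal entropy). The particular distribution is an assumption chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def H(p):
    """Shannon entropy in bits of a distribution given as an array."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

p_xy = rng.random((3, 4)); p_xy /= p_xy.sum()      # a random joint distribution of (X, Y)
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

I = H(p_x) + H(p_y) - H(p_xy.ravel())              # mutual information I(X;Y)
print(H(p_x), H(p_y), H(p_xy.ravel()), I)
print(I >= 0, I <= min(H(p_x), H(p_y)))            # consequences of the Gibbs inequality
```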
B2.14
Define the binary Hamming code of length and its dual. Prove that the Hamming code is perfect. Prove that in the dual code:
(i) The weight of any non-zero codeword equals ;
(ii) The distance between any pair of words equals .
[You may quote results from the course provided that they are carefully stated.]
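A small verification for length 7 (chosen as an illustrative case): the dual of the Hamming code is generated by the rows of the parity-check matrix, every non-zero dual codeword has the same weight, all pairwise distances in the dual are equal, and the Hamming balls of radius 1 around codewords exactly fill the space.

```python
import numpy as np
from itertools import product, combinations

# Parity-check matrix of the length-7 binary Hamming code: columns are 1, ..., 7 in binary.
H = np.array([[int(b) for b in format(j, '03b')] for j in range(1, 8)]).T   # 3 x 7

# The dual code is spanned by the rows of H.
dual = [tuple(np.mod(np.array(m) @ H, 2)) for m in product([0, 1], repeat=3)]

print({sum(c) for c in dual if any(c)})                                     # weights of non-zero words
print({sum(a != b for a, b in zip(u, v)) for u, v in combinations(dual, 2)})  # pairwise distances

# Perfectness: 2^4 codewords times (1 + 7) words per radius-1 Hamming ball = 2^7.
print(2**4 * (1 + 7) == 2**7)
```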
B4.13
Define the Huffman binary encoding procedure and prove its optimality among decipherable codes.
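The optimality can be checked by brute force for a small alphabet: by the Kraft-McMillan theorem, a decipherable code with lengths l_i exists if and only if sum 2^{-l_i} <= 1, so the optimum expected length is a minimisation over such length vectors. The four-symbol distribution below is an assumption chosen for the example.

```python
import heapq
from itertools import product

def huffman_lengths(probs):
    """Binary Huffman codeword lengths."""
    heap = [(p, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, s1 = heapq.heappop(heap)
        p2, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, s1 + s2))
    return lengths

probs = [0.4, 0.3, 0.2, 0.1]        # illustrative distribution

# Minimum expected length over all length vectors satisfying the Kraft inequality.
best = min(
    sum(p * l for p, l in zip(probs, ls))
    for ls in product(range(1, 8), repeat=len(probs))
    if sum(2.0 ** -l for l in ls) <= 1.0
)

huff = sum(p * l for p, l in zip(probs, huffman_lengths(probs)))
print(best, huff)                   # Huffman attains the optimum
```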
B1.14
Let be a probability distribution, with . Prove that
All logarithms are to base 2.
[Hint: To prove (iii), it is convenient to use (i) for and (ii) for .]
Random variables and with values and from finite 'alphabets' and represent the input and output of a transmission channel, with the conditional probability . Let denote the entropy of the conditional distribution , and denote the conditional entropy of given . Define the ideal observer decoding rule as a map such that for all . Show that under this rule the error probability
satisfies , and the expected value satisfies
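The ideal observer rule can be exhibited concretely: for each output symbol it selects an input symbol of maximal posterior probability, and no other decoding rule achieves a smaller error probability. The sketch below checks this by enumerating all decoding rules for a small randomly chosen channel and input distribution (an assumption made for illustration).

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)

nx, ny = 3, 3
p_x = rng.random(nx); p_x /= p_x.sum()                                    # input distribution
channel = rng.random((nx, ny)); channel /= channel.sum(axis=1, keepdims=True)  # P(y | x)
p_xy = p_x[:, None] * channel                                             # joint P(x, y)

# Ideal observer: decode each y to an x with the largest posterior P(x | y).
f_ideal = p_xy.argmax(axis=0)

def error_prob(f):
    """P(f(Y) != X) for a decoding rule given as an array f[y] = decoded x."""
    return 1.0 - sum(p_xy[f[y], y] for y in range(ny))

# Brute force over all decoding rules: the ideal observer minimises the error probability.
best = min(error_prob(np.array(f)) for f in product(range(nx), repeat=ny))
print(error_prob(f_ideal), best)
```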
B2.14
A subset of the Hamming space of cardinality and with the minimal (Hamming) distance is called an -code (not necessarily linear). An -code is called maximal if it is not contained in any -code. Prove that an -code is maximal if and only if for any there exists such that . Conclude that if there are or more changes made in a codeword then the new word is closer to some other codeword than to the original one.
Suppose that a maximal -code is used for transmitting information via a binary memoryless channel with the error probability , and the receiver uses the maximum likelihood decoder. Prove that the probability of erroneous decoding, , obeys the bounds
where
is a partial binomial sum and is the integer part.
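A simulation in the spirit of these bounds, using the length-7 Hamming code (perfect, hence maximal, with distance 3) over a binary symmetric channel with crossover probability p = 0.05 (values assumed for illustration). For a perfect code, maximum likelihood decoding fails exactly when more than one channel error occurs, so the Monte Carlo estimate matches the corresponding partial binomial sum.

```python
import numpy as np
from math import comb
from itertools import product

rng = np.random.default_rng(3)
p = 0.05                                          # BSC crossover probability

# The length-7 Hamming code: perfect, maximal, minimum distance d = 3.
G = np.array([[1,0,0,0,0,1,1],[0,1,0,0,1,0,1],[0,0,1,0,1,1,0],[0,0,0,1,1,1,1]])
code = np.array([np.mod(np.array(m) @ G, 2) for m in product([0, 1], repeat=4)])

def ml_decode(word):
    """Maximum-likelihood decoding over a BSC = nearest codeword in Hamming distance."""
    return code[np.argmin((code != word).sum(axis=1))]

# Monte Carlo estimate of the decoding error probability when the zero codeword is sent.
trials = 20000
errors = sum(ml_decode((rng.random(7) < p).astype(int)).any() for _ in range(trials))

# For a perfect code, decoding fails exactly when 2 or more channel errors occur.
tail = sum(comb(7, k) * p**k * (1 - p)**(7 - k) for k in range(2, 8))
print(errors / trials, tail)
```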
B4.13
State the Kraft inequality. Prove that it gives a necessary and sufficient condition for the existence of a prefix-free code with given codeword lengths.
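The sufficiency direction has a direct construction: sort the lengths in increasing order and take as codewords the binary expansions of the running Kraft sums. The sketch below carries this out for an illustrative set of lengths (an assumption) and checks the prefix-free property.

```python
def kraft_sum(lengths):
    return sum(2.0 ** -l for l in lengths)

def prefix_code(lengths):
    """Construct a prefix-free code with the given lengths (assumes the Kraft sum is <= 1)."""
    codewords, c = [], 0.0
    for l in sorted(lengths):
        # The codeword is the first l binary digits of the running sum c.
        codewords.append(format(int(c * 2**l), 'b').zfill(l))
        c += 2.0 ** -l
    return codewords

lengths = [1, 2, 3, 3]                     # illustrative codeword lengths
code = prefix_code(lengths)
print(kraft_sum(lengths) <= 1, code)
# Verify the prefix-free property directly.
print(all(not b.startswith(a) for a in code for b in code if a != b))
```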