School of Mathematics and Statistics, University of Sydney

# MATH3010 Online Resources

Note that MATH3010 has now been superseded by MATH3067.

## Information Theory 2004

This is the home page for MATH3010 Information Theory for 2nd semester 2004. The page will be updated with tutorial solutions and so forth as the semester progresses. Any important announcements will also be posted here. Consequently you are urged to bookmark this page and consult it regularly.

The textbook for the course is a set of lecture notes by Dr Nigel O'Brian, which is available for purchase from KopyStop. It is essential for students to obtain a copy of the book.

There were some minor misprints and other errors in last year's version of the book. If anyone finds any in this year's version, please tell me.

For a list of reference books, see the MATH3010 section of the Senior Maths Handbook.

Please read and retain a copy of the information sheet for MATH3010.

## Consultations

The lecturer is A/Prof Bob Howlett, whose room is Carslaw 709. His consultation times are shown on his timetable.

## Handouts

(The tutorial sheets and solutions are no longer available.)

A sample exam has been released.

## Entropies

A probability distribution is essentially a collection of nonnegative numbers that add up to 1. The associated information entropy is obtained by calculating –p log2 p for each of these numbers p and summing over all p (excluding p = 0).
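As a concrete sketch, this calculation takes only a few lines of Python (the function name and the example distributions are illustrative, not part of the course notes):

```python
import math

def entropy(probs):
    """Shannon entropy of a probability distribution, in bits.

    Sums -p * log2(p) over all probabilities p, skipping p = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin carries one bit of information per toss:
print(entropy([0.5, 0.5]))       # 1.0
# A certain outcome carries no information:
print(entropy([1.0]))            # 0.0
# Four equally likely outcomes carry two bits:
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0
```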

Note that –log2 p can be regarded as a measure of the amount of information one receives when an event of probability p occurs. The entropy is the weighted average of this over all possible outcomes. That is, it is the expectation, or expected value, of the amount of information that will be obtained when the outcome is known.

So if X is a random variable, the entropy H(X) is to be thought of as the amount of information you expect to get from discovering the value of X.

If X, Y are two random variables, their joint entropy H(X,Y) is the entropy of the joint distribution. That is, you sum –p(x,y) log2 p(x,y) over all x and y. This is the total amount of information you expect to get from knowing both X and Y.
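In code, the joint entropy is the same entropy calculation applied to the joint probabilities (the dictionary representation and the two-coin example below are illustrative):

```python
import math

def joint_entropy(p_xy):
    """Joint entropy H(X,Y): sums -p(x,y) * log2 p(x,y) over all pairs.

    p_xy is a dict mapping (x, y) pairs to their probabilities."""
    return -sum(p * math.log2(p) for p in p_xy.values() if p > 0)

# Two independent fair coins: each pair of outcomes has probability 1/4,
# and knowing both coins yields H(X,Y) = 2 bits.
p = {(x, y): 0.25 for x in "HT" for y in "HT"}
print(joint_entropy(p))  # 2.0
```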

The conditional entropy of X given Y is the amount of additional information you expect to get from learning the value of X given that you already know Y. It is denoted by H(X|Y), and equals H(X,Y) – H(Y).

For each particular value y of Y there is an associated probability distribution for X, made up of the probabilities of the various values of X given that Y = y. The entropy of this distribution is the conditional entropy of X given that Y = y. It is denoted by H(X|Y = y), and is found by summing –p(x|y) log2 p(x|y) over all values of x, with y fixed throughout, where p(x|y) equals p(x,y)/p(y). The conditional entropy of X given Y (see above) equals the weighted average, over all possible values of y, of the conditional entropy of X given that Y = y, with the weights being the probabilities p(y).
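The two descriptions of H(X|Y) can be checked against each other numerically. A sketch in Python, using a small hypothetical joint distribution chosen only for the example:

```python
import math

def H(probs):
    """Shannon entropy in bits: -sum of p * log2(p), skipping p = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A hypothetical joint distribution p(x, y); pairs not listed have probability 0.
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}

# Marginal distribution of Y: p(y) is the sum over x of p(x, y).
p_y = {}
for (x, y), p in p_xy.items():
    p_y[y] = p_y.get(y, 0.0) + p

# H(X|Y) via the formula in the text: H(X,Y) - H(Y).
h_chain = H(p_xy.values()) - H(p_y.values())

# H(X|Y) as the weighted average of H(X | Y = y), using p(x|y) = p(x,y)/p(y).
h_avg = sum(
    p_y[y] * H([p / p_y[y] for (x, y2), p in p_xy.items() if y2 == y])
    for y in p_y
)

print(h_chain, h_avg)  # both equal 0.5 here
```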

The mutual information of X and Y is the amount of information about X that you expect to get by learning the value of Y. It also equals the amount of information about Y that you expect to get by learning the value of X. This is clearly at most the total amount of information you expect to get by learning the value of Y, and at most the total amount you expect to get by learning the value of X. Indeed, it is given by I(X;Y) = H(X) – H(X|Y). Do not confuse the conditional entropy H(X|Y) (the additional information expected to be gained from learning X when you already know Y) with the mutual information (the expected information about X gained by learning the value of Y).
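Substituting H(X|Y) = H(X,Y) – H(Y) into I(X;Y) = H(X) – H(X|Y) gives I(X;Y) = H(X) + H(Y) – H(X,Y), a form that is manifestly symmetric in X and Y. A numerical sketch (the joint distribution below is hypothetical, chosen only for the example):

```python
import math

def H(probs):
    """Shannon entropy in bits: -sum of p * log2(p), skipping p = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A hypothetical joint distribution p(x, y); pairs not listed have probability 0.
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}

# Marginal distributions of X and Y.
p_x, p_y = {}, {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

h_x, h_y, h_xy = H(p_x.values()), H(p_y.values()), H(p_xy.values())

# I(X;Y) = H(X) - H(X|Y) = H(X) + H(Y) - H(X,Y): symmetric in X and Y.
mi = h_x + h_y - h_xy

# The mutual information is nonnegative and at most min(H(X), H(Y)).
print(mi, h_x, h_y)
```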