CSC 310 - Information Theory (Sep-Dec 2011)

Information theory arises from two important problems: How to represent information compactly, and how to transmit information reliably in the presence of noise. The concept of 'entropy' as a measure of information content is central to both of these problems, and is the basis for an elegant theory centred on two famous theorems proved by Claude Shannon over 50 years ago. Only in recent years has the promise of these theorems been fully realized, however. In this course, the elements of information theory will be presented, along with some of the practical techniques needed to construct the data compression and error-correction methods that play a large role in modern communications and storage systems.
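As a preview (this definition is standard material, not quoted from the course itself): for a discrete source emitting symbol i with probability p_i, the entropy in bits is

    H = -\sum_i p_i \log_2 p_i

so a source that always emits the same symbol has entropy 0, while a fair coin flip has entropy 1 bit.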

ANNOUNCEMENTS: You can collect marked term work on January 25 from 1:10 to 2:00, in my Sidney Smith office, SS6026A.

Instructor: Radford Neal, Office: PT 290E / SS 6026A, Phone: (416) 978-4970, Email: radford@cs.utoronto.ca

Office hours: Tuesdays, 4:10 to 5:00, in PT 290E, starting September 20.

Prerequisites:

CSC 148/150/260 (programming), STA 247/255/257/107 (probability), MAT 135/137 (calculus), MAT 223/240 (linear algebra).

Lectures:

Mondays and Wednesdays, 11:10 to 12:00, in MP 118. Lectures run from September 12 to December 7, except for October 10 (Thanksgiving) and November 7 (Fall break). Note that there will be a class on the "Thanksgiving makeup day" of Wednesday, December 7.

Tutorials:

Fridays, 11:10 to 12:00, from September 23 to December 2, in MP 118. Note that there is no tutorial the first week of classes. You must be available at the tutorial times, since the mid-term tests will be held then.

Tutorials will be taught by the TAs, Lilin Zhang and Seyed Amir Hejazi.

Evaluation:

21% Three theory assignments (7% each)
20% Two practical (programming) assignments (10% each)
24% Two one-hour tests (12% each), Oct 14 and Nov 11
35% Final exam, scheduled by the Faculty

You must get at least 40% on the final exam to pass the course.

All assignments are to be done by each student individually.

Requests for assignments or tests to be remarked must be made in writing to the instructor (not the TAs) within one month of the work being returned.

Accommodating illness and other special circumstances:

If, due to illness or other serious personal difficulties, you are unable to complete an assignment on time, or unable to write one of the tests, you should contact the instructor as soon as possible. For tests and theory assignments, accommodation will be made by taking that portion of the mark from other work. For practical assignments, you may be allowed to hand in the assignment up to one week late; if that is not possible, that portion of the mark will be taken from other work.

Note that for the final exam, accommodation for illness or other difficulties is handled officially through your college/faculty, not by the instructor.

Computing:

Practical assignments will require programming, which may be done on the CDF system or on your own computer. The choice of programming language is flexible, though you may have to interface to routines supplied in C.

Course text:

There is no required course text. Lecture slides and other materials will be posted on this web page.

Here are two online books on information theory that may be useful to you:

David J. C. MacKay, Information Theory, Inference, and Learning Algorithms.

Robert M. Gray, Entropy and Information Theory. This link is provided for anyone who is interested, but note that Gray's book is written at a higher mathematical level than I will assume for this course.

I may sometimes refer to parts of these books, but the course will not generally be following them.

Tests and final exam:

Test 1 is October 14, 11:10-12:00, in MP 118, covering week 1 through week 4. Two partly relevant previous tests:
2002 test, 2002 answers
2004 test, 2004 answers
But note that both tests cover a bit more material than will be covered on this year's first test.

Test 2 marks will be adjusted by setting the new mark to be the old mark times 100/80 if the old mark was less than 60, and leaving the old mark unchanged if it was greater than 80. There were no marks between 60 and 80. The mark written on the paper is the unadjusted mark.
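As an illustration only (the function and names below are my own sketch, not course code; the same piecewise pattern, with different constants, is used for the theory assignment 2 adjustment under "Assignments" below), the Test 2 rule can be computed as:

    #include <stdio.h>

    /* Piecewise mark adjustment: marks below `lower` are scaled by
       `factor`; all other marks are left unchanged.  For Test 2,
       lower = 60 and factor = 100.0/80.0; every mark at or above 60
       was in fact above 80, so leaving it unchanged is correct. */
    double adjust_mark(double old_mark, double lower, double factor)
    {
        return old_mark < lower ? old_mark * factor : old_mark;
    }

    int main(void)
    {
        /* An unadjusted mark of 52 becomes 52 * 100/80 = 65;
           a mark of 85 is above 80 and stays as it is. */
        printf("%.1f\n", adjust_mark(52.0, 60.0, 100.0/80.0));  /* prints 65.0 */
        printf("%.1f\n", adjust_mark(85.0, 60.0, 100.0/80.0));  /* prints 85.0 */
        return 0;
    }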

The final exam mark was adjusted by multiplying it by 100/85.

Assignments:

Theory assignment 1: handout, solution.

Theory assignment 2: handout, solution. Marks on this assignment will be adjusted by setting the new mark to be the old mark times 100/95 if the old mark was less than 75, and leaving the old mark unchanged if it was greater than 80. There were no marks between 75 and 80. The mark written on the paper is the unadjusted mark.

Practical assignment 1: handout, C modules and test files as a tar file or a zip file. Programs for solving Parts A, B, C: tar file or a zip file. Marks will be adjusted by multiplying them by 100/90. The mark written on the paper is the unadjusted mark.

Theory assignment 3: handout, solution. Note that the marks for the three questions add up to 95, not 100. An adjusted mark was used, in which the marks obtained were multiplied by 100/85 (i.e., the mark was computed as if the marks for the three questions added to 85 rather than 95).
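As a worked example (my own, not from the marking records): a raw total of 76 out of 95 is adjusted to 76 × 100/85 ≈ 89.4, rather than the 76 × 100/95 = 80 that a straight rescaling to a total of 100 would give.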

Practical assignment 2: handout, test files as a tar file or a zip file. Solution: program, output, and discussion.

Tentative schedule:

  Week       Monday           Wednesday       Friday       Theory     Practical
            lectures          lectures       tutorials   assignments assignments

Sep 12:       Intro         Source coding     -----
    19:   Source coding     Source coding      tut
    26:   Source coding     Source coding      tut       ThA1 out W
Oct  3:   Source coding     Source coding      tut       ThA1 due F
    10:       -----         Source coding     Test 1
    17:  Source modeling   Source modeling     tut
    24:  Source modeling   Source modeling     tut       
    31:  Source modeling   Error correction    tut       ThA2 out M   PrA1 out F
Nov  7:       -----        Error correction   Test 2
    14:  Error correction  Error correction    tut       ThA2 due M   
    21:  Error correction  Error correction    tut       ThA3 out W   PrA1 due W
    28:  Error correction  Error correction    tut                    PrA2 out M
Dec  5:  Error correction  Lossy compression  -----      ThA3 due M   PrA2 due W

         Final exam scheduled by the Faculty

Lecture slides:

Week 1: Introduction: data compression, error correction
Week 2: Theory of data compression: what codes are possible?
Week 3: Theory of data compression: entropy, optimal source codes
Week 4: Theory of data compression: extensions, Shannon's noiseless coding theorem, arithmetic coding
Week 5: Practical implementation of arithmetic coding
Week 6: Source modeling: adaptive models, Bayesian inference
Week 7: Source modeling: variable-order Markov models, dictionary methods
Week 8: Error correction: channels, mutual information, capacity
Week 9: Error correction: more on mutual information and capacity
Week 10: Error correction: codes, decoding, can't transmit without error above capacity, linear codes
Week 11: Error correction: minimum distance, Hamming codes, sphere-packing bound, Gilbert-Varshamov bound, product codes
Week 12: Error correction: Shannon's noisy coding theorem
Week 13: Error correction: low-density parity-check codes; lossy data compression

Previous versions of this course:

Here are the web pages for versions of this course taught previously:

Fall 2009, by Periklis Papakonstantinou
Fall 2007, by Matei David
Fall 2006, by Sam Roweis
Fall 2005, by Sam Roweis
Spring 2004, by Radford Neal
Spring 2003, by Ken Sevcik
Spring 2002, by Radford Neal