STA 414/2104: Statistical Methods for Machine Learning and Data Mining (Jan-Apr 2013)

All material handed in is now available for pickup. I'll be in my office 2:00-2:30 on Wednesday, April 24. (Additional times will be announced here.)

Links to the papers grad students presented are here.

Instructor:

Radford Neal, Office: SS6026A, Phone: (416) 978-4970, Email: radford@stat.utoronto.ca

Office Hours: Fridays 1:30-2:30pm, in SS6026A.

Lectures:

Tuesdays 12:10-2:00pm and Thursdays 12:10-1:00pm, in MS 3171. The first lecture is January 8. The last lecture is April 4. There are no lectures February 19 and 21 (Reading Week). Graduate students in STA 2104 will make presentations on Tuesday April 9 (12:10-2:00pm) and Thursday April 11 (12:10-1:00pm).

Evaluation:

For undergraduates in STA 414:
50% Four assignments, worth 10%, 10%, 15%, and 15%.
50% Three 50-minute tests, worth 16%, 17%, and 17%, held in lecture time on February 7, March 14, and April 4.
For graduate students in STA 2104:
46% Four assignments, worth 10%, 10%, 13%, and 13%.
44% Three 50-minute tests, worth 14%, 15%, and 15%, held in lecture time on February 7, March 14, and April 4.
10% A 12-minute individual presentation on a conference paper that you have read.

The assignments are to be done by each student individually. Any discussion of the assignments with other students should be about general issues only, and should not involve giving or receiving written, typed, or emailed notes.

Graduate students may discuss the conference paper that they will present with anyone, in order to help understand it, but they must prepare their presentation themselves. (They may if they wish solicit feedback from others after a practice run of their presentation.)

Textbook:

The book Machine Learning: A Probabilistic Perspective, by Kevin P. Murphy, is strongly recommended (but not required).

I will post lecture slides and links to online references.

Computing:

Assignments will be done in R. Statistics graduate students will use the Statistics research computing system. Undergraduates and graduate students from other departments will use CQUEST. If you're an undergraduate in this course, you can request a CQUEST account; if you're a graduate student, you need to fill out a form.

You can also use R on your home computer by downloading it for free from www.R-project.org. From that site, here is the Introduction to R.
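Once R is installed, a short session like the following can confirm that everything works. (This is a made-up illustration of basic R usage, not part of any assignment; the data is generated on the spot.)

```r
# Generate some noisy data from a known linear relationship,
# then fit it by least squares with lm().
set.seed(1)
x <- seq(0, 1, length.out = 20)
y <- 2 + 3 * x + rnorm(20, sd = 0.1)   # true intercept 2, true slope 3

fit <- lm(y ~ x)    # least-squares linear fit
coef(fit)           # estimated intercept and slope (close to 2 and 3)
```

If this runs and prints two coefficients near the true values, your R installation is working.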

Some useful online references:

Information Theory, Inference, and Learning Algorithms, by David MacKay.

David MacKay's thesis.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd edition), by Trevor Hastie, Robert Tibshirani, and Jerome Friedman.

Gaussian Processes for Machine Learning, by Carl Edward Rasmussen and Christopher K. I. Williams.

Proceedings of the International Conference on Machine Learning (ICML).

Proceedings of the annual conference on Neural Information Processing Systems (NIPS).

Links to papers presented by grad students.

Lecture slides:

Note that slides may be updated as mistakes are corrected, or as the amount of material covered in the week becomes apparent.

Week 1 (Introduction: Murphy's Ch. 1)
Week 2 (Linear basis functions, penalties, cross-validation)
Week 3 (Introduction to Bayesian methods)
Week 4 (Conjugate priors: Murphy's Ch. 3, Bayesian linear basis function models: Murphy's Ch. 7)
Week 6 (Gaussian process models: Murphy's Ch. 15)
Week 7 (Clustering, mixture models: Murphy's Ch. 11)
Week 8 (More on Gaussian mixtures, EM: Murphy's Ch. 11)
Week 9 (Neural networks: Murphy's Section 16.5)
Week 10 (Dimensionality reduction, factor analysis: Murphy's Chapter 12, Section 28.3.2)
Week 11 (Classification, generative and discriminative models: Murphy's Section 3.5, Chapter 8)
Week 12 (Support Vector Machines, Kernel PCA: Murphy's Chapter 14)

Practice problem sets:

Practice problem set #1 (now with some slight errors corrected), and the answers.

Practice problem set #2, and the answers.

Practice problem set #3, and the answers.

Assignments:

Assignment 1: handout.
Data set 1: training inputs, training responses, test inputs, test responses.
Data set 2: training inputs, training responses, test inputs, test responses.
Solution: R functions,
script for dataset 1, cross-val, and its output,
script for dataset 1, marg-like, and its output,
script for dataset 2, cross-val, and its output,
script for dataset 2, marg-like, and its output,
discussion.

Assignment 2: handout, data, R functions to use, Solution: R script, output, plots, discussion.

Assignment 3: handout, Datasets: dataset1, dataset2, dataset3.
Solution: R functions, script1, script2, script3, plots1, plots2, plots3, discussion.

Assignment 4: handout, dataset1, dataset2,
Zip code images, Zip code labels, R function to display a digit image.
The original MNIST data is here (just in case you're interested; you don't need it).
Solution: R functions, script1, script2, script3, plots1, plots2, plots3, discussion.

Example R programs:

Week 2 example (linear basis function models): script, functions.
Week 3 example (simple Monte Carlo for Bayesian inference): script, functions.
Week 4 lecture example (Bayesian linear basis function models): script, functions.
Week 6 lecture example (Gaussian process regression): script, functions.
Week 8 lecture example (Gaussian mixture model with EM): script, functions.
Week 10 lecture example (neural network regression model): script, functions, plots.

Web pages for past related courses:

STA 414/2104 (Spring 2012)
STA 414/2104 (Spring 2011)
STA 414/2104 (Spring 2007)
STA 414/2104 (Spring 2006)
CSC 411 (Fall 2006)
STA 410/2102 (Spring 2004) - has many examples of R programs