STA 410/2102 - Statistical Computation (Jan-Apr 2003)

Test 3 and assignments 3 and 4 have now been marked and can be picked up from my office. (Phone first to see if I'm in.) You can check your term marks here. Let me know if you find any errors.

This course will look at how statistical computations are done, and how to write programs for statistical problems that aren't handled by standard packages. Students will program in the R language (a free and improved variant of S), which will be introduced at the start of the course. Topics will include the use of simulation to investigate the properties of statistical methods, matrix computations used to implement linear models, optimization methods used for maximum likelihood estimation, and numerical integration methods for Bayesian inference, including a brief introduction to Markov chain Monte Carlo methods.
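As a rough illustration of the first topic, using simulation to investigate the properties of a statistical method, here is a minimal R sketch that estimates how often a one-sample t test falsely rejects a true null hypothesis. It is not taken from the course materials: the function name `t_test_rejection_rate`, the sample size, the number of simulated data sets, and the two data distributions are all assumptions chosen for illustration.

```r
# Estimate the type I error rate of a one-sample t test by simulation.
# Assumed setup (not from the course): n = 10 observations per data set,
# true mean 0, nominal level 0.05, 2000 simulated data sets.
set.seed(1)

t_test_rejection_rate <- function(rgen, n = 10, n_sims = 2000, level = 0.05) {
  rejected <- logical(n_sims)
  for (i in seq_len(n_sims)) {
    x <- rgen(n)                                   # data whose true mean is 0
    rejected[i] <- t.test(x, mu = 0)$p.value < level
  }
  mean(rejected)                                   # fraction of false rejections
}

# Normal data: the assumptions of the t test hold exactly.
rate_normal <- t_test_rejection_rate(function(n) rnorm(n))

# Shifted exponential data: mean 0, but strongly skewed.
rate_skewed <- t_test_rejection_rate(function(n) rexp(n) - 1)

print(c(normal = rate_normal, skewed = rate_skewed))
```

With normal data the estimated rate should be close to the nominal 0.05, up to simulation error. Comparing it with the rate for skewed data gives a sense of how sensitive the test is to non-normality at this sample size.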

Instructor: Radford Neal, Office: SS6016A, Phone: (416) 978-4970, Email: radford@stat.utoronto.ca
Office hours: Wednesdays 2:30 to 3:30.

Lectures:

Tuesdays, Thursdays, and Fridays from 1:10 to 2:00, in WI 524.
From January 7 to April 11, except for Reading Week (February 17-21).

Course Text:

R. A. Thisted, Elements of Statistical Computing.

Computing:

Assignments will be done in R. Graduate students will use the Statistics/Biostatistics computer system. Undergraduates will use CQUEST. You can also use R on your home computer by downloading it for free from http://lib.stat.cmu.edu/R/CRAN

Marking scheme:

Four assignments, worth 17%, 18%, 18%, and 17%.
Three one-hour in-class tests, each worth 10%.
The tests are tentatively scheduled for February 6, March 13, and April 11.

Please read the important course policies.

Running R

To run R on photon.utstat or fisher.utstat, just type "R" (upper case) in a Unix command window.

To run R on a CQUEST workstation, or the CQUEST server www.cquest.utoronto.ca, type /u/radford/R in a Unix command window. You can also start R from a CQUEST workstation using a link on the CQUEST STA 410 web page.

Here is some documentation on R:

What to read in the text:

Chapter 2 (Computer arithmetic)
Chapter 3, Sections 3.1, 3.2, 3.3 (Numerical linear algebra for LS regression)
Chapter 4, Sections 4.1, 4.2, 4.3, 4.4 (Finding maximum likelihood estimates)
Chapter 4, Section 4.7 (EM algorithm)
Chapter 5, Sections 5.1, 5.2, 5.6, 5.8 (Numerical integration, Bayesian inference)

Tests:

Here are some practice questions for test 1: Postscript, PDF. Ignore question 1(c) - I don't know what I was thinking!
Here are some practice questions for test 2: Postscript, PDF.
Here are some practice questions for test 3: Postscript, PDF.

Assignments:

Assignment 1: Postscript, PDF. Solution: program, tests, plot1 (Postscript, PDF), plot2 (Postscript, PDF), discussion.

Assignment 2: Postscript, PDF. Solution: program, tests, output, discussion.

Assignment 3: Postscript, PDF. Solution: program, tests, output, discussion.
Here is the data for problem 1 and the data for problem 2.

Assignment 4: Postscript, PDF. Solution: program, tests, output.

Example programs:

Creating a data frame with data for several groups: program.
Testing how well t tests work: program, plots: Postscript, PDF.
Computing a Cholesky decomposition: program.
Least-squares regression with Householder transformations: program, plus a test program and its output.
Maximum likelihood for no-zero Poisson data: program and test output.
Maximum likelihood for a simple model with two variances: program.
An old maximum likelihood assignment: Handout, Solution to part 1, Solution to part 2.
Symbolic computation and minimization in R: examples.
EM algorithm for censored Poisson data: program, output of two tests.
An old EM assignment: Handout, Solution.
Bayesian inference for a simple model using the midpoint rule: program, output.
Evaluating a double integral with the midpoint rule: program, output.
An old Bayesian inference assignment: handout, program, output.
Gibbs sampling example: program, plots of output for data of 3.9, 3.6, 3.7: Postscript, PDF.
Metropolis algorithm example: program. Run on data of 3.9, 3.6, 3.7.
Results with proposal standard deviations of 0.1: scatterplot (Postscript, PDF), trace (Postscript, PDF).
Results with proposal standard deviations of 1: scatterplot (Postscript, PDF), trace (Postscript, PDF).
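
For readers without access to the linked programs, the following is a minimal sketch of a random-walk Metropolis sampler in R, run on the same three data values with proposal standard deviations of 0.1 and 1. It is not the course's program. The model is an assumption made here for illustration: normal data with unknown mean and standard deviation, with a Normal(0, 10^2) prior on the mean and a Normal(0, 2^2) prior on the log of the standard deviation. The names `log_posterior` and `metropolis` are likewise hypothetical.

```r
# Random-walk Metropolis for an ASSUMED model (not the course's program):
#   y_i ~ Normal(mu, sigma^2), with independent priors
#   mu ~ Normal(0, 10^2) and log(sigma) ~ Normal(0, 2^2).
# The state is theta = c(mu, log_sigma).
y <- c(3.9, 3.6, 3.7)

log_posterior <- function(theta, y) {
  mu <- theta[1]
  log_sigma <- theta[2]
  sum(dnorm(y, mu, exp(log_sigma), log = TRUE)) +   # log likelihood
    dnorm(mu, 0, 10, log = TRUE) +                  # log prior for mu
    dnorm(log_sigma, 0, 2, log = TRUE)              # log prior for log(sigma)
}

metropolis <- function(y, n_iter, proposal_sd, start = c(0, 0)) {
  draws <- matrix(NA_real_, n_iter, 2,
                  dimnames = list(NULL, c("mu", "log_sigma")))
  theta <- start
  lp <- log_posterior(theta, y)
  accepted <- 0
  for (i in seq_len(n_iter)) {
    proposal <- theta + rnorm(2, 0, proposal_sd)    # symmetric proposal
    lp_prop <- log_posterior(proposal, y)
    # Accept with probability min(1, posterior ratio), computed on the log scale.
    if (log(runif(1)) < lp_prop - lp) {
      theta <- proposal
      lp <- lp_prop
      accepted <- accepted + 1
    }
    draws[i, ] <- theta                             # record the current state
  }
  list(draws = draws, accept_rate = accepted / n_iter)
}

set.seed(1)
run_small <- metropolis(y, 5000, proposal_sd = 0.1)
run_large <- metropolis(y, 5000, proposal_sd = 1)
print(c(sd_0.1 = run_small$accept_rate, sd_1 = run_large$accept_rate))
```

Calling `plot(run_small$draws)` gives a scatterplot of the sampled values, and `plot(run_small$draws[, "mu"], type = "l")` gives a trace plot. Under this assumed model, the smaller proposal standard deviation would typically be expected to give a higher acceptance rate but smaller moves per iteration.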