STA 250 - Statistical Concepts (Fall 2000)

Final marks: The final marks have now been computed, and should shortly be available from ROSI. Marks for the mid-term, the assignments, and the final exam are available here. Please let me know of any errors.

You can come by my office to pick up your second assignment. I'm around most afternoons; you can phone at 416-978-4970 if you want to check if I'm in.

You can now look at an example solution to assignment two.

The main topics of this course are: (1) How meaningful data can be gathered, and presented in a way that allows one to make intuitive sense of it. (2) The basics of probability, and how it is used in more formal procedures that help distinguish real patterns in data from coincidences. (3) Some fundamental statistical techniques, including t tests, linear regression models, one-way ANOVA, and analysis of contingency tables. Students will learn the theory behind these concepts (at a moderate mathematical level), and will obtain experience in applying them using the Minitab computer package.

Office Hours: Wednesdays 2:10-3:00 and Fridays 3:30-4:00, in SS 6016A

Textbook: Exploring Statistics, by Larry J. Kitchens, Second Edition. You will also need the accompanying Minitab Lab Manual (which is packaged with it in the bookstore).

Lectures: Tuesday, Thursday, Friday from 11:10am to 12:00pm, in MP 203, from September 12 to December 8.

Tutorials: Thursdays, from September 21 to December 7, as follows:

```
Section   Time         Room     Last name

T0101     12:10-1:00   SS1084   A-Leung
  "       12:10-1:00   SS1088   Liu-Z
T0201      1:10-2:00   SS1084   A-Z
T0301      2:10-3:00   SS1073   A-Lim
  "        2:10-3:00   SS1087   Lin-Z
```

Statistics help:

The stats aid centre in New College 55B has the following hours:

```
             John Sheriff    Neil Montgomery

Monday       2-5             12:30-2
Tuesday      9-12            -
Wednesday    1-3             -
Thursday     9-11            11-3
```
John Sheriff will be more familiar with STA 250 material.

Computing:

Assignments will be done using the Minitab statistical package on the CQUEST computer system. Your CQUEST account should be posted across from SS 2133, near the terminal room in SS2105.

Here is the CQUEST page for STA 250.

It is also possible to buy Minitab for use on a home computer, but you should not expect any help in installing it.

Grading Scheme: Assignment 1: 17%, Mid-term test: 20%, Assignment 2: 17%, Final exam: 46%
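To see how the weights combine, here is a small sketch in Python (not part of the course software, which is Minitab). The function name and the example marks are made up for illustration, and it assumes each component is expressed as a percentage; any rounding rule applied to the official mark is not specified here.

```python
# Weights taken from the grading scheme above (they sum to 100%).
WEIGHTS = {
    "assignment1": 0.17,
    "midterm": 0.20,
    "assignment2": 0.17,
    "final_exam": 0.46,
}


def course_mark(marks):
    """Return the weighted course mark, given each component as a percentage (0-100)."""
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)


# Hypothetical marks, for illustration only.
example = {"assignment1": 80, "midterm": 70, "assignment2": 90, "final_exam": 60}
print(round(course_mark(example), 1))
```

With these made-up marks the components contribute 13.6, 14.0, 15.3, and 27.6 points, so the final exam counts for more than any other single item.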

Final exam:

Scheduled by the Faculty of Arts and Science

The final exam will cover the whole course, though there will be more questions on material after the midterm test than before.

No books or notes will be allowed. You may use a calculator. You can write in pencil.

You won't be expected to remember any really complicated formulas. The most complicated one that might be needed is the formula for the t statistic for a two-sample t test. All the formulas you will need should be easy to remember if you understand the concepts involved.
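For reference, one common form of the two-sample t statistic is sketched below. This is the version that does not pool the variances; the notation (sample means, sample standard deviations, and sample sizes for groups 1 and 2) is an assumption here, and the textbook may present the pooled version instead, so check it against the book's form.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

The numerator is the observed difference in sample means, and the denominator is the estimated standard error of that difference.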

Mid-term test:

Here are the questions: postscript, PDF.

Here are the answers for Questions 1-41 (along with a few comments) and the answer to Question 42 (with marking scheme).

The mark is out of 100 (87 for Questions 1-41, 13 for Question 42).
The mark written on the answer sheet will be adjusted upward by 4.
The median of the adjusted marks was 67.

Assignments: To give you an idea of what is expected for the assignments, here are the assignments and solutions from last year.

Assignment 1: Handed out Oct. 5, due Oct. 31.

Assignment handout: Postscript, PDF

Here is the data for the assignment, for people who want to download it to their home computers. Click on the file corresponding to your CQUEST account.

Here are three example solutions.

Assignment 2: Handed out Nov. 23, due Dec. 8.

Assignment handout: Postscript, PDF

Here is the corrected list of actual errors for each data set. In the original list, these people were said to have a weight200 error, when really they didn't.

Here is an example solution.

Chapter 1, Chapter 2, Chapter 3 except 3.5, Unit 1 review.
Chapter 4, Chapter 5, Chapter 6, Unit 2 review.
Sections 7.1, 7.2, and 7.3 (remainder of Chapter 7 is optional).
Sections 8.1, 8.2, 8.3, and 8.4 (remainder of Chapter 8 is optional).
Chapter 9 (except 9.6 and the parts of 9.4 and 9.5 on the Wilcoxon test)
Unit 3 review.
Sections 12.1 and 12.2
Chapter 11.
Unit 4 review.

Here are suggestions for (non-credit) exercises to do, some of which may be discussed in lecture or tutorial. You can do others too, of course!

• Exercises 2.56 and 2.57.
• Exercise 2.67, plus think about whether or not computing the mean and/or the median is a useful thing to do.
• Read the section on "Time series data" at the bottom of page 58 and top of page 59. What do you think of this discussion?
• Exercises 3.14, 3.32, 3.33, and 3.49. For 3.49, do the regression yourself in MINITAB, look at a scatterplot of the data with the regression line superimposed, and produce the residual plot in MINITAB yourself.
• Exercise 3.88, plus ask yourself what lurking variable or variables might complicate the interpretation of this data.
• Exercise 4.152.
• Exercise 4.92, plus ask yourself what assumption needs to be made for this problem to be solvable. Is making this assumption reasonable?
• Exercises 4.115 and 4.119.
• Exercises 5.23, 5.52, 5.89.
• Exercise 7.24.
• Exercise 8.28, plus ask yourself whether or not a one-sided test is appropriate in this context.
• Exercises 8.42, 8.34, and 8.71.
• Exercises 9.15 and 9.18.
• Exercises 9.35 and 9.36.
• Exercise 9.49. The book asks, "What does this mean about the average grades for the fall and spring classes?" This question is poorly phrased. The average grade was 82.4 for the fall class and 84.2 for the spring class. There's no doubt the actual average was lower for the fall class. What question does this test really answer?
• Exercises 12.1, 12.2, and 12.3.
• Exercise 11.18.
• Exercise 11.15, plus consider the impact of the retailer choosing to mostly increase the number of ads over time. Create a new "time" variable that has values of 0, 1, 2, etc. for increasing months. Do a multiple regression of sales on ads and time, and interpret the results.
• Exercises 11.21 and 11.47.

Slides from lectures

Here are the overhead slides from lectures, four per page, in PDF and Postscript formats, organized by week (1, 2, 3, ...) and by lecture within week (A, B, C). To read these, you'll need either the free acroread program (for PDF) or the free ghostview program (for both PDF and Postscript). Note that there are links for all lectures, but not all of the slides are actually there. Also, the correspondence with actual lectures may be only approximate.
```
          PDF FORMAT    POSTSCRIPT

week1:    A B C         A B C
week2:    A B C         A B C
week3:    A B C         A B C
week4:    - B C         - B C
week5:    A - C         A - C
week6:    A B C         A B C
week7:    A B C         A B C
week8:    A - C         A - C
week9:    A B C         A B C
week10:   A B C         A B C
week11:   A B C         A B C
week12:   - B C         - B C
week13:   A - C         A - C
```