Errors in Moore & McCabe

Radford Neal, November 1999

The following errors and omissions occur in Introduction to the Practice of Statistics, 3rd edition, by D. S. Moore and G. P. McCabe.

Some of these have been corrected in the Second Printing of this edition, as noted.

Note: A 4th edition has now been published. Surprisingly, many of the errors reported here are still present in this new edition.

Page 18, Example 1.10: The ozone hole.

The hole in the ozone layer is not caused by burning fossil fuels. It is caused by CFCs, used in refrigerators and in other applications. Burning fossil fuels is a cause of global warming. The only connection between these two issues is that CFCs also cause global warming, though they aren't the main cause.

Pages 54-55, Example 1.19: Returns on stocks and treasury bills.

This example is seriously flawed. See my explanation of what's really going on.

Page 73: Notation N(a,b) for the normal distribution with mean a and standard deviation b.

This abbreviation is not standard. It is far more common for N(a,b) to mean the normal distribution with mean a and variance b.

Page 80: Procedure for producing a normal quantile plot.

This doesn't work. If you try this, you will decide in step 1 that the last observation is at the 100% point. In step 2, you will then need to find the corresponding z-score, which is infinite.
Two sensible procedures are to say that the percentile points for 20 observations are 2.5%, 7.5%, ..., 97.5%, given by (i-1/2)/20 for i from 1 to 20, or alternatively to say that observation i out of n is at percentile i/(n+1).
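As a rough sketch of the two procedures just described (the function names and the `rule` argument are my own, not the book's), the percentile points and their z-scores can be computed with Python's standard library. Both rules keep every point strictly between 0% and 100%, so every z-score is finite.

```python
from statistics import NormalDist


def quantile_points(n, rule="midpoint"):
    """Percentile points (as fractions) for n sorted observations.

    rule="midpoint"  uses (i - 1/2) / n
    rule="n_plus_1"  uses i / (n + 1)
    Both names are illustrative labels, not standard terminology.
    """
    if rule == "midpoint":
        return [(i - 0.5) / n for i in range(1, n + 1)]
    if rule == "n_plus_1":
        return [i / (n + 1) for i in range(1, n + 1)]
    raise ValueError("rule must be 'midpoint' or 'n_plus_1'")


def normal_scores(n, rule="midpoint"):
    """z-scores to plot against the sorted observations."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(p) for p in quantile_points(n, rule)]


points = quantile_points(20)           # 0.025, 0.075, ..., 0.975
scores = normal_scores(20)             # finite z-scores for all 20 points
alt_points = quantile_points(20, "n_plus_1")  # 1/21, 2/21, ..., 20/21
```

With the book's procedure the last point would be 1.0, for which no finite z-score exists.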

Page 127, Figure 2.10.

The point of these two scatterplots is to demonstrate that adding whitespace around the data makes the correlation seem stronger. However, the relative scales of the x and y axes are also different: the horizontal scale changes from being 80 wide to being 250 wide, a factor of 3.125, while the vertical scale changes from being 120 high to being 250 high, a factor of 2.083. One therefore cannot draw any conclusion from this figure about the effect of whitespace.

Page 153, Exercise 2.47: "How would the slope and intercept of this line change if we measured all heights in centimetres rather than inches?" The answer at back says "The conversion would have no effect".

Actually, the conversion would have no effect on the slope, but it would change the intercept.
Fixed in Second Printing
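A small numerical check of this, assuming (as the wording "all heights" suggests) that both the explanatory and the response variable are heights. The data below are made up for illustration and are not the data from the exercise.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


CM_PER_INCH = 2.54

# Hypothetical heights in inches (illustration only).
x_in = [60.0, 62.0, 64.0, 66.0, 68.0]
y_in = [61.0, 63.5, 64.0, 66.5, 69.0]
slope_in, intercept_in = least_squares(x_in, y_in)

# The same heights converted to centimetres.
x_cm = [x * CM_PER_INCH for x in x_in]
y_cm = [y * CM_PER_INCH for y in y_in]
slope_cm, intercept_cm = least_squares(x_cm, y_cm)

# The slope is a ratio of heights, so the units cancel.
# The intercept is itself a height, so it is multiplied by 2.54.
```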

Page 173, Exercise 2.58: Data on the amount of beef consumed per person and the price of beef for the years 1970 to 1993.

This exercise (and answer in the Instructor's Guide) completely misses what appears to be really going on.
First, contrary to what is said in part (a), economic theory would not lead one to expect consumption to necessarily fall when the price rises. In a competitive market, the price and amount consumed are set by the intersection of the supply curve and the demand curve. If the price changes, one or both of these curves must have shifted, which will usually also cause the amount consumed to change. Depending on which curve(s) shifted, and which way, any combination of an increase or decrease in price with an increase or decrease in consumption is possible.
Looking at the relationship of price and consumption in this data is pretty meaningless in any case, because of the important lurking variable of time. In this data, both price and consumption are more strongly related to time than to each other, with both decreasing over time. A plausible explanation is that price decreased because of technological change, and at the same time consumption declined because of changing consumer tastes. It may also be relevant that beef is highly substitutable with other foods at both the production and consumption end. The question may therefore be rather meaningless, being analogous to asking about the relationship between the price of red cars and the number of red cars bought, ignoring production and sales of cars painted colours other than red.

Page 174, Exercise 2.59: Data on price of various kinds of seafood in 1970 and 1980.

This is treated as a problem regarding the relationship between the 1970 price and the 1980 price, using scatterplots, correlation, and regression. Actually, this is a good example of when one should not use these tools. A pound of scallops is a very different thing than a pound of cod. Comparing their price per pound is not meaningful for most purposes.
Instead, for each kind of seafood, it would make sense to find the ratio of the price in 1980 to the price in 1970. The following is a stem plot of these ratios:
   1 : 5
   2 : 011
   2 : 588
   3 : 000112
   3 :
   4 :
   4 : 7
The extreme points are 1.5, for haddock, and 4.7, for ocean perch. These are the seafoods that might merit a special look. The answer in the back of the book singles out scallops and lobsters as being outliers, but their ratios of 3.0 and 2.0 are not exceptional. They appear as outliers when one inappropriately does a regression only because they are the two most expensive seafoods on a per pound basis, making any variation in their price appear large in absolute terms, even when it is not large in percentage terms.

Pages 208-209, Examples 2.35 and 2.36, concerning causation.

The claim that there is a direct causal link between the height of mothers and the height of daughters is wrong. This is a case of both being influenced by a common cause, namely the genes of the mother. If you starve a teenage mother (after the birth of her child) in order to reduce her adult height, this will not change the adult height of her child (assuming you feed the child a normal diet).
The example regarding the "strong positive correlation between how much money individuals add to mutual funds each month and how well the stock market does the same month" is incomprehensible, because the exact timing of the events is unclear. If the data is aggregated over the month, then some of the money went into mutual funds before part of that month's stock market change, and some of it followed part of that change. The aggregation makes this a confusing and complex example.

Pages 258-259: Explanation of stratified sampling.

A crucial part of the explanation of stratified sampling is omitted here. The whole procedure makes sense only when you have census data that allows you to combine the results from sampling the various strata into a final estimate that takes account of how many people are in each stratum.
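A minimal sketch of the missing combining step, with made-up strata, census counts, and sample means (none of these numbers come from the book). Each stratum's sample mean is weighted by that stratum's share of the population.

```python
def stratified_mean(stratum_sizes, sample_means):
    """Combine per-stratum sample means into one population estimate.

    stratum_sizes: population count in each stratum (e.g. from a census)
    sample_means:  sample mean observed within each stratum
    """
    total = sum(stratum_sizes.values())
    return sum(
        stratum_sizes[name] / total * sample_means[name]
        for name in stratum_sizes
    )


# Hypothetical census counts and sample results (illustration only).
census_counts = {"urban": 8000, "rural": 2000}
sample_means = {"urban": 10.0, "rural": 20.0}

weighted = stratified_mean(census_counts, sample_means)       # 0.8*10 + 0.2*20
unweighted = sum(sample_means.values()) / len(sample_means)   # ignores stratum sizes
```

Without the census counts, one has only the unweighted average of the stratum means, which here differs noticeably from the weighted estimate.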

Page 268, Example 3.20: "Are attitudes towards shopping changing?" plus Example 5.7 on page 381 (followed by examples 5.8 and 5.9).

The technical content of this example is correct, but the "real world" context is seriously garbled. See my explanation of what's really going on.

Page 303, Example 4.11: examples of independence and dependence.

The last paragraph attempts to illustrate independence and dependence with real examples. Both examples are actually concerned with random variables, not events. Independence of random variables is discussed later, on page 337. These examples are wrong, or at least require quite strained interpretations if they are to be regarded as right. See my detailed explanation.

Page 312 and following material in Section 4.3, and Exercise 4.103:

In Section 4.3, the book gives the impression that when one talks about a random variable, the sample space consists of the possible numerical values of this random variable. This is generally impossible when one wants to talk about more than one random variable at the same time. The correct view is that a random variable is a rule for assigning a number to each outcome. It's possible that the outcomes are themselves numbers, and that the rule for random variable X is just that X is equal to the outcome itself, but this will not usually be the situation.
This inadequate view of random variables is reinforced by exercise 4.103, which asks "Give a sample space for the sum X of the spots", implying that the concept of a sample space is tied to a particular random variable. The answer given in the back of the book is "S = { 3, 4, 5, ..., 18}". This is not a sample space that is sensible to use, since it is particular to X (even though the question itself introduces other random variables, which can't be made sense of with this sample space), and since we have no direct way of assigning probabilities to the outcomes in this sample space. A sensible answer is S = { (1,1,1), ..., (6,6,6) }, i.e., all possible combinations of rolls for the three dice (with the dice being distinguished as first, second, and third).

Page 376: Notation B(n,p).

This notation is also used for the "beta" distribution. To avoid confusion, it is better to use the notation binomial(n,p).

Page 404, Figure 5.9, illustrating the central limit theorem, relating to Example 5.19 on the previous page.

These plots are grossly incorrect, and seriously misleading. Plot (a) seems to be about right, but (b) shows the density being positive at zero, which it isn't. It's (c) that's really bad, though, since the mean of the distribution shown is clearly greater than 1, when it should be equal to 1.
Fixed in Second Printing

Page 405, Figure 5.10, again illustrating the central limit theorem.

The two density curves in this plot are also wrong. The match with reality is improved if one exchanges the two curves (i.e., labels the solid as the exact distribution and the dashed as the normal approximation), but they are still wrong in detail.

Page 409, Exercise 5.29(a): The answer given in the back is P(X<295)=0.8413.

The answer is actually 0.1587.
Fixed in Second Printing
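The two numbers are the standard normal areas below z = +1 and z = -1, which is consistent with the printed answer having taken the wrong tail. The exercise's mean and standard deviation are not reproduced here, so this sketch works on the z scale only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

lower_tail = std_normal.cdf(-1.0)  # area below z = -1
upper_area = std_normal.cdf(1.0)   # area below z = +1

# The two areas are complementary by symmetry of the normal curve.
total = lower_tail + upper_area
```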

Page 412, Exercise 5.39: "An experiment... divides 40 chicks at random into two groups of 20... inference... is based on the mean weight gain ... of the 20 chicks in the high-lysine group and the mean weight gain ... of the 20 in the normal-corn group. Because of the randomization, the two sample means are independent."

The last sentence above is not correct. See my detailed explanation.

Pages 509-510, Figures 7.2 and 7.3:

The tails shown here are too light. The horizontal axis should be labelled "t", not "Z".

Page 511, Example 7.5: "Here is one way to report the conclusion: the mean of Newcomb's passage times, x-bar = 27.75, is lower than the value calculated by modern methods, mu=33.02 (t=-8.29, df=63, P<0.0001)."

This summary misses the whole point. Of course the mean of Newcomb's times, 27.75, is less than 33.02, the accepted modern value. You don't need statistics to tell you that. What the t-test tells you is that we have good reason to think that Newcomb's experiment suffered from systematic error, since the difference is too large to be plausibly explained by random errors.

Page 541: "This approximation is appealing because it is conservative... P-values are a bit smaller, so we are a little less likely to reject H_0 when it is true."

"P-values are a bit smaller" should be "P-values are a bit bigger".
Fixed in Second Printing

Page 545, discussion of the study in Example 7.14.

Two big potential problems that aren't mentioned here are conscious or unconscious bias on the part of the researcher/teacher and the possibility that the students' improvement was due to a placebo effect.

Page 546, Example 7.16: "Software uses the t(289) distribution and gives P=0.051".

This should be "gives P=0.51".
Fixed in Second Printing

Page 553, discussion after Example 7.20: "Sample size strongly influences the P-value of a test. An effect that fails to be significant at a specified level alpha in a small sample can be significant in a larger sample. In the light of the rather small samples in Example 7.20, the evidence for some effect of calcium on blood pressure is rather good."

This reasoning is circular. Increasing the sample size will tend to result in a smaller P-value only if the null hypothesis is false, which is the point at issue. Here is my more detailed explanation.
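A small simulation illustrates the point. It is only a sketch under simplifying assumptions of my own: normal data with known standard deviation 1, a two-sided z test rather than the book's t test, and a made-up effect size of 0.3. When the null hypothesis is true, the chance of a small P-value is about the same at every sample size; it grows with sample size only when the null hypothesis is false.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided P-value for H0: mean = mu0, with known sigma."""
    z = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def rejection_rate(true_mean, n, trials=2000, alpha=0.05, seed=0):
    """Fraction of simulated samples of size n with P-value below alpha."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            rejections += 1
    return rejections / trials


# Null hypothesis true: rate stays near alpha whatever the sample size.
null_small = rejection_rate(0.0, n=10)
null_large = rejection_rate(0.0, n=200)

# Null hypothesis false (hypothetical true mean 0.3): rate rises with n.
alt_small = rejection_rate(0.3, n=10)
alt_large = rejection_rate(0.3, n=200)
```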

Page 673, "...when the logarithm of the sum of the four skinfold thicknesses... is 0, a value that cannot occur."

There's no fundamental reason that the logarithm of the sum of the thicknesses can't be zero (i.e., that the sum can't be one), though it is a bit far from the observed values.

Page 675, Figure 10.10.

This figure is wrong in highly misleading ways. It is supposed to show the same data as Figure 10.3, with the least-squares regression line and 95% confidence limits for the mean response. These confidence limits wrongly appear to be given by two straight lines that intersect on the least-squares regression line, implying that the confidence interval has zero width at that point, which is not true. The limits at a point are also not symmetrical about the least-squares regression line at that point, which they should be. Finally, the data points shown are neither the same as the correct points from the CD, nor the same as the points in Figure 10.3 (which are also wrong).
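For reference, the usual limits for the mean response at x are the fitted value plus or minus t* s sqrt(1/n + (x - xbar)^2 / Sxx). The sketch below computes them for made-up data (not the data of Figure 10.3), with the critical value t* passed in from a t table. The half-width is smallest at xbar but never zero, and the limits are symmetric about the fitted line.

```python
from math import sqrt


def mean_response_limits(xs, ys, x_new, t_star):
    """Return (lower, fit, upper) confidence limits for the mean response at x_new.

    t_star is the t critical value with n - 2 degrees of freedom,
    supplied by the caller (e.g. looked up in a table).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard deviation s, with n - 2 degrees of freedom.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s = sqrt(sum(r * r for r in residuals) / (n - 2))

    fit = intercept + slope * x_new
    half_width = t_star * s * sqrt(1 / n + (x_new - x_bar) ** 2 / sxx)
    return fit - half_width, fit, fit + half_width


# Hypothetical data (illustration only); 3.182 is t* for 95% with 3 df.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
at_mean = mean_response_limits(xs, ys, 3.0, 3.182)  # x_new equal to xbar
at_edge = mean_response_limits(xs, ys, 5.0, 3.182)  # x_new at the largest x
```

The limits therefore trace out two curves that bow away from the line on either side of xbar, not two straight lines that cross.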

Page 696, Exercise 10.3: The solution in the back gives t=4.61.

Minitab gives t=6.94.

Page 754, Example 12.6: Pretest scores for three groups of children.

The context for this example is not described sufficiently for one to understand why an ANOVA is being done. Were the three groups assigned randomly? If so, the only reason to do such a test would be if you suspected that the randomization wasn't really random. If one is worried that a true random assignment might have by chance produced unbalanced groups, it is not at all clear that doing a test of a null hypothesis that you know for certain is true is a sensible approach to assessing whether this is a problem. On the other hand, if the groups were not assigned at random, the ANOVA test might be sensible, but again one might wonder about the relevance of the P-value found to the question of whether or not the differences are large enough for this to be a problem for some (unspecified) later analysis.