Illustrations of independence in Moore & McCabe

Regarding Example 4.11 in Introduction to the Practice of Statistics, 3rd edition, by D. S. Moore and G. P. McCabe:

The last paragraph attempts to illustrate independence and dependence with real examples. Both examples are actually concerned with random variables, not events. Independence of random variables has not been discussed yet (it is on page 337). Worse, however, is that the examples are wrong, or at least require quite strained interpretations if they are to be regarded as right.

First it is said that, "If a doctor measures your blood pressure twice, it is reasonable to assume that the two results are independent because the first result does not influence the instrument that makes the second reading."

Before we can discuss independence, we must be clear on the sample space, that is, on what we are regarding as randomly varying. From the doctor's point of view, two blood pressure measurements on the same patient are not independent if the patient is regarded as being randomly selected from some population. Even if we consider only "your" blood pressure, two measurements made on the same visit to the doctor are not independent if we regard the time of the visit as random, since variation in blood pressure has dependencies over time scales of hours. The example seems to regard the blood pressure as a fixed quantity, with the measurement errors being the only thing that is random. Even then, errors in measurements taken close together in time are often not independent, since the error can depend on things such as the temperature of the measuring device. And if this is the intended interpretation, blood pressure is about the worst possible example to use, since it also varies at quite short time scales (minutes).
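The role of the sample space can be seen in a small simulation. The sketch below is only an illustration under assumed numbers: the population mean of 120, the between-patient standard deviation of 15, and the measurement error standard deviation of 5 are all hypothetical, and the measurement errors are taken to be independent by construction. It compares the correlation between two readings when the patient is drawn at random with the correlation when the patient's true blood pressure is held fixed.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)
N = 20000

# Hypothetical parameters, chosen only for illustration.
POP_MEAN, POP_SD, ERR_SD = 120.0, 15.0, 5.0

# Sample space A: the patient is random, so the true pressure varies
# from one pair of readings to the next.
first_a, second_a = [], []
for _ in range(N):
    true_bp = rng.gauss(POP_MEAN, POP_SD)
    first_a.append(true_bp + rng.gauss(0, ERR_SD))
    second_a.append(true_bp + rng.gauss(0, ERR_SD))

# Sample space B: one patient with a fixed true pressure, so only the
# (independent) measurement errors vary.
first_b = [POP_MEAN + rng.gauss(0, ERR_SD) for _ in range(N)]
second_b = [POP_MEAN + rng.gauss(0, ERR_SD) for _ in range(N)]

corr_random_patient = pearson(first_a, second_a)
corr_fixed_patient = pearson(first_b, second_b)

print("random patient:", round(corr_random_patient, 3))
print("fixed patient: ", round(corr_fixed_patient, 3))
```

Under these assumptions the theoretical correlation for a random patient is 15^2 / (15^2 + 5^2) = 0.9, while for a fixed patient it is 0. The same two readings are strongly dependent or independent depending only on what is regarded as random.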

It is then said that, "But if you take an IQ test or other mental test twice in succession, the two test scores are not independent. The learning that occurs on the first attempt influences your second attempt."

This statement seems to confuse dependence between two random variables with their having different distributions. If the learning occurs as a result of taking the first test, regardless of the score obtained, the two variables will have different distributions (the mean of the second being higher), but they need not be dependent. Only if the random variation in the score on the first test influences the random variation in the score on the second test will the two variables be dependent.
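The distinction between a shifted distribution and dependence can also be sketched by simulation. The numbers here are hypothetical: a baseline score of 100, a learning gain of 5 points, random variation with standard deviation 3, and, in the second case, an assumed carry-over of half of the first test's random deviation. In the first case learning raises the mean of the second score regardless of the first score obtained. In the second case the random variation in the first score feeds into the second.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def mean(xs):
    return sum(xs) / len(xs)

rng = random.Random(42)
N = 20000

# Hypothetical parameters, chosen only for illustration.
BASE, GAIN, SD, CARRY = 100.0, 5.0, 3.0, 0.5

# Case 1: learning shifts the mean of the second score by GAIN, but the
# random variation in the two scores is generated independently.
first_1 = [BASE + rng.gauss(0, SD) for _ in range(N)]
second_1 = [BASE + GAIN + rng.gauss(0, SD) for _ in range(N)]

# Case 2: the second score also carries over part of the first score's
# random deviation, so the two scores are dependent.
first_2 = [BASE + rng.gauss(0, SD) for _ in range(N)]
second_2 = [BASE + GAIN + CARRY * (f - BASE) + rng.gauss(0, SD)
            for f in first_2]

shift_1 = mean(second_1) - mean(first_1)
corr_1 = pearson(first_1, second_1)
corr_2 = pearson(first_2, second_2)

print("case 1: mean shift", round(shift_1, 2), "correlation", round(corr_1, 3))
print("case 2: correlation", round(corr_2, 3))
```

In case 1 the two scores have different distributions but are independent by construction, so the sample correlation should be near zero. (A correlation near zero does not by itself establish independence; here independence is built into the simulation.) In case 2 the theoretical correlation under these assumptions is 0.5 / sqrt(1.25), about 0.45.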

Overall, these examples seem to be confusing independence with lack of causal influence. These are not the same thing. If one variable causally influences another, then they will generally be dependent, but it is quite possible for two variables to be dependent without one having a causal influence on the other, for instance when both are influenced by some third variable. That's why establishing causality is so hard.
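Dependence without causal influence is easy to produce with a common cause. In the sketch below, which uses an invented scenario with hypothetical probabilities, a hidden event (a hot day) raises the chance of two other events, A and B, neither of which influences the other. Independence of the events A and B would require P(A and B) = P(A) P(B), and the simulation estimates both sides.

```python
import random

rng = random.Random(1)
N = 100000

count_a = count_b = count_ab = 0
for _ in range(N):
    hot = rng.random() < 0.5                    # hidden common cause
    a = rng.random() < (0.8 if hot else 0.2)    # A depends only on `hot`
    b = rng.random() < (0.7 if hot else 0.1)    # B depends only on `hot`
    count_a += a
    count_b += b
    count_ab += (a and b)

p_a = count_a / N
p_b = count_b / N
p_ab = count_ab / N

print("P(A and B) estimate:", round(p_ab, 3))
print("P(A) * P(B) estimate:", round(p_a * p_b, 3))
```

With these assumed probabilities the exact values are P(A) = 0.5, P(B) = 0.4, and P(A and B) = 0.5 * 0.56 + 0.5 * 0.02 = 0.29, which differs from P(A) P(B) = 0.20. So A and B are dependent even though neither has any causal influence on the other.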
