Emilee Rader

more on significance tests

I looked up a few of the references cited in “The Cult of Statistical Significance” over the past few days. As I said before, my major objection to the book is that the authors describe the problem in great detail, but provide no help or guidance for those who want to change their ways but aren’t sure where to start. Most of the references I’ve looked up weren’t all that helpful either, but there was one paper that I felt described the problem more clearly and concisely than the entire book:

Cohen, J. (1994). The Earth is Round (p < .05). American Psychologist, 49(12), 997-1003.

To describe the problem with the logic behind NHST, or Null Hypothesis Significance Testing, he wrote:

What we want to know is, “Given these data, what is the probability that H0 [the null hypothesis] is true?” But as most of us know, what it tells us is, “Given that H0 is true, what is the probability of these (or more extreme) data?” These are not the same… (p997)

So, what he’s saying is, when you do your statistical test and find out that your results are significant at the p < .05 level, you’ve learned that, assuming the null hypothesis is true, there is a less than 5% chance of observing data like yours (or data even more extreme) by chance alone. We take that to mean that, in a well-designed experiment where the only thing that varied between conditions was the experimental treatment, the difference must be due to the independent (treatment) variable. In a poorly designed experiment with confounds, etc., we cannot be so sure.
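To make that definition concrete for myself, here is a minimal sketch of what a p value actually computes. The coin-flip setup and the numbers (20 flips, 15 heads) are my own made-up example, not anything from Cohen's paper, and the function name is just something I picked for illustration.

```python
from math import comb

def one_sided_p_value(n_flips, n_heads, p_null=0.5):
    """Probability of seeing n_heads or more in n_flips,
    assuming the null hypothesis (a coin with P(heads) = p_null) is true."""
    return sum(
        comb(n_flips, k) * p_null**k * (1 - p_null) ** (n_flips - k)
        for k in range(n_heads, n_flips + 1)
    )

# Hypothetical data: 15 heads out of 20 flips.
p = one_sided_p_value(20, 15)
print(f"p = {p:.4f}")  # p = 0.0207
```

Notice that the null hypothesis goes in as an assumption (p_null) and the probability of the data comes out. Nothing in the calculation tells you how probable the null hypothesis itself is.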

But what Cohen tells us is that even in a well-designed experiment, finding out that your data are unlikely to have occurred if the null hypothesis is true doesn’t necessarily mean the null hypothesis is false. This is where things start to get over my head, because in my entire statistical education (in the social sciences) Bayesian statistics and prior probabilities have never come up in the context of research design and hypothesis testing.

Cohen wrote:

Now, what really is at issue, what is always the real issue, is the probability that H0 is true, given the data, P(H0|D), the inverse probability. When one rejects H0, one wants to conclude that H0 is unlikely, say, p < .01. The very reason the statistical test is done is to be able to reject H0 because of its unlikelihood! But that is the posterior probability, available only through Bayes’s theorem, for which one needs to know P(H0), the probability of the null hypothesis before the experiment, the “prior” probability. (p998)

Sigh. I get a blood clot and suddenly I’m questioning everything I ever learned about statistical inference. What’s up with that? What I think Cohen is saying is that we should combine what we know about the probability of each hypothesis being true *before* the experiment (the prior probability) with the new information in our experimental data, in order to estimate how probable each hypothesis is afterward (the posterior probability), and use that estimate when deciding which hypothesis to reject.
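Here is a rough sketch of how I understand the Bayes’s theorem calculation, under some simplifying assumptions of my own: there are only two hypotheses (H0 and H1), the "data" is just the event that the test came out significant, the chance of that under H0 is alpha, and the chance of it under H1 is the power of the test. The function name and all of the numbers (alpha = .05, power = .80, and the three priors) are hypothetical.

```python
def posterior_h0(prior_h0, alpha, power):
    """P(H0 | significant result) via Bayes's theorem.

    prior_h0: P(H0) before the experiment
    alpha:    P(significant result | H0 is true)
    power:    P(significant result | H1 is true)
    """
    p_sig_and_h0 = alpha * prior_h0
    p_sig_and_h1 = power * (1 - prior_h0)
    return p_sig_and_h0 / (p_sig_and_h0 + p_sig_and_h1)

# Same significant result, three different (made-up) prior beliefs about H0.
for prior in (0.1, 0.5, 0.9):
    post = posterior_h0(prior, alpha=0.05, power=0.80)
    print(f"P(H0) = {prior:.1f}  ->  P(H0 | significant) = {post:.3f}")
```

With these made-up numbers, a significant result leaves H0 with a posterior probability of about .06 when the prior was .5, but .36 when the prior was .9. So the same p < .05 can mean quite different things depending on how plausible the null was to begin with, which is how I read Cohen's point.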

How does one come up with the prior probability of the null and alternate hypotheses? How does one frame hypotheses so this kind of question even makes sense? A series of posts by Kimball Atwood (one, two, three) on the Science-Based Medicine blog explains Bayes’s theorem, and discusses its application in medical research. I’m also reading a few more papers and checking out a book from the library, hoping to find some insight. I’ll keep you posted.

Nickerson, R. (2000). Null Hypothesis Significance Testing: A Review of an Old and Continuing Controversy. Psychological Methods, 5(2), 241-301.

Goodman, S. (1999). Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy. Annals of Internal Medicine, 130, 995-1004.

Goodman, S. (1999). Toward Evidence-Based Medical Statistics. 2: The Bayes Factor. Annals of Internal Medicine, 130, 1005-1013.

Press, James S. (2001). The subjectivity of scientists and the Bayesian approach. New York: Wiley.
