Sample Size for Pearson's Correlation
Menu location: Analysis_Sample Size_Correlation.
This function gives you the minimum number of pairs of subjects needed to detect a true difference in Pearson's correlation coefficient between the null (usually 0) and alternative hypothesis levels with power POWER and two sided type I error probability ALPHA (Stuart and Ord, 1994; Draper and Smith, 1998).
Information required
- POWER: probability of detecting a true effect.
- ALPHA: probability of detecting a false effect (two sided: double this if you need one sided).
- R0: correlation coefficient under the null hypothesis (often 0).
- R1: correlation coefficient under the alternative hypothesis.
Practical issues
- Usual values for POWER are 80%, 85% and 90%; try several to see how the required sample size changes.
- 5% is the usual choice for ALPHA.
- Two sided in this context means that R1 is not equal to R0; one sided would mean that R1 is specifically greater than R0, or specifically less than R0. A one sided test is rarely appropriate because you can seldom say that a difference in the unexpected direction would be of no interest at all.
- Statistical correlation can be misleading; remember to think beyond the numerical association between two variables, and do not infer causality too easily.
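The scoping advice above can be tried out with a short Python sketch. It assumes the standard closed-form Fisher z approximation rather than StatsDirect's own routine, so the figures are indicative only; the function name approx_pairs and the example values (R0 = 0, R1 = 0.3) are hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist


def approx_pairs(power, alpha, r0, r1):
    """Closed-form Fisher z approximation to the number of pairs needed.

    alpha is two sided, as in the ALPHA input described above.
    This is an illustrative approximation, not StatsDirect's own code.
    """
    if r0 == r1:
        raise ValueError("R0 and R1 must differ")
    std_normal = NormalDist()
    q_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two sided critical value
    q_power = std_normal.inv_cdf(power)          # quantile matching POWER
    effect = abs(math.atanh(r1) - math.atanh(r0))  # difference on Fisher's z scale
    return math.ceil(((q_alpha + q_power) / effect) ** 2 + 3)


# Explore the usual POWER values at ALPHA = 5% (two sided).
for power in (0.80, 0.85, 0.90):
    print(f"POWER={power:.2f}: about {approx_pairs(power, 0.05, 0.0, 0.3)} pairs")

# One sided at 5%: double ALPHA, following the rule given for ALPHA above.
print("One sided 5%:", approx_pairs(0.80, 0.10, 0.0, 0.3), "pairs")
```

Higher POWER or a smaller gap between R0 and R1 increases the number of pairs required, which is why it is worth scoping several settings before fixing a design.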
Technical validation
The sample size estimation uses Fisher's classic z-transformation to normalize the distribution of Pearson's correlation coefficient:
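In its standard textbook form, writing r for the correlation coefficient and z(r) for its transformed value (notation chosen here for illustration), the transformation and its approximate variance for n pairs are:

```latex
z(r) = \frac{1}{2}\ln\!\left(\frac{1+r}{1-r}\right) = \operatorname{artanh}(r),
\qquad
\operatorname{Var}\bigl[z(r)\bigr] \approx \frac{1}{n-3}
```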
This gives rise to the usual test for an observed correlation coefficient (r1) to be tested for its difference from a pre-defined reference value (r0, often 0), and from this the power and sample size (n) can be determined:
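A standard form of this test, assumed here rather than confirmed as StatsDirect's exact expression, uses the Fisher-transformed values z(r) = ½ ln((1+r)/(1−r)). Under the null hypothesis the statistic Z below is approximately standard normal:

```latex
Z = \bigl(z(r_1) - z(r_0)\bigr)\sqrt{n-3}
```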
StatsDirect makes an initial estimate of n as:
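The usual closed-form approximation, given here as an assumption about the likely form rather than as StatsDirect's confirmed expression, is shown below. Here z(r) = ½ ln((1+r)/(1−r)) is the Fisher-transformed correlation, q_p is the p-th quantile of the standard normal distribution, α is ALPHA and 1−β is POWER; the symbols n_0 and q are introduced only for illustration.

```latex
n_0 = \left(\frac{q_{1-\alpha/2} + q_{1-\beta}}{z(r_1) - z(r_0)}\right)^{2} + 3
```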
StatsDirect then finds the value of n that satisfies the following power (1-β) equation:
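A common two sided form under the normal approximation is sketched below. It is an assumption, and StatsDirect's exact expression may differ, for example by including a small-sample bias correction. Here z(r) = ½ ln((1+r)/(1−r)) is the Fisher-transformed correlation, q_{1−α/2} is the upper α/2 critical value of the standard normal distribution, and norm is taken to be the area to the left of its argument.

```latex
1-\beta = \operatorname{norm}\!\left(\lvert z(r_1)-z(r_0)\rvert\sqrt{n-3} - q_{1-\alpha/2}\right)
        + \operatorname{norm}\!\left(-\lvert z(r_1)-z(r_0)\rvert\sqrt{n-3} - q_{1-\alpha/2}\right)
```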
where norm is the area under the standard normal distribution curve.
The precise value of n is rounded up to the next whole number in the results given by StatsDirect.
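The two-step procedure described above (an initial estimate, then a search for the n that satisfies the power equation, rounded up) might be sketched in Python as follows. This is a minimal illustration under the normal approximation to Fisher's z, not StatsDirect's implementation; the function names and the integer search strategy are assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_sided(n, r0, r1, alpha):
    """Approximate two sided power for n pairs (assumed normal approximation)."""
    shift = abs(math.atanh(r1) - math.atanh(r0)) * math.sqrt(n - 3)
    q_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Probability of rejecting in either tail when the true correlation is r1.
    return STD_NORMAL.cdf(shift - q_crit) + STD_NORMAL.cdf(-shift - q_crit)


def sample_size(power, alpha, r0, r1):
    """Smallest whole number of pairs whose approximate power reaches `power`."""
    if r0 == r1:
        raise ValueError("R0 and R1 must differ")
    effect = abs(math.atanh(r1) - math.atanh(r0))
    q_sum = STD_NORMAL.inv_cdf(1 - alpha / 2) + STD_NORMAL.inv_cdf(power)
    # Step 1: closed-form initial estimate, rounded up (at least 4 so n - 3 > 0).
    n = max(4, math.ceil((q_sum / effect) ** 2 + 3))
    # Step 2: adjust n until it is the smallest value meeting the power target.
    while n > 4 and power_two_sided(n - 1, r0, r1, alpha) >= power:
        n -= 1
    while power_two_sided(n, r0, r1, alpha) < power:
        n += 1
    return n


print(sample_size(0.80, 0.05, 0.0, 0.3))
```

Searching over whole numbers returns the rounded-up answer directly; a root finder on real-valued n followed by math.ceil would be an equivalent design.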