
FAQ

If you have a query about StatsDirect, please check if the answer is here.

If not, email your question to support@statsdirect.com

  • Introductory book to StatsDirect
    • Is there an easy-to-read textbook on Statistics with StatsDirect?

      StatsDirect has been designed to be exceptionally easy to use and contains example-rich help. But for those wishing to have more statistical instruction in the form of a book, please see: Statistical Testing in Practice with StatsDirect, by Cole Davis

      Written in accessible English without using mathematical formulae, this book introduces statistical testing and takes the reader from simple analyses of differences all the way to multiple regression, factor analysis and survival analysis. The statistics software cited in the text is cheaper and, crucially, easier to use than software such as SPSS, offering an escape route for postgraduates experiencing difficulties on research projects. The book caters for students and professionals in health, social sciences, marketing and other fields. Unlike many statistics texts, it avoids irrelevant themes and mathematical proofs, instead building up the reader’s knowledge in easy steps, with clear worked examples and practical advice on what to do. Readers who learn statistics with this book, however, should then be able to progress to other software as and when they choose to do so.

      Please note that the above-mentioned book has been written by a third party and is published independently of StatsDirect. StatsDirect bears no responsibility for any errors or omissions.

  • Mac version
    • Is there a Mac version of StatsDirect?

      StatsDirect is written for Microsoft Windows but can run on a Mac under a virtualized copy of Windows.

      Many users run StatsDirect on a Mac with Microsoft Windows installed using VMware Fusion or Parallels Desktop.

      Please ensure you have applied all updates to the version of Windows you are running on your Mac before installing StatsDirect.

  • Citing
    • How do I cite StatsDirect software in papers?

      Buchan I. StatsDirect statistical software. http://www.statsdirect.com. England: StatsDirect Ltd 2024.

      The theoretical basis of the methods used in StatsDirect should be cited as listed in the reference section of the help system in StatsDirect.

      A doctoral thesis describing the scientific foundations of StatsDirect is at http://www.statsdirect.com/thesis/md.pdf

  • Problem removing the Excel-StatsDirect link add-in
    • To remove the Excel add-in

      Download and run the sdxlremover.exe file if you need to remove the StatsDirect 2 add-in for Excel.

      For either StatsDirect version 2 or 3, go to the Tools menu, select Setup Tools, and set the Excel On/Off option to Off.

  • Exact P values given more often than in other software
    • Why do I sometimes see slightly different P values in StatsDirect compared with other software?

      StatsDirect has developed many exact (permutation) and simulated exact (Monte Carlo) algorithms for P values in non-parametric statistical inference.

      By comparison, other software may use approximations in places where StatsDirect employs more complex algorithms, for example the Wilcoxon signed ranks test in the presence of ties - here StatsDirect, SAS and StatXact take a permutation approach, whereas SPSS and R revert to an asymptotic approximation.
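      The permutation approach can be sketched in a few lines of Python. This is an illustrative re-implementation under simplifying assumptions (two-sided test, zero differences dropped, midranks for ties), not StatsDirect's own algorithm; the function names are ours.

```python
from itertools import product

def avg_ranks(vals):
    """Midrank (average rank) of each value, so ties share a rank."""
    return [sum(1 for w in vals if w < v)
            + (sum(1 for w in vals if w == v) + 1) / 2
            for v in vals]

def exact_wilcoxon_p(diffs):
    """Two-sided exact P for the Wilcoxon signed ranks test,
    found by enumerating all 2^n sign assignments of the ranked
    absolute differences (zero differences are dropped first)."""
    d = [x for x in diffs if x != 0]
    r = avg_ranks([abs(x) for x in d])
    w_obs = sum(ri for ri, x in zip(r, d) if x > 0)
    mid = sum(r) / 2  # W+ is symmetric about half the rank sum
    hits = 0
    for signs in product((0, 1), repeat=len(d)):
        w = sum(ri for ri, s in zip(r, signs) if s)
        if abs(w - mid) >= abs(w_obs - mid) - 1e-12:
            hits += 1
    return hits / 2 ** len(d)

# With diffs (1, 2, 3) the observed W+ = 6 is one of the two most
# extreme of the 8 equally likely outcomes, so P = 2/8 = 0.25.
print(exact_wilcoxon_p([1, 2, 3]))   # 0.25
print(exact_wilcoxon_p([1, 1, -2]))  # 1.0 (balanced after midranking ties)
```

      For large samples this enumeration becomes infeasible, which is where Monte Carlo (simulated exact) sampling of the sign assignments comes in.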

  • Confidence intervals for proportions
    • Why does StatsDirect give slightly different confidence intervals for proportions compared with those calculated by hand?

      Many textbooks contain poorly performing formulae for approximating confidence intervals for binomial proportions and for differences between them. StatsDirect improves upon these formulae, especially in the presence of small numbers, by using methods that have good coverage properties. For more detailed discussion of the reasons behind the choices of methods used in StatsDirect, please see the following series of excellent papers by Robert Newcombe:

      Newcombe R. Improved confidence intervals for the difference between binomial proportions based on paired data. Statistics in Medicine 1998;17:2635-2650.

      Newcombe R. Interval estimation for the difference between independent proportions. Statistics in Medicine 1998;17:873-890.

      Newcombe R. Two sided confidence intervals for the single proportion: a comparative evaluation of seven methods. Statistics in Medicine 1998;17:857-872.
    • What do I do if there is a proportion with zero numerator?

      Example:

      Single proportion

      Total = 30, response = 0 Proportion = 0

      Exact (Clopper-Pearson) 95% confidence interval = 0 to 0.115703

      Approximate (Wilson) 95% mid P confidence interval = 0 to 0.113513

      Observation:

      "...you mention that the Exact (Clopper-Pearson) 95% confidence interval for the case where n=30 and r=0 is 0 to .115703. However, if the probability of getting a case in one trial is .115703, then the probability of getting no cases in 30 trials is (1-.115703)^30 = .025. This looks like the 97.5% confidence interval..."

      Reply from Dr Robert Newcombe (expert in this field):

      "This is a familiar situation. We think of the standard CI as a two-sided 95% CI or a z=(+/-)1.96 CI. It tries to get 2.5% non-coverage at each end. For a Gaussian variable, this is attained, of course. For the binary case, it isn't, and can't be, because of the discrete nature of the outcome space. The "exact" (Clopper-Pearson) continuity-corrected CI aims to make the minimum coverage 95%, and the maximum right and left non-coverage each 2.5%, and achieves this. In an extreme case, when all cases are +ve or all are -ve, an ambiguity arises: should we keep the overall coverage as 95%, or the one relevant non-coverage as 2.5%? It seems to make sense to go for the latter, in order to achieve continuity of interpretation as the number +ve tends to zero. But some would argue, why not get a shorter interval by using an upper limit of 1 - 0.05**(1/30) = 0.095034 instead of 1 - 0.025**(1/30) = 0.115703? I think this is what StatXact does. I would reply that in fact there is a much more comprehensive way to shorten the C-P interval while keeping the defining property of min CP = 0.95. This was developed by Blyth & Still. It isn't often used; I think this is because it's so untransparent how it works, hence little use in presenting results convincingly."
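      The arithmetic in this exchange is easy to reproduce. The following sketch (illustrative only, not StatsDirect's implementation) handles the r = 0 special case, where the exact upper limit solves (1 - p)^n = alpha:

```python
def cp_upper_zero(n, alpha):
    """Upper confidence limit for a binomial proportion when r = 0
    successes are observed: solve (1 - p)**n = alpha for p."""
    return 1 - alpha ** (1 / n)

# One-sided non-coverage kept at 2.5%, as in the FAQ example (n = 30):
print(round(cp_upper_zero(30, 0.025), 6))  # 0.115703
# The shorter interval some packages report, with 5% in one tail:
print(round(cp_upper_zero(30, 0.05), 6))   # 0.095034
# Sanity check: probability of 0 successes in 30 trials at the limit
p = cp_upper_zero(30, 0.025)
print(round((1 - p) ** 30, 3))             # 0.025
```

      This reproduces both limits discussed above and confirms the 0.025 tail probability that prompted the question.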
  • Multiple comparisons
    • Should I use multiple comparison tests? Is this a statistical fishing expedition?

      A number of people have asked questions about multiple comparisons. My favoured approach is to design experiments with clearly defined comparisons, in a manner that avoids post-hoc 'dredging' and the need for multiple comparison methods. In the real world, however, I favour Tukey-Kramer as a general method, or Dunnett's method if multiple contrasts are being made against a control group. Advice from a statistician is important if you are in any doubt.

      In reply to a specific question about the Newman-Keuls test: this is one of the methods of multiple comparison that tries to build in "conservativeness" in order to avoid the type I error that can be associated with dredging your data for differences. It is a controversial area: Peter Armitage gives an excellent discussion of such methods and provides examples in:

      P. Armitage & G. Berry, Statistical Methods in Medical Research, Blackwell 1994.

      see also:

      Miller R. G. (jnr.), Simultaneous Statistical Inference, (2nd edition) Springer-Verlag 1981.

      Hsu J.C., Multiple Comparisons. Chapman and Hall, 1996

    • Is it possible to do a Dunnett's or Dunn's type multiple comparison versus a control group after finding a positive Kruskal-Wallis ANOVA in StatsDirect?

      You can use the all possible contrasts method that is already included with the Kruskal-Wallis function in StatsDirect. This has less power to detect a difference between comparison groups and a control group than a hypothetical nonparametric analogue of Dunnett's method; nevertheless, any statistically significant difference detected should be investigated further. We will look into writing a nonparametric analogue of Dunnett's method.
  • Graphics
    • How can I manipulate chart titles etc.?

      If you click on a graphic in a StatsDirect report, then right-click and select Copy from the popup menu, you will copy it as a Windows metafile. This can then be pasted into Microsoft Word; you can then select the graphic in Word, right-click and choose 'Edit Picture' from the popup menu. Do not use drag and drop, as this does not work with all versions of Word. If you use the copy and paste method you can edit the graphic as a line drawing in Word. Make sure you have installed the graphics converter options for the WMF format when you install Office.
  • Diagnostic tests
    • Is there a way of estimating the required sample sizes for a trial which is designed to compare tests (sensitivity/specificity etc), to give desired confidence intervals on estimates?

      Sensitivity and specificity are binomial proportions:

      DISEASE:   Present        Absent

       TEST: + a (true +ve)   b (false +ve)

             - c (false -ve)  d (true -ve)

       Sensitivity = a/(a+c)

       Specificity = d/(b+d)

      So you can use the population survey sample size calculation for the target sensitivity% or specificity% within a specified tolerance and probability of being wrong (i.e. not within that tolerance).
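      That calculation is the standard normal-approximation sample size for estimating a single proportion to within a given tolerance. A minimal sketch (the function name is ours, not a StatsDirect routine):

```python
from math import ceil

def n_for_proportion(p, tol, z=1.96):
    """Sample size so that a proportion expected to be near p is
    estimated to within +/- tol with ~95% confidence (z = 1.96):
    n = z^2 * p * (1 - p) / tol^2, rounded up."""
    return ceil(z * z * p * (1 - p) / tol ** 2)

# e.g. to estimate an expected sensitivity of 90% to within +/- 5%:
print(n_for_proportion(0.90, 0.05))  # 139
# worst case (p = 0.5) for the same tolerance:
print(n_for_proportion(0.50, 0.05))  # 385
```

      Remember that for sensitivity the n is the number of diseased subjects (a+c), and for specificity it is the number of non-diseased subjects (b+d), so the total study size is larger.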

    • What is the importance of sub-populations in estimating sensitivity and specificity of a diagnostic test?

      From Sally Hollis: there is a brief discussion of this in the explanation and elaboration document for the STARD statement (see items 18 and 23). Web references: STARD initiative http://www.consort-statement.org; explanatory document http://www.clinchem.org/cgi/content/full/49/1/7
  • Meta-analysis
    • How do I calculate power or sample size for meta-analyses?

      In order to calculate the statistical power of a meta-analysis you need a good estimate of pooled variance of the effect size of interest and reasonable assumptions to be made about the effects of inter-study differences in exposure/conditions (heterogeneity) on both the variance and the effect estimate. All of this is non-trivial, and is best handled by a Statistician closely involved with the meta-analysis. It would be possible to add power results to all of the StatsDirect meta-analysis output, but the appropriateness of this needs further debate. Some would argue that under-powered studies should not have been performed, and will therefore question their quality for inclusion in the systematic review. If this approach is taken then pooled power is almost irrelevant.
       
  • ROC curves
    • How do I construct and compare ROC curves?

      You can use the ROC function of the StatsDirect graphics menu to construct ROC curves and to calculate the area under them with a confidence interval.

      If you wish to compare the area under two or more ROC curves it is best to consult a statistician. Different methods may be better for different situations (depending on the measurement scale of the outcome).
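      The area under an empirical ROC curve equals the Mann-Whitney U statistic scaled to lie between 0 and 1 (Hanley and McNeil, 1982, cited below), which is easy to verify in a few lines. This sketch is illustrative only; StatsDirect's ROC function also supplies the confidence interval:

```python
def auc(pos, neg):
    """Empirical ROC area: the proportion of (diseased, healthy) score
    pairs in which the diseased score is higher, ties counting half.
    This is the Mann-Whitney U statistic divided by n1 * n2."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([3, 4], [1, 2]))  # 1.0  (perfect separation)
print(auc([1, 2], [1, 2]))  # 0.5  (no discrimination)
```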

      Statistical thinking on the use of ROC curves is evolving. Here is a reference list, covering both ROC curve construction and comparison:

      Zou KH, Hall WJ, Shapiro D. Smooth non-parametric ROC curves for continuous diagnostic tests. Statistics in Medicine 1997;16:2143-56.

      Metz's program CLABROC, available as part of ROCFIT via anonymous ftp from random.bsd.uchicago.edu in /roc/ibmpc

      Altham, P.M.E. (1973) A non-parametric measure of signal discriminability. Brit. J. Math. Statist. Psychol. 26, 1-12.

      Altman & Bland, BMJ vol 309 16July1994 p 188

      Bamber, D. (1975) The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J. Math. Psychol. 12, 387-415.

      Begg, C.B.: Advances in statistical methodology for diagnostic medicine in the 1980's. Statistics in Medicine 10, 1887-1895 (1991).

      Bingham, N.H., Goldie, C. and Teugels, J.L. (1987) Regular Variation. Cambridge University Press.

      Michael Campbell & David Machin, Medical Statistics: a common sense approach, Section 3.4 (p 40-42) (Wiley)

      Campbell, G. and Ratnaparkhi, M.V. (1993) An application of Lomax distributions in receiver operating characteristic (ROC) curve analysis. Comm. Statist. 22, 1681-1697.

      Delong et al, Biometrics 44 1988, p837-845

      Dorfman, D.D. and Alf, E. Jr. (1968) Maximum likelihood estimation of parameters of signal detection theory -- A direct solution. Psychometrika 33, 117-124.

      Dorfman, D.D. and Alf, E. Jr. (1969) Maximum likelihood estimation of parameters of signal detection theory and determination of confidence intervals -Rating method data. J. Math. Psychol. 6, 487-496.

      England, W. L. (1988) An exponential model used for optimal threshold selection in ROC curves. Med. Dec. Making 8, 120-131.

      Feller, W. (1971) An Introduction to Probability Theory and its Applications, Wiley.

      Goddard, M.J. and Hinberg, I. (1990) Receiver operating characteristic (ROC) curves and non-normal data: an empirical study. Stat. Med. 9, 325-337.

      Green, D.M. and Swets, J.A. (1966) Signal Detection Theory and Psychophysics, Wiley.

      Hanley, J.A. and McNeil, B.J. (1982) The meaning and use of the area under the receiver operating characteristic (ROC) curve. Radiology 143, 29-36.

      J Hanley and B McNeil, Maximum attainable discrimination and the utilization of radiologic examinations, Journal of Chronic Disease, 1982;35:601-611

      Hanley, J.A. and McNeil, B.J. (1983) A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148, 839-843.

      Hanley, J.A.: Receiver operating characteristic methodology: the state of the art. CRC Critical Reviews in Diagnostic Imaging 29, 307-335 (1989).

      Karamata, J. (1930) Sur un mode de croissance reguliere des functions. Mathematica (cluj) 4, 38-53.

      Karamata, J. (1933) Sur un mode de croissance reguliere. Theoremes fondamenteaux. Bull. Soc. Math. France 61, 55-62.

      H C Kraemer, Evaluating Medical Tests: Objective and Quantitative Guidelines (1992) Sage Publications, Beverly Hills

      Luce, R.D. (1959) Individual Choice Behaviour Wiley: New York.

      McCullagh, P. (1980) Regression models for ordinal data (with discussion). J. Roy. Statist. Soc. B 42, 109-142.

      Metz, C.E. and Kronman, H.B. (1980) Statistical significance tests for binormal ROC curves. J. Math. Psychol. 22, 218-243.

      Charles E Metz, Basic principles of ROC analysis, Seminars in Nuclear Medicine, Vol 8, No.4, 1978, 283-298.

      Metz CE. ROC Methodology in Radiologic Imaging. Invest.Radiol. 1986;21:720-723.

      Moise, A., Clement, B., Ducimetiere, P. and Bourassa, M.G. (1985) Comparison of receiver operating curves derived from the same population: a bootstrapping approach. Comp. Biom. Res. 18, 218-243.

      Moise, A., Clement, B. and Raissas, M. (1988) A test for crossing receiver operating characteristic (ROC) curves. Comm. Statist. 17, 1985-2003.

      Moses, L.E., Shapiro, D. and Littenberg, B. (1993) Combining independent studies of a diagnostic test into a summary ROC curve: data-analytic approaches and some additional considerations. Stat. Med. 12, 1293-1316.

      Mossman, D., Resampling techniques in the analysis of non-normal ROC data Medical Decision Making 1995 15: 358-366

      Rice, S.O. (1944) Mathematical analysis of random noise. Bell Sys. Tech. J. 23, 282-332.

      D. Sackett, R. Haynes, G. Guyatt, P. Tugwell. Clinical Epidemiology. Little, Brown & Company, 1991, pp. 113-119

      Shimizu, R. (1962) Characterization of the normal distribution, II. Ann. Inst. Statis. Math. Tokyo 14, 173-178.

      Simpson, A.J. and Fitter, M.J. (1973) What is the best index of detectability? Psychol. Bull. 80, 481-488.

      P Strike (1996) Measurement in Laboratory Medicine. Butterworth-Heinemann, Oxford (this includes a PC disk containing simple-to-use software)

      Svensson and Holm, Statistics in Medicine 1994 13, 2437-2453 (Separation of systematic and random differences in ordinal rating scales)

      Swets JA. ROC analysis applied to the evaluation of medical imaging techniques. Invest. Radiol. vol 14, 109-121, 1979

      Swets JA, Pickett RM. Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. Academic Press, 1982.

      Taylor I, Mullee MA and Campbell MJ. Prognostic index for the development of liver metastases in patients with colorectal cancer Br J Surg 1990 vol 77 P499-501

      Thomas, E.C. and Myers, J.L. (1972) Implications of latency data for threshold and non-threshold models of signal detection. J. Math. Psychol. 9, 253-285.

      Thompson, M.L. and Zucchini, W. (1989) On the statistical analysis of ROC curves. Stat. Med. 8, 1277-1290.

      Tosteson, A.N. and Begg, C.B. (1988) A general regression methodology for ROC curve estimation. Med. Dec. Making 8, 205-215.

      Vardi, Y. (1982) Non-parametric estimation in the presence of length bias. Ann. Statist, 10, 616-620.

      Wieand, S., Gail, M.H., James, B.R. and James, K.L. (1989) A family of non-parametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika 76, 585-592.

      Zweig and Campbell, "ROC plots: a fundamental evaluation tool in clinical medicine" Clin. Chem. vol 39 no. 4 1993, p561-577
  • Installation
    • If I have a single user licence, may I install StatsDirect on different computers?

      Yes - the licence is per user and not per computer.

      You can install on a second computer via http://www.statsdirect.com/Update.aspx and by using the username/email and licence key information that you were originally sent.
    • Can I install StatsDirect across a network?

      Yes. This is achieved either by cloning or by scripting a silent/unattended installation followed by licence key burning.

      StatsDirect uses standard Microsoft installer technology and can be pushed out to desktops easily via silent install.

      Deployment instructions for network managers can be found at http://www.statsdirect.com/download/support/deploy.docx
  • Repeated measures at different times
    • Q: Please can you advise me on the best way to analyse repeated measurements performed on different patients, which differ only by time? I understand that measurements that differ by time will be serially correlated, so a 2-way ANOVA is inappropriate. What is the best way to analyse such data and can it be done in StatsDirect?

      A: This is a difficult area for which you should seek expert statistical guidance. You may need specialist software for modelling, for example http://tigger.uic.edu/~hedeker/mix.html (best driven by the Statistician involved).

      Some useful references are:

      Ware JH. Linear models for the analysis of longitudinal studies. The American Statistician 1985;39:95-101.

      Davis CS. A computer program for non-parametric analysis of incomplete repeated measures for two samples. Computer Methods and Programs in Biomedicine 1994;42:39-52.

      Davis CS, Hall DB. A computer program for the regression analysis of ordered categorical repeated measurements. Computer Methods and Programs in Biomedicine 1995;51:153-169.

  • Precision, truncation and rounding
    • Will I get rounding errors in StatsDirect like I do in Excel (e.g. mod(11.2932, 0.3137) = 1.33227E-15 when it should be equal to zero since 11.2932 = 36*0.3137)?

      StatsDirect uses very precise calculation methods in order to keep calculation error to a minimum.

      In previous discussion archives, Dr Barry Tennison gave the following easy to understand explanation of rounding error:

      "Inside all (normal) computers, numbers are represented in binary (in various forms like fixed point or floating point). Since one cannot include an infinite number of bits (binary digits) after the decimal point, the only numbers that are represented exactly are those that can be expressed as a fraction with denominator a power of two (just as the only terminating decimals are those expressible as a fraction with denominator a power of ten). For example one third (1/3) cannot be expressed as a terminating (finite) decimal or binary number. Therefore the INTERNAL forms of numbers like this represent the intended (exact) numbers only approximately. The apparent rounding errors result from this, rather than from any inaccuracy of calculation."


      The following is a more detailed introduction to this subject:

      Numerical precision and error

      "Although this may seem a paradox, all exact science is dominated by the idea of approximation."

      Russell, Bertrand (1872-1970)

      Numbers with fractional parts (real/floating-point as opposed to integer/fixed-point numbers) cannot all be fully represented in binary computers because computers cannot hold an infinite number of bits (binary digits) after the decimal point.  The only real numbers that are represented exactly are those that can be expressed as a fraction with denominator that is a power of two (e.g. 0.25); just as the only terminating (finite) decimals are those expressible as a fraction with denominator that is a power of ten (e.g. 0.1).  Many real numbers, one third for example, cannot be expressed as a terminating decimal or binary number.  Binary computers therefore represent many real numbers in approximate form only, the global standard for doing this is IEEE Standard Floating-Point Representation (IEEE, 1985).

      Numerical algorithms written for the Microsoft .Net platform comply with IEEE Standard Floating-Point Representation.  All real numbers in StatsDirect are handled in double precision.

      Arithmetic among floating point numbers is subject to error.  The smallest floating point number which, when added to 1.0, produces a floating-point number different to 1.0 is termed the machine accuracy em (Press et al., 1992).  In IEEE double precision em is approximately 2.22 × 10^-16.  Most arithmetic operations among floating point numbers produce a so-called round-off error of at least em.  Some round-off errors are characteristically large, for example the subtraction of two almost equal numbers.  Round-off errors in a series of arithmetic operations seldom occur randomly up and down.  Large round-off error at the beginning of a series of calculations can become magnified such that the result of the series is substantially imprecise, a condition known as instability.  Algorithms in StatsDirect were assessed for likely causes of instability and common stabilising techniques, such as leaving division as late as possible in calculations, were employed.

      Another error inherent to numerical algorithms is the error associated with approximation of functions; this is termed truncation error (Press et al., 1992).  For example, integration is usually performed by calculating a function at a large discrete number of points, the difference between the solution obtained in this practical manner and the true solution obtained by considering every possible point is the truncation error.  Most of the literature on numerical algorithms is concerned with minimisation of truncation error.  For each function approximation in StatsDirect, the most precise algorithms practicable were written in the light of relevant, current literature.
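      The points above are easy to demonstrate in any language that uses IEEE double precision; Python is used here purely as an illustration:

```python
import sys
import math

# Machine accuracy (em) for IEEE double precision is 2^-52:
print(sys.float_info.epsilon)  # 2.220446049250313e-16

# 0.1, 0.2 and 0.3 have no finite binary expansion, so round-off appears:
print(0.1 + 0.2 == 0.3)        # False

# The Excel example from the question: 11.2932 = 36 * 0.3137 exactly in
# decimal, but neither number is exact in binary, so the residue may be
# a tiny value on the order of machine epsilon rather than exactly zero:
print(11.2932 - 36 * 0.3137)

# Comparing with a tolerance is the standard remedy:
print(math.isclose(11.2932, 36 * 0.3137))  # True
```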

      References:

      IEEE Standard for Binary Floating Point Numbers, ANSI/IEEE Std 754. New York: Institute of Electrical and Electronics Engineers (IEEE) 1985.

      Press WH, et al. Numerical Recipes, The Art of Scientific Computing (2nd edition). Cambridge University Press 1992.

  • Null, zero and missing data
    • What is the difference between a null and a zero value?

      A null value is a missing or excluded observation, recorded as a gap in a worksheet or as the asterisk * symbol. The internal code for a missing value is 3E+300, which would also be treated as missing/null if you entered it as an observed value. Zero must always be entered as 0 or 0.0 or 0.0e0 in order for it to be treated as an observation of zero - but remember that in categorical analysis, such as counts in a contingency table, some researchers may refer to zero as a null response.
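      The distinction can be made concrete with a small parsing sketch. This is illustrative only, not StatsDirect's internal code, though it follows the conventions described above:

```python
MISSING_CODE = 3e300  # the internal missing-value sentinel mentioned above

def parse_cell(cell):
    """Return a float observation, or None for a null/missing value.
    Blanks and '*' are null; '0', '0.0' and '0.0e0' are genuine zeros."""
    text = str(cell).strip()
    if text in ("", "*"):
        return None
    value = float(text)
    return None if value == MISSING_CODE else value

print(parse_cell("*"))      # None  (null: excluded from analysis)
print(parse_cell("0.0e0"))  # 0.0   (a real observation of zero)
print(parse_cell("3e300"))  # None  (sentinel value treated as missing)
```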
       
  • Ratios
    • Q: What is the best way of correlating 2 ratios?

      I am postulating that low carotid bifurcations have a longer length of disease beyond the bifurcation, and have measured and created 2 ratios (a bifurcation ratio -length from clavicle to bifurcation divided by total carotid length; and a disease ratio (length from bifurcation to end of disease divided by length from clavicle to end of disease). Is it valid to use simple linear regression/correlation with bifurcation ratio as the independent variable and disease ratio as the dependent variable?

      A: Ratio measurement scales possess all properties of interval scales plus an absolute zero point. You might want to look at limits of agreement rather than correlation.
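      Limits of agreement (Bland and Altman) are simple to compute: the mean of the pairwise differences plus or minus 1.96 standard deviations of those differences. A minimal sketch with made-up ratio data (the values below are illustrative only, not real measurements):

```python
from math import sqrt

def limits_of_agreement(x, y):
    """95% limits of agreement for paired measurements x and y:
    mean difference +/- 1.96 * sample SD of the differences."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean = sum(d) / n
    sd = sqrt(sum((v - mean) ** 2 for v in d) / (n - 1))
    return mean - 1.96 * sd, mean + 1.96 * sd

# Hypothetical bifurcation vs disease ratios for six patients:
x = [0.42, 0.55, 0.61, 0.48, 0.70, 0.52]
y = [0.40, 0.57, 0.58, 0.50, 0.66, 0.53]
low, high = limits_of_agreement(x, y)
print(round(low, 3), round(high, 3))
```

      If the question really is whether one ratio predicts the other (rather than whether they agree), simple linear regression remains reasonable, but beware that ratios sharing a common term can induce spurious correlation.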
       

Enquiries: info@statsdirect.com